Closed MrOCW closed 2 years ago
Hi @MrOCW
Can you share your entire YAML file? It looks like the GAIL discriminator is breaking, but it's not clear why from these curves alone. I suspect a NaN or something similar, because GAIL appears to work properly until around 500k timesteps.
Are you seeing other large time intervals between summaries or is the screenshot the only time this occurs? Does this screenshot coincide with the degradation of GAIL?
Can you share the other TensorBoard curves, e.g. policy entropy? Are you seeing any NaNs in either C# or Python?
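One quick way to check for this is to scan an exported scalar curve for the first non-finite value. The following is only a minimal sketch: it assumes the curve has already been exported as `(step, value)` pairs, and the function name `first_non_finite` and the sample GAIL loss values are hypothetical, not taken from this run.

```python
import math


def first_non_finite(scalars):
    """Return (step, value) for the first NaN/inf entry, or None if all values are finite."""
    for step, value in scalars:
        if not math.isfinite(value):
            return step, value
    return None


# Hypothetical curve: finite values, then NaN from step 500000 onward.
gail_loss = [
    (499000, 0.69),
    (499500, 0.71),
    (500000, float("nan")),
    (500500, float("nan")),
]

print(first_non_finite(gail_loss))  # prints (500000, nan)
```

If this returns a step close to where the GAIL reward collapses, that would support the NaN theory; if it returns `None`, the cause is probably elsewhere.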
@andrewcoh YAML
trainer_type: sac
hyperparameters:
  learning_rate: 0.0003
  learning_rate_schedule: constant
  batch_size: 256
  buffer_size: 58000
  buffer_init_steps: 10000
  tau: 0.005
  steps_per_update: 4.0
  save_replay_buffer: False
  init_entcoef: 0.5
  reward_signal_steps_per_update: 10.0
network_settings:
  normalize: False
  hidden_units: 256
  num_layers: 1
  vis_encode_type: simple
  memory: None
  goal_conditioning_type: none
reward_signals:
  extrinsic:
    gamma: 0.99
    strength: 1.0
    network_settings:
      normalize: False
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
      memory: None
      goal_conditioning_type: hyper
  gail:
    gamma: 0.99
    strength: 0.5
    network_settings:
      normalize: False
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
      memory: None
      goal_conditioning_type: none
    learning_rate: 0.0003
    encoding_size: None
    use_actions: True
    use_vail: False
    demo_path: ..................../Assets/Demonstrations
init_path: None
keep_checkpoints: 5
checkpoint_interval: 200000
max_steps: 3000000
time_horizon: 64
summary_freq: 500
threaded: True
self_play: None
behavioral_cloning: None
For the time intervals, yes, it happens throughout training. To elaborate, the summary steps are printed in batches after long pauses. It does not seem to be related to the GAIL issue.
Have not seen any NaNs anywhere.
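As a rough aid for reasoning about the pauses, the sketch below estimates how much update work falls inside one summary interval, using the values from the config above. It assumes SAC performs on average one gradient update of `batch_size` samples per `steps_per_update` environment steps, and likewise for the reward signal with `reward_signal_steps_per_update`. That is a simplification rather than something read from the trainer source, and the helper names are hypothetical.

```python
def updates_per_interval(env_steps, steps_per_update):
    """Average number of gradient updates triggered by `env_steps` environment steps."""
    return env_steps / steps_per_update


def samples_per_interval(env_steps, steps_per_update, batch_size):
    """Total samples pushed through the network during those updates."""
    return updates_per_interval(env_steps, steps_per_update) * batch_size


# Values copied from the posted config.
summary_freq = 500
steps_per_update = 4.0
reward_signal_steps_per_update = 10.0

policy_updates = updates_per_interval(summary_freq, steps_per_update)
gail_updates = updates_per_interval(summary_freq, reward_signal_steps_per_update)
print(f"policy updates per summary: {policy_updates:.0f}")
print(f"reward-signal updates per summary: {gail_updates:.0f}")

for batch_size in (256, 512):
    samples = samples_per_interval(summary_freq, steps_per_update, batch_size)
    print(f"batch_size={batch_size}: {samples:.0f} samples per summary interval")
```

Under these assumptions, each 500-step summary interval corresponds to about 125 policy updates and 50 reward-signal updates, and going from a batch size of 256 to 512 doubles the samples processed per interval from 32,000 to 64,000. The estimate scales only linearly with batch size, so it says nothing about why a given pause would be much longer than that.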
@andrewcoh any updates on this issue?
This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.
Hi,
I'm trying to train a car to stay in its lane using GAIL, but GAIL seems to be disabled (?) after a while.
Also, for SAC, the environment freezes for a long time (presumably while the network updates), and many summary steps are then printed in batches. If I increase the batch size from 256 to 512, the environment freezes for as long as 5 minutes.
Environment (please complete the following information):