mtaohuang closed 4 years ago
The scripts under `test` are updated. I previously found that the logstd in a Gaussian policy should not be conditioned on the input; otherwise training becomes unstable. Could you try those scripts first?
The scripts under `examples` will be maintained after the NeurIPS deadline :)
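To illustrate the logstd point above, here is a minimal NumPy sketch, not the repository's actual network code. The class `GaussianPolicy`, the flag `conditioned_logstd`, the linear layers, and the clip range `[-20, 2]` are all assumptions chosen for illustration. It contrasts a log-std that is a free parameter vector with one computed from the observation.

```python
import numpy as np


class GaussianPolicy:
    """Toy linear Gaussian policy (hypothetical; not the library's API)."""

    def __init__(self, obs_dim, act_dim, conditioned_logstd=False, seed=0):
        rng = np.random.default_rng(seed)
        self.w_mu = rng.normal(scale=0.1, size=(obs_dim, act_dim))
        self.conditioned_logstd = conditioned_logstd
        if conditioned_logstd:
            # log-std is a second output head that depends on the observation
            self.w_logstd = rng.normal(scale=0.1, size=(obs_dim, act_dim))
        else:
            # log-std is a free parameter vector shared by all observations
            self.logstd = np.zeros(act_dim)

    def forward(self, obs):
        obs = np.atleast_2d(obs)
        mu = obs @ self.w_mu
        if self.conditioned_logstd:
            logstd = obs @ self.w_logstd
        else:
            logstd = np.broadcast_to(self.logstd, mu.shape)
        # clip before exponentiating so std stays finite (assumed bounds)
        std = np.exp(np.clip(logstd, -20.0, 2.0))
        return mu, std
```

With `conditioned_logstd=False`, every observation gets the same standard deviation, so an unusual observation cannot push the action noise to an extreme value on its own.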
I ran into the same problem when training in my own env, which was stable with other methods.
Have you tried the current GitHub version?
Sorry, I just updated to the latest version and found that the problem is solved. Thank you.
Running `python3 examples/halfcheetahBullet_v0_sac.py --task BipedalWalkerHardcore-v3` fails the NaN assertion and causes an env exception.
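When debugging a report like this, it can help to catch the first non-finite action before it reaches the environment. The helper below is a hypothetical sketch: `check_action` and its arguments are not part of the example script or the library.

```python
import numpy as np


def check_action(action, low, high):
    """Raise on NaN/inf in the action; otherwise clip it to [low, high]."""
    action = np.asarray(action, dtype=np.float64)
    if not np.all(np.isfinite(action)):
        raise ValueError(f"non-finite action from policy: {action}")
    return np.clip(action, low, high)
```

Calling it just before `env.step(...)` makes the failure point to the policy output instead of surfacing later as an exception inside the environment.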