Closed: kan-s0 closed this issue 2 years ago
Describe the bug: The initial actor loss is too large when training in a continuous-action MuJoCo environment.
To Reproduce: python main.py --config config.ppo.mujoco --env.name half_cheetah --agent.n_step 2048 --train.num_workers 8
Expected behavior: The actor loss should start at a moderate magnitude. Instead, a very large or NaN ratio (actor loss) occurs.
Development Env. (OS, version, libraries): Linux, V2XLARGE, jorldy:0.3.0
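For readers trying to understand the symptom, here is a minimal sketch of how a PPO probability ratio can become very large in a continuous-action setting. It is not JORLDy's implementation: the function names (gaussian_log_prob, ppo_ratio), the max_log_ratio parameter, and all numeric values are hypothetical and chosen only for illustration. It assumes a diagonal Gaussian policy, where the log-probability is summed over action dimensions, so a small change in mean or standard deviation is amplified before the exponential is taken.

```python
import numpy as np


def gaussian_log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    var = std ** 2
    per_dim = -0.5 * ((action - mean) ** 2 / var + np.log(2.0 * np.pi * var))
    return per_dim.sum(axis=-1)


def ppo_ratio(log_prob_new, log_prob_old, max_log_ratio=None):
    """Probability ratio pi_new / pi_old, computed in log space.

    max_log_ratio is a hypothetical safeguard: when set, the log-ratio is
    clipped to [-max_log_ratio, max_log_ratio] before exponentiating.
    """
    log_ratio = log_prob_new - log_prob_old
    if max_log_ratio is not None:
        log_ratio = np.clip(log_ratio, -max_log_ratio, max_log_ratio)
    return np.exp(log_ratio)


# Illustrative values only: one sample with 6 action dimensions.
action = np.full((1, 6), 0.5)

# "Old" policy: wide distribution centered away from the action.
log_prob_old = gaussian_log_prob(action, np.zeros((1, 6)), np.full((1, 6), 1.0))

# "New" policy: very narrow distribution centered on the action.
log_prob_new = gaussian_log_prob(action, np.full((1, 6), 0.5), np.full((1, 6), 0.01))

ratio = ppo_ratio(log_prob_new, log_prob_old)                       # unbounded
clipped = ppo_ratio(log_prob_new, log_prob_old, max_log_ratio=2.0)  # bounded

print("log-ratio:", (log_prob_new - log_prob_old)[0])
print("ratio:", ratio[0])
print("ratio with clipped log-ratio:", clipped[0])
```

Note that the usual PPO clipping of the ratio to [1 - eps, 1 + eps] is applied after the exponential, so it does not by itself prevent overflow inside exp; once a ratio reaches inf, later arithmetic such as inf * 0 or inf - inf yields NaN. Bounding the log-ratio as above is one possible mitigation, and whether it is appropriate here depends on what actually causes the mismatch between the old and new log-probabilities.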