kakaoenterprise / JORLDY

Repository for Open Source Reinforcement Learning Framework JORLDY
Apache License 2.0
359 stars, 50 forks

Initial actor loss issue when running continuous PPO #157

Closed: kan-s0 closed this issue 2 years ago

kan-s0 commented 2 years ago

Describe the bug: The initial actor loss is too large when training in a continuous MuJoCo environment.

To Reproduce: python main.py --config config.ppo.mujoco --env.name half_cheetah --agent.n_step 2048 --train.num_workers 8

Expected behavior: The ratio and the actor loss should stay in a reasonable range. Instead, a very large or NaN ratio (actor loss) occurs.
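
To illustrate how a very large ratio can turn into a very large actor loss, here is a minimal sketch of the standard PPO clipped surrogate for a 1-D Gaussian policy. This is not JORLDY's implementation: the function names and the `max_log_ratio` clamp are hypothetical, and the clamp is just one possible safeguard, shown here as an assumption.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log density of a 1-D Gaussian policy evaluated at `action`."""
    var = std * std
    return -0.5 * ((action - mean) ** 2 / var + math.log(2.0 * math.pi * var))


def ppo_ratio(log_prob_new, log_prob_old, max_log_ratio=20.0):
    """Probability ratio pi_new / pi_old, computed in log space.

    `max_log_ratio` is a hypothetical safeguard: clamping the log-ratio
    before exponentiating keeps exp() from overflowing.
    """
    log_ratio = log_prob_new - log_prob_old
    log_ratio = max(-max_log_ratio, min(max_log_ratio, log_ratio))
    return math.exp(log_ratio)


def clipped_surrogate_loss(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO actor loss: -min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return -min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Same policy before and after the update: the ratio is exactly 1.
    lp = gaussian_log_prob(0.5, mean=0.5, std=1.0)
    print(ppo_ratio(lp, lp))

    # With a negative advantage the clip does not bound the loss from above,
    # so a large ratio produces a proportionally large actor loss.
    print(clipped_surrogate_loss(ratio=5.0, advantage=-1.0))
```

Note that when the advantage is negative, the `min` selects the unclipped term, so the loss grows linearly with the ratio. If old and new log-probabilities differ at the very first update (for example, because they are computed with different standard deviations), the ratio can start far from 1.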

[Screenshots: Screenshot 2022-04-04 6:02:17 PM, Screenshot 2022-04-05 10:06:17 AM, Screenshot 2022-04-05 10:07:37 AM]

Development Env. (OS, version, libraries): linux, V2XLARGE, jorldy:0.3.0
