ShangtongZhang / DeepRL

Modularized Implementation of Deep RL Algorithms in PyTorch
MIT License

Continuous Control Reward and State Normalization #106

Closed xkianteb closed 3 years ago

xkianteb commented 3 years ago

I cannot figure out whether you normalize the rewards and states (observations) for continuous control problems.

pytorch-a2c-ppo-acktr does it -- (https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail/blob/1951751a03b78307bb60ba542019756ebcb5200c/a2c_ppo_acktr/envs.py#L99)

stable-baselines does it -- (https://github.com/araffin/rl-baselines-zoo/blob/ff84f398a1fae65e18819490bb4e41a201322759/hyperparams/a2c.yml#L54)

and "Deep Reinforcement Learning that Matters" (https://arxiv.org/pdf/1709.06560.pdf) says it helps.
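To make the question concrete, here is a minimal sketch of the kind of normalization I mean. It is not taken from this repository or copied from the projects linked above. The names `RunningMeanStd` and `Normalizer`, and the defaults `gamma=0.99`, `clip=10.0`, `eps=1e-8`, are my own assumptions for illustration. Observations are standardized with running statistics, and rewards are divided by the running standard deviation of the discounted return, which is one common scheme among several.

```python
import numpy as np


class RunningMeanStd:
    """Tracks a running mean and variance over batches of samples."""

    def __init__(self, shape=()):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 1e-4  # small prior count avoids division by zero

    def update(self, batch):
        batch = np.asarray(batch, dtype=np.float64)
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]

        # Combine the old and new statistics (parallel variance formula).
        delta = batch_mean - self.mean
        total = self.count + batch_count
        m2 = (self.var * self.count
              + batch_var * batch_count
              + delta ** 2 * self.count * batch_count / total)
        self.mean = self.mean + delta * batch_count / total
        self.var = m2 / total
        self.count = total


class Normalizer:
    """Hypothetical helper: normalizes observations and rewards
    for a batch of parallel environments."""

    def __init__(self, obs_shape, num_envs, gamma=0.99, clip=10.0, eps=1e-8):
        self.obs_rms = RunningMeanStd(obs_shape)
        self.ret_rms = RunningMeanStd(())
        self.returns = np.zeros(num_envs)  # running discounted return per env
        self.gamma = gamma
        self.clip = clip
        self.eps = eps

    def normalize_obs(self, obs):
        # obs has shape (num_envs, *obs_shape)
        obs = np.asarray(obs, dtype=np.float64)
        self.obs_rms.update(obs)
        z = (obs - self.obs_rms.mean) / np.sqrt(self.obs_rms.var + self.eps)
        return np.clip(z, -self.clip, self.clip)

    def normalize_reward(self, reward, done):
        # reward and done have shape (num_envs,)
        reward = np.asarray(reward, dtype=np.float64)
        done = np.asarray(done, dtype=bool)
        self.returns = self.returns * self.gamma + reward
        self.ret_rms.update(self.returns)
        # Scale only (no mean subtraction), so the sign of the reward is kept.
        scaled = reward / np.sqrt(self.ret_rms.var + self.eps)
        self.returns[done] = 0.0  # reset the return of finished episodes
        return np.clip(scaled, -self.clip, self.clip)
```

In a training loop, `normalize_obs` and `normalize_reward` would be applied to the output of every environment step before it reaches the agent, and the observation statistics would be frozen (no `update` call) at evaluation time.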