ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License

adaptive adam learning rate #252

Open a-z-e-r-i-l-a opened 3 years ago

a-z-e-r-i-l-a commented 3 years ago

I noticed that in your implementation the base learning rate of Adam is decayed:

https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail/blob/84a7582477fb0d5c82ad6d850fe476829dddd2e1/main.py#L109

I wanted to ask whether you know of any papers on this approach to adapting Adam's learning rate, in your case by decaying it linearly.
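To make clear what I mean by linear decay, here is a minimal sketch of the idea. It is an illustration under my own assumptions, not a copy of the repository's code: the names `linear_lr` and `apply_lr` are hypothetical, and I assume the rate is annealed from its initial value down to zero over the total number of updates. The only PyTorch convention it relies on is that an optimizer exposes `param_groups`, a list of dicts each holding an `"lr"` entry.

```python
def linear_lr(initial_lr, update, total_updates):
    """Base learning rate after `update` of `total_updates` updates,
    decayed linearly from `initial_lr` towards zero (assumed schedule)."""
    return initial_lr * (1.0 - update / float(total_updates))


def apply_lr(optimizer, lr):
    """Overwrite the base learning rate of every parameter group.

    Works with any object that exposes PyTorch-style `param_groups`,
    e.g. an instance of torch.optim.Adam.
    """
    for group in optimizer.param_groups:
        group["lr"] = lr
```

In a training loop this would be called once per update, for example `apply_lr(optimizer, linear_lr(initial_lr, j, num_updates))` before the j-th update. Adam still rescales each parameter's step using its moment estimates; only the global step size that multiplies them shrinks over time.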