Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
MIT License
1.09k stars, 186 forks

What are conjugate gradients and line_search in TRPO? #30

Closed: Dreamlikec closed this issue 3 years ago

Dreamlikec commented 3 years ago

Could you please give me a sense of, or a reference for, what these two functions are meant to do?