What's Conjugate gradients and line_search in TRPO?

Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

MIT License

1.09k stars, 186 forks

What's Conjugate gradients and line_search in TRPO? #31

Open Dreamlikec opened 3 years ago

Dreamlikec commented 3 years ago

Could you please give me a sense of, or a reference for, what these two functions are for?

0xJchen commented 2 years ago

Check here: https://spinningup.openai.com/en/latest/algorithms/trpo.html
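In short: TRPO wants the step direction x = F⁻¹g, where g is the policy gradient and F is the Fisher information matrix (the Hessian of the KL divergence). Conjugate gradient approximates x using only matrix-vector products Fv, so F never has to be formed or inverted. The backtracking line search then shrinks the proposed step until the surrogate objective actually improves and the KL constraint holds. Here is a minimal pure-Python sketch of both ideas, assuming a toy 2x2 matrix in place of F. It is not this repository's code, and the names `conjugate_gradient`, `backtracking_line_search`, `Avp`, `surrogate`, `kl`, and `max_kl` are hypothetical, chosen only for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(Avp, b, n_iters=10, tol=1e-10):
    """Approximately solve A x = b using only the product v -> A v.

    In TRPO, A would be the Fisher matrix and b the policy gradient
    (assumption for this sketch: A is symmetric positive definite).
    """
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, which equals b because x starts at 0
    p = list(b)  # current search direction
    rr = dot(r, r)
    for _ in range(n_iters):
        if rr < tol:
            break
        Ap = Avp(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def backtracking_line_search(surrogate, kl, theta, full_step, max_kl,
                             max_backtracks=10, shrink=0.5):
    """Try full_step, then shrink it geometrically, until the surrogate
    improves and the KL constraint is satisfied.

    Returns (success, new_theta); on failure the old theta is kept.
    """
    base = surrogate(theta)
    for k in range(max_backtracks):
        frac = shrink ** k
        cand = [t + frac * s for t, s in zip(theta, full_step)]
        if surrogate(cand) > base and kl(theta, cand) <= max_kl:
            return True, cand
    return False, theta


# Toy stand-in for the Fisher-vector product (hypothetical 2x2 matrix).
A = [[4.0, 1.0], [1.0, 3.0]]


def Avp(v):
    return [dot(row, v) for row in A]


step_dir = conjugate_gradient(Avp, [1.0, 2.0])  # approximates A^{-1} b
```

In an actual TRPO update, the direction from conjugate gradient is first scaled so that the quadratic approximation of the KL divergence equals the trust-region size, and that scaled step is what gets passed to the line search as `full_step`.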