hughperkins closed 6 years ago
This repo is a little bit messy.
Do you want to use TRPO specifically?
I highly recommend using PPO instead: https://github.com/ikostrikov/pytorch-a2c-ppo-acktr
That repo is much cleaner (and I remember it better).
Awesome. Good info. Thanks! :)
(Note: you might consider adding a link to the newer repo to the README of this repo. I guess that for each person who opens an issue, there might be 20 who just walk on by and never find out about the newer repo.)
Yes, that's a good idea! I will add a link.
For example, imagine I have my own policy that takes in a state and outputs an action, or perhaps a distribution over actions, and I have a world that takes an action and returns a reward and a new state. How would I plug these into this TRPO implementation?
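To make the question concrete, the two pieces described here could look roughly like the sketch below. Everything in it is hypothetical and is not this repo's API: `ToyWorld`, `toy_policy`, and `collect_rollout` are made-up names, and the toy dynamics and reward are invented for illustration. It only shows the interface (a policy mapping a state to a distribution over actions, a world mapping an action to a reward and a new state) and the rollout loop that an on-policy method such as TRPO or PPO would typically draw its training batches from.

```python
import random
from typing import Callable, List, Tuple

State = float
Action = int
Transition = Tuple[State, Action, float]  # (state, action, reward)


class ToyWorld:
    """Hypothetical world: the state is a number on a line."""

    def __init__(self) -> None:
        self.state: State = 0.0

    def reset(self) -> State:
        self.state = 0.0
        return self.state

    def step(self, action: Action) -> Tuple[float, State]:
        # Takes an action, returns (reward, new_state) as in the question.
        self.state += 1.0 if action == 1 else -1.0
        reward = -abs(self.state - 3.0)  # invented reward: closer to 3 is better
        return reward, self.state


def toy_policy(state: State) -> List[float]:
    """Hypothetical policy: returns probabilities for actions [0, 1]."""
    p_right = 0.9 if state < 3.0 else 0.1
    return [1.0 - p_right, p_right]


def collect_rollout(
    policy: Callable[[State], List[float]],
    world: ToyWorld,
    num_steps: int,
    rng: random.Random,
) -> List[Transition]:
    """Run the policy in the world and record (state, action, reward) tuples."""
    trajectory: List[Transition] = []
    state = world.reset()
    for _ in range(num_steps):
        probs = policy(state)
        # Sample an action index according to the policy's distribution.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward, next_state = world.step(action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory


if __name__ == "__main__":
    for transition in collect_rollout(toy_policy, ToyWorld(), 5, random.Random(0)):
        print(transition)
```

Plugging a custom policy and world into an actual TRPO or PPO implementation would then mostly be a matter of adapting these two call signatures to whatever that implementation expects; the sketch makes no claim about what those expectations are.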