Closed by rhofour 6 years ago
Great to have an open-source contribution of this! I don't think any of the instructors will have time to code it up ourselves. The original expert policies were generated using TRPO. An implementation of TRPO is available in the rllab repo (https://github.com/openai/rllab).
Implementations of TRPO in PyTorch:
continuous - https://github.com/ikostrikov/pytorch-trpo
discrete - https://github.com/mjacar/pytorch-trpo
And for homework3 (DQN), a PyTorch implementation - https://github.com/transedward/pytorch-dqn
I've got it working with the provided agents from roboschool. However, I'm waiting on them to add a LICENSE so I can confirm I can actually grab their agents. Once that happens I'll send a pull request.
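For anyone following along, here is a minimal sketch of what "working with the provided agents" could look like in terms of data collection: roll out an expert policy and record the (observation, action) pairs. This is not the code from the pull request. `ToyEnv`, `toy_expert`, and `collect_rollouts` are hypothetical names made up for illustration, the environment is a pure-Python stand-in that only mimics the Gym-style `reset`/`step` interface, and it assumes HW1 needs expert rollouts of this shape.

```python
import random
from typing import Callable, List, Tuple

Observation = List[float]
Action = List[float]


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (reset/step)."""

    def __init__(self, horizon: int = 5, seed: int = 0) -> None:
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.state = 0.0

    def reset(self) -> Observation:
        # Start each episode from a random scalar state.
        self.t = 0
        self.state = self.rng.uniform(-1.0, 1.0)
        return [self.state]

    def step(self, action: Action) -> Tuple[Observation, float, bool, dict]:
        # The action shifts the state; reward is higher closer to zero.
        self.state += action[0]
        self.t += 1
        reward = -abs(self.state)
        done = self.t >= self.horizon
        return [self.state], reward, done, {}


def toy_expert(obs: Observation) -> Action:
    """Hypothetical expert: push the state halfway back toward zero."""
    return [-0.5 * obs[0]]


def collect_rollouts(
    env: ToyEnv,
    policy: Callable[[Observation], Action],
    num_rollouts: int,
) -> Tuple[List[Observation], List[Action], List[float]]:
    """Run the policy and record every (observation, action) pair it produces."""
    observations: List[Observation] = []
    actions: List[Action] = []
    returns: List[float] = []
    for _ in range(num_rollouts):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            act = policy(obs)
            observations.append(obs)
            actions.append(act)
            obs, reward, done, _ = env.step(act)
            total += reward
        returns.append(total)
    return observations, actions, returns


if __name__ == "__main__":
    obs, acts, rets = collect_rollouts(ToyEnv(), toy_expert, num_rollouts=3)
    print(len(obs), len(acts), len(rets))
```

Swapping in a real environment and a real pretrained agent should only require replacing `ToyEnv` and `toy_expert` with objects that expose the same `reset`/`step` and callable-policy shape, though the actual roboschool agent interface may differ from this assumption.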
OpenAI has re-implemented the MuJoCo environments in Bullet, making them available to everyone. It would be really helpful for people following the course after the fact if HW1 could be ported over to use these new environments.
I could probably spend some time working on this, but I'm not sure how I would generate new expert policies.