jingweiz / pytorch-rl

Deep Reinforcement Learning with pytorch & visdom
MIT License

Added opensim-rl environment and a sample configuration and options for a continuous DQN agent to learn in that environment #10

Closed · praveen-palanisamy closed this 7 years ago

praveen-palanisamy commented 7 years ago

opensim-rl is an environment introduced for the NIPS 2017 Learning to Run challenge. In this environment, an agent is tasked with learning how to run while avoiding obstacles on the ground. The environment provides a human musculoskeletal model and a physics-based simulation built on OpenSim, which together make for a realistic, high-dimensional control problem. It will remain useful for training agents on much more complex control tasks even after the NIPS challenge ends, and it can be seen as a good alternative or complement to the MuJoCo-based environments.
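For readers unfamiliar with the environment, a minimal interaction loop might look like the sketch below. It assumes the osim-rl package from the challenge is installed; `RunEnv`, the `difficulty` argument, and the 18-muscle action space follow the NIPS 2017 API, and the Gym-style `action_space.sample()` is assumed to be available.

```python
# Minimal interaction loop with the NIPS 2017 Learning to Run environment.
# Assumes the osim-rl package is installed (pip install osim-rl);
# RunEnv and the difficulty argument follow the 2017 challenge API.
from osim.env import RunEnv

env = RunEnv(visualize=False)
observation = env.reset(difficulty=0)

total_reward = 0.0
for _ in range(200):
    # Actions are 18 continuous muscle excitations in [0, 1].
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
    total_reward += reward
    if done:
        break
print("episode reward:", total_reward)
```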

Contributions:

- Added the opensim-rl environment wrapper (opensim.py)
- Added a sample configuration and options for a continuous DQN agent ("dqn-mlp-con") to learn in that environment

jingweiz commented 7 years ago

Great! Thanks a lot for the contribution! Sorry, I just saw this :P

jingweiz commented 7 years ago

Hey, I wasn't looking carefully yesterday when I merged your PR, and now one thing confuses me: you didn't add any "dqn-mlp-con" model in core/models/, right? If so, the factory would raise an error when it tries to parse this option. Can you double-check? Thanks a lot!

praveen-palanisamy commented 7 years ago

The dqn-mlp-con option uses the DQNMlpModel class defined in core/models/dqn_mlp.py, which is registered with the factory via the ModelDict in factory.py. So it does not produce any error.
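For context, the registration described above boils down to a dictionary lookup. Here is a minimal sketch of that factory pattern; DQNMlpModel and the "dqn-mlp-con" key come from this discussion, while the surrounding structure is illustrative rather than the repository's exact code:

```python
# Minimal sketch of the model factory pattern described above.
# DQNMlpModel and the "dqn-mlp-con" key come from the discussion;
# the surrounding structure is illustrative, not the repo's exact code.
from core.models.dqn_mlp import DQNMlpModel

ModelDict = {
    "dqn-mlp-con": DQNMlpModel,  # continuous-action MLP variant
    # ... other model_type -> class entries ...
}

def model_factory(model_type, args):
    # Fail loudly on an unregistered option instead of a cryptic KeyError.
    if model_type not in ModelDict:
        raise KeyError("Unknown model type: %s" % model_type)
    return ModelDict[model_type](args)
```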

praveen-palanisamy commented 7 years ago

I verified (again) that the code after this PR runs without errors using config option 8 (the opensim configuration); training starts in the opensim environment without any issues.

One minor change, though: I had to move opensim.py to core/envs/opensim.py. I will update the PR with a new commit for the rename. UPDATE: I am not able to update this PR because of the merge/revert. Should I send a new PR for the rename (opensim.py to core/envs/opensim.py)?

jingweiz commented 7 years ago

Hey, thanks a lot for the effort! What I'm a little confused about is that the DQN setup can only work with discrete actions, and all the models whose names do not end in "-con" are for discrete action spaces. So did you mean to use the discrete version? Or is there a misunderstanding somewhere?
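For context on why the discrete/continuous distinction matters: a vanilla DQN head emits one Q-value per discrete action and acts by taking an argmax over that finite vector, which has no direct analogue over a continuous action space. A minimal sketch (illustrative shapes and names, not this repository's exact model):

```python
# Why vanilla DQN needs a discrete action space: the network emits one
# Q-value per action, and acting means taking an argmax over that vector.
# Illustrative sketch only; not the repo's exact model code.
import torch
import torch.nn as nn

class DiscreteDQNHead(nn.Module):
    def __init__(self, state_dim, num_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),  # one Q-value per discrete action
        )

    def forward(self, state):
        return self.net(state)

q = DiscreteDQNHead(state_dim=4, num_actions=2)
state = torch.randn(1, 4)
action = q(state).argmax(dim=1)  # argmax only makes sense over a finite action set
```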

jingweiz commented 7 years ago

Hey, I think the PR is fine if you just don't add that new config. Maybe you can push your old PR again, and then I can easily modify it? Thanks! And sorry for the trouble!

praveen-palanisamy commented 7 years ago

The DQN in this repository (without the Normalized Advantage Function) is not a good fit for the opensim environment; sorry, I should have realized that earlier. I will try to add NAF support later. For now, I will do what you suggested and resubmit the PR without the new config.
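For reference, NAF (Gu et al., 2016) is the standard way to make Q-learning work with continuous actions: it constrains Q(s, a) = V(s) + A(s, a) with a quadratic advantage A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)), so the greedy action is simply mu(s). A minimal sketch of that advantage term (illustrative, not this repository's eventual implementation):

```python
# Sketch of the NAF advantage term A(s, a) = -0.5 (a - mu)^T P (a - mu),
# where P = L L^T is built from a network-predicted lower-triangular L so
# that P is positive semi-definite and argmax_a Q(s, a) = mu(s).
# Illustrative only; not this repository's eventual implementation.
import torch

def naf_advantage(action, mu, L):
    """action, mu: (batch, act_dim); L: (batch, act_dim, act_dim) lower-triangular."""
    P = L @ L.transpose(1, 2)           # positive semi-definite precision matrix
    delta = (action - mu).unsqueeze(2)  # (batch, act_dim, 1)
    return -0.5 * (delta.transpose(1, 2) @ P @ delta).squeeze(2).squeeze(1)

# Q(s, a) = V(s) + A(s, a); the greedy action is mu(s), since A <= 0 and A(mu) = 0.
```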