Closed: praveen-palanisamy closed this issue 7 years ago.
Great! Thanks a lot for the contribution! Sorry just saw :P
Hey,
I wasn't looking carefully yesterday when I merged your PR. I'm now a bit confused: you didn't add any "dqn-mlp-con" model in `core/models/`, right? Then the factory would give an error when it tries to parse this option. Can you double-check? Thanks a lot!
The `dqn-mlp-con` option uses the `DQNMlpModel` class defined in `core/models/dqn_mlp.py`, which is registered with the factory via the `ModelDict` mapping in `factory.py`. So it does not produce any error.
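For context, here is a minimal sketch of the string-to-class registration pattern being described; the names below are stand-ins for the real `ModelDict` and `DQNMlpModel` in the repo, whose details may differ.

```python
# Minimal sketch of the registration pattern described above; DQNMlpModel
# here is a stand-in for the class defined in core/models/dqn_mlp.py.
class DQNMlpModel:
    def __init__(self, args):
        self.args = args

ModelDict = {
    "dqn-mlp-con": DQNMlpModel,  # config string -> model class (key assumed)
}

def build_model(model_type, args):
    # A key missing from ModelDict raises KeyError here, which is the factory
    # error being asked about; a registered key resolves cleanly.
    return ModelDict[model_type](args)

model = build_model("dqn-mlp-con", args={"hidden_dim": 64})
```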
I made sure (again) that the code after the PR runs without errors with config option 8 (corresponding to opensim). Training starts fine with the opensim environment, without any issues.
Though there is one minor change: I had to move `opensim.py` to `core/envs/opensim.py`. I will update the PR with a new commit for the rename. UPDATE: I am not able to update this PR because of the merge/revert. Should I send a new PR for the rename (`opensim.py` to `core/envs/opensim.py`)?
Hey, thanks a lot for the effort! What I'm a little confused about is that the DQN setup can only work with discrete actions, and all the models whose names do not end in "-con" are the ones for discrete action spaces. So did you mean to use the discrete version? Or is there a misunderstanding?
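To make the discrete-actions constraint concrete, here is an illustrative sketch (not repo code) of why vanilla DQN needs a discrete action space:

```python
# Greedy action selection in DQN is an argmax over a finite set of Q-values,
# which has no direct analogue for continuous actions.
import torch

q_values = torch.randn(1, 4)     # Q(s, a) for 4 discrete actions
action = q_values.argmax(dim=1)  # enumerate-and-pick: needs a finite action set
# With continuous actions the argmax is itself an optimization problem;
# extensions such as NAF (Normalized Advantage Functions) address this.
```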
Hey, I think the PR is fine if you simply don't add that new config. Maybe you can push your old PR again, and then I can easily modify it? Thanks! And sorry for the trouble!
The DQN in this repository (without Normalized Advantage Functions) is not a good fit for the opensim environment, since opensim's action space is continuous. Sorry, I should have realized that before. I will try to add that later. For now, I will do what you suggested and submit the PR.
opensim-rl is an environment introduced by the NIPS 2017 "Learning to Run" challenge, in which an agent is tasked with learning to run while avoiding obstacles on the ground. The environment provides a detailed human musculoskeletal model and a physics-based simulation. It will remain useful for training agents on much more complex control tasks even after the NIPS challenge ends, and can be seen as a good alternative, or a complement, to MuJoCo-based environments.
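For anyone unfamiliar with it, here is a minimal usage sketch following the Gym-like interface documented for the NIPS 2017 challenge (`RunEnv`, `reset(difficulty=...)`); the exact API may differ across osim-rl versions.

```python
# Minimal opensim-rl ("osim-rl") loop, assuming the 2017 challenge interface.
from osim.env import RunEnv

env = RunEnv(visualize=False)
observation = env.reset(difficulty=0)
for _ in range(100):
    # Actions are continuous muscle excitations, hence the mismatch with the
    # discrete-action DQN discussed above.
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
    if done:
        observation = env.reset(difficulty=0)
```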
Contributions: