Closed. quanvuong closed this issue 5 years ago
Hey, thanks for reporting this! I'm not sure of the exact software versions now tbh, but I will check this out on my end and see if it works. I was also planning on updating some of the code to Python 3 soon, and making sure everything still works at that point as well.
Did you try more than 1 seed? DDPG is rather unstable and we took the best X out of Y seeds to use as the expert for some of the experiments in the paper.
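For anyone wanting to try this, here is a rough sketch of a "best X out of Y seeds" selection. It is not the code used for the paper; `sweep_seeds`, `select_best_seeds`, and the `train_and_evaluate` callable are hypothetical names, and the callable is assumed to train one expert with the given seed and return its average evaluation return.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def sweep_seeds(
    train_and_evaluate: Callable[[int], float],
    seeds: Iterable[int],
) -> Dict[int, float]:
    """Run one training job per seed and collect the evaluation return.

    `train_and_evaluate` is a hypothetical hook: plug in whatever trains
    the DDPG expert for a given seed and returns its average return.
    """
    return {seed: float(train_and_evaluate(seed)) for seed in seeds}


def select_best_seeds(
    returns_by_seed: Dict[int, float],
    k: int,
) -> List[Tuple[int, float]]:
    """Return the k (seed, return) pairs with the highest return, best first."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    ranked = sorted(
        returns_by_seed.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:k]
```

Usage would look something like `select_best_seeds(sweep_seeds(my_train_fn, range(5)), k=2)`, where `my_train_fn` is your own wrapper around the training script. The values of X and Y are left open here since they are not stated above.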
Thank you for getting back to me!
Let me try other seed values : ).
In particular, I'm wondering if the MuJoCo version used can be posted here, both for OpenAI's mujoco-py package and the MuJoCo downloaded from Todorov? It looks like using MuJoCo 2.0 causes some issues. https://github.com/openai/gym/issues/1541
Sorry for the late reply. Unfortunately, the exact versions were not something I recorded. However, doing some detective work, the original experiments were run in Python 2, meaning the MuJoCo version was likely 1.31 and the mujoco-py version was likely 0.5.7.
I haven't tested whether the MuJoCo versions matter with the current iteration of the code, but nobody has reported any other issues so hopefully not :^)
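If it helps with comparing environments, here is a small sketch for printing the installed versions of the relevant Python packages so they can be pasted into a thread like this one. It assumes Python 3.8+ (for `importlib.metadata`), and the package list is only a guess at what matters here; adjust it to your setup. Note that it reports the mujoco-py package version only, not the version of the MuJoCo binaries themselves.

```python
import platform
from importlib import metadata
from typing import Dict, Iterable

# Assumed list of distribution names worth reporting; edit as needed.
DEFAULT_PACKAGES = ("gym", "mujoco-py", "numpy")


def collect_versions(packages: Iterable[str] = DEFAULT_PACKAGES) -> Dict[str, str]:
    """Map each package name to its installed version string.

    Packages that are not installed are reported as "not installed"
    instead of raising, so the report is always complete.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def format_report(versions: Dict[str, str]) -> str:
    """Render the versions as 'name==version' lines, one per package."""
    return "\n".join(f"{name}=={version}" for name, version in versions.items())
```

Running `print(format_report(collect_versions()))` inside the environment used for training gives a short text report that is easier to diff than a full package export.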
You're so kind to answer questions about a research codebase over an extended period of time, Scott Fujimoto : ).
Thank you for the clean codebase!
I'm trying to reproduce the results in the paper, but I'm unable to obtain an expert with good performance while running the code as is.
I suspect it's because of a software version mismatch. Can you please post the software package configuration that you used to run the code?
Thank you!
The packages in my conda environment can be found below. With these package versions, I was only able to obtain a policy with episode return approximately 230 on Hopper-v1, which is quite low.
conda_packages.pdf