Closed gal-leibovich closed 4 years ago
Thanks for bringing up this issue. Can you try again now? I'm running your testing script but I'm getting consistent averaged episode rewards of >1 (even up to 2-3).
The redundantself.sim.reset()
call is there as a placeholder while we debug the mujoco viewer, which is currently preventing us from being able to actually execute a hard reset as intended.
Hi @cremebrule,
Sure, I would be happy to try again but as I wrote on that other issue, it seems that someone has deleted the v1.0 branch? Was this intended?
Hi @gal-leibovich We moved the v1.0 branch to private for access control prior to the public release. We are in active discussions to figure out the best protocol of sharing a pre-release with limited external users, including you. As you have access to a copy of the v1.0 branch, you are more than welcome to submit any further issues that you spot. We really appreciate your help with testing the v1.0 branch. Thanks!
Hi @yukezhu, thanks a lot for being so open and sharing!
As I'm actively working and developing on top of the v1.0 branch, I would be happy to continue to test it out and submit issues and/or PRs to help out with the development, as I really appreciate the great work that you guys are doing.
If possible, I would be very happy to get access to the pre-release repo so to continue getting the latest bug-fixes, testing new commits as these are pushed and submitting PRs/issue where appropriate.
closing this for now -- once we get v1.0 released, feel free to open up another issue if you're still having these regression issues!
Hi,
Lately, I have started looking into the v1.0 branch. Generally it looks very good, and the refactoring being made is making it much easier to work with Robosuite. Also, all the newly added robots are very neat. Kudos! :-)
I did notice a regression in running one of my tests training a simple RL agent on the v1.0 branch, though. Specifically, starting with commit da27201, which is enforcing simulation reset with all the environments here (btw it would have made sense to get it out of the if-else clause, as it is being executed in both cases) - https://github.com/StanfordVL/robosuite/blob/92c398f321c1837eb86601859515a5721a7eb2d9/robosuite/environments/base.py#L158
Now, it's worth mentioning that I am easily training an agent with
master
or with thev1.0
(up until before that commit) or withv1_dr
branches.The regression in rewards can be reproduced by running the following snippet -
Running this snippet before the enforced simulation reset that was added would return episode rewards of ~1, while after the change this would return rewards of ~0.2.