Evaluation results are always 0% success rates.

facebookresearch / r3m

Pre-training Reusable Representations for Robotic Manipulation Using Diverse Human Video Data

https://sites.google.com/view/robot-r3m/

MIT License


Evaluation results are always 0% success rates. #16

Closed jasonseu closed 2 years ago

jasonseu commented 2 years ago

Hi, thanks for your great work. About a month ago I configured all the environments required by Franka Kitchen and Adroit, and the training results looked similar to those in the paper. However, I recently reconfigured the environment and ran the training command again. The losses decreased normally, but the success rates are always 0.0%, on both the kitchen_sdoor_open-v3 task and the pen-v0 task. I have tried my best to figure out this problem, without success. Is this related to the recent updates to the mj_envs repository? Any help would be appreciated.

vikashplus commented 2 years ago

Can you check that you are using the correct branch (stable) of mj_envs?
If you are on the correct branch, you can try rolling it back a few commits to see when it last worked for you. There are 3 small bug-fixes which are largely no-ops. It's likely we missed something.

@suraj-nair-1 -- can you try this at our end as well?

suraj-nair-1 commented 2 years ago

Looking at the commits, my guess is that the env updates may have impacted something in the visuals/physics, which would produce a distribution shift between the R3M demo data and the evaluation env. I'm not sure yet; I'll take a closer look and try it out in a bit.

vikashplus commented 2 years ago

I did some digging --

This is a renaming operation: https://github.com/vikashplus/mj_envs/commit/c2887e2e2f22ede5de1fd8eed5b69e05bb64c257 which was causing issues on some operating systems.
This is handling the corner case when values are 0: https://github.com/vikashplus/mj_envs/commit/dfbc3c19d61e1dc24f246a0532a17634aa1e9b73
A version bump happened here on one of the submodules: https://github.com/vikashplus/mj_envs/commit/c58ba644271f43efcb38e5d308913818ee7e9ba7 -- so this is where my suspicion lies. It does make a difference in the hue, and if the demos are pre-recorded, this will cause an issue.
But this won't explain why the pen-v0 task faced issues, as pen has no dependency on this submodule.
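
One rough way to check for the hue difference mentioned above is to compare the average hue of a pre-recorded demo frame with a freshly rendered frame. This is only a minimal sketch using the Python standard library. The function names `mean_hue` and `hue_shift` are hypothetical, and it assumes you have already loaded both frames as sequences of 8-bit `(r, g, b)` tuples; how you load them is up to you.

```python
import colorsys
import math


def mean_hue(pixels):
    """Circular mean hue in [0, 1) of (r, g, b) tuples with 0-255 channels.

    Returns None when no pixel carries any colour (all grey or black).
    """
    sx = sy = 0.0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Weight by saturation * value: grey and black pixels have no
        # meaningful hue, so they should contribute (almost) nothing.
        w = s * v
        sx += w * math.cos(2 * math.pi * h)
        sy += w * math.sin(2 * math.pi * h)
    if sx == 0.0 and sy == 0.0:
        return None
    return (math.atan2(sy, sx) / (2 * math.pi)) % 1.0


def hue_shift(pixels_a, pixels_b):
    """Smallest distance between the two mean hues on the hue circle (0 to 0.5)."""
    a, b = mean_hue(pixels_a), mean_hue(pixels_b)
    if a is None or b is None:
        return None
    d = abs(a - b)
    return min(d, 1.0 - d)
```

A value near 0 suggests the two frames have similar overall colouring, while a clearly larger value hints at the kind of visual change described here. It is a coarse signal only and says nothing about physics changes.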

@suraj-nair-1: I rolled it back on a new test branch https://github.com/vikashplus/mj_envs/tree/stable-bugtest to help with testing. Let's confirm that rolling back this version bump fixes things, and then we will revert the stable branch as well.

suraj-nair-1 commented 2 years ago

Yeah I just confirmed, things were working fine for me on this commit. When I re-cloned and installed the latest version of stable (and latest sub-modules), it no longer works and I indeed get the kitchen with completely different visuals. The demos are pre-collected so this will cause issues.

So I think reverting back to 8c8b68c9a77981ca3bd3ff13f114f13a783ce2bf should resolve this.
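
To check whether a local checkout is on that commit, something like the following could be used. It is a minimal sketch, not part of R3M or mj_envs: the helper names `current_commit` and `matches` are hypothetical, and it assumes `git` is on the PATH and that you pass the path to your own mj_envs clone.

```python
import subprocess

# Known-good mj_envs commit quoted in this thread.
KNOWN_GOOD = "8c8b68c9a77981ca3bd3ff13f114f13a783ce2bf"


def current_commit(repo_dir):
    """Return the full hash of HEAD for the git checkout at repo_dir."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return result.stdout.strip()


def matches(commit, expected=KNOWN_GOOD):
    """True if commit is the expected hash or a prefix of it (7+ characters)."""
    commit = commit.strip().lower()
    return len(commit) >= 7 and expected.startswith(commit)


# Example usage (the path is an assumption; point it at your own clone):
# print(matches(current_commit("path/to/mj_envs")))
```

Note that this only checks the top-level repository. Since the visual change came from a submodule version bump, the submodules need to be at the versions recorded by that commit as well.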

suraj-nair-1 commented 2 years ago

Also tested the new stable-bugtest branch and it behaves as expected.

vikashplus commented 2 years ago

@suraj-nair-1 -- for both the kitchen and the pen-v0 tasks? Once I get a thumbs up, I'll merge stable-bugtest into stable.

suraj-nair-1 commented 2 years ago

Just tested, I also get the expected performance for pen-v0. I think you should be good to merge.

vikashplus commented 2 years ago

Fixed in 8d125b5dbf48118cf34915ec2351608db469882b