Closed: jasonseu closed this issue 2 years ago.
Hello, yeah, this is not expected. While there can be variability between checkpoints, getting only 0% or 100% is not what I observed.
First, could you share the exact command you are running? How many demos, which representation, etc.?
Also, you followed the installation setup instructions in the README here, correct? Did you try the "Verifying Correct Installation" commands and observe the expected numbers listed there (~60% for R3M and ~30% for CLIP)? If you are not getting those numbers, then something is likely wrong with the installation. For the Franka Kitchen, did you make sure to add the line `FIXED_ENTRY_POINT = RANDOM_DESK_ENTRY_POINT` in mj_envs to use the Randomized Desk?
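For reference, this is a one-line change inside your mj_envs checkout. The file path in the sketch below is an assumption (use whichever file the README actually points to); the idea is that the kitchen tasks are registered through a fixed-desk entry point by default, and the override makes registration use the randomized desk instead:

```python
# Hypothetical location -- e.g. mj_envs/envs/relay_kitchen/__init__.py
# (use the file the README actually points to in your mj_envs checkout).
# The kitchen envs are registered via FIXED_ENTRY_POINT by default;
# this one-line override makes them use the Randomized Desk instead.
FIXED_ENTRY_POINT = RANDOM_DESK_ENTRY_POINT
```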
Yes, this looks like an environment version mismatch issue. Please follow the README instructions exactly, as Suraj mentioned, to get the correct versions of the environments for evaluation.
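If it helps, here is a minimal sanity check (a sketch, assuming mj_envs registers its environments on import and that you are on an older gym where `gym.envs.registry.env_specs` is a dict) to confirm the v3 kitchen environments from this thread are actually available after reinstalling:

```python
# Hypothetical sanity check: confirm the expected v3 Franka Kitchen envs are
# registered after installing the pinned mj_envs version from the README.
import gym
import mj_envs  # noqa: F401  (importing mj_envs registers the kitchen envs)

for env_id in ["kitchen_light_on-v3", "kitchen_ldoor_open-v3", "kitchen_sdoor_open-v3"]:
    assert env_id in gym.envs.registry.env_specs, f"{env_id} is not registered"
    env = gym.make(env_id)
    print(env_id, "OK, obs shape:", env.reset().shape)
    env.close()
```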
Hi Suraj, thanks for your quick reply. I used the exact same commands you suggested in the README for verifying correct installation. I followed your suggestion and added the line `FIXED_ENTRY_POINT = RANDOM_DESK_ENTRY_POINT` in mj_envs, and now the evaluation results look normal. Are they similar to the results of your experiments?
Yup that looks right to me!
Hi, thanks for your awesome work. I followed the steps in the README and successfully trained the policy network; the loss decreases to a very low level. However, as shown in the figure, the evaluation results show that the max success rates are either 0% or 100% on benchmarks like kitchen_light_on-v3, kitchen_ldoor_open-v3, and kitchen_sdoor_open-v3. Is this reasonable? Or do you have any suggestions?