Closed eugeniafeng closed 4 months ago
Hi, may I ask which model you aim to test? Currently, we only support testing pre-trained reward models through RL training, meaning you need to run the RL scripts to test them. You can download the pre-trained models listed in the README and then follow our instructions.
If you aim to test RL models, you may need to write some additional code.
For the Metaworld script, refer to #1 to make sure you have installed the version of Metaworld used in the paper.
Thank you for the response! I have downloaded the pre-trained models in the README. Are the RL scripts to test these models also included in the README?
I have installed the specified version of Metaworld, but the script still seems to be broken. Thank you!
Hi, I think you don't need to run the MW script if you have downloaded the pre-trained models. You can directly run the RL scripts (included in the README) to test the models. (We provide a configuration for the Assembly task for your test.)
Regarding the failure to run the MW script, I think this may be caused by an issue in MetaWorld itself, which is outside the scope of this repo. But let us know more details if it still gets stuck after your attempts.
Hi, could you point me to which script in the README tests the models? They all look like training scripts based on the names.
Hi, we test reward models by training RL with them. So, in this sense, the RL scripts are the test scripts.
If you aim to test the reward models directly (e.g., calculating learned rewards given a trajectory), you may write additional scripts that call this function.
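For readers who want to try the direct route, here is a minimal sketch of what such an additional script could look like. It is not code from this repo: `score_trajectory`, `reward_fn`, and `context_len` are hypothetical names, and `dummy_reward` is only a stand-in for whatever call the loaded reward model actually exposes. The sketch assumes the reward is computed from a short window of recent frames.

```python
import numpy as np


def score_trajectory(frames, reward_fn, context_len=1):
    """Apply a reward function to every step of a trajectory.

    frames      : array of shape (T, H, W, C), one image per time step
    reward_fn   : hypothetical callable mapping a window of frames to a scalar
    context_len : number of most recent frames passed to reward_fn (assumption)
    """
    frames = np.asarray(frames)
    rewards = []
    for t in range(len(frames)):
        # Window of up to `context_len` frames ending at step t.
        start = max(0, t - context_len + 1)
        window = frames[start:t + 1]
        rewards.append(float(reward_fn(window)))
    return np.array(rewards)


def dummy_reward(window):
    """Placeholder reward: mean pixel value of the latest frame."""
    return window[-1].mean()


if __name__ == "__main__":
    # Four synthetic 64 x 64 RGB frames with increasing brightness.
    trajectory = np.stack(
        [np.full((64, 64, 3), i, dtype=np.float32) for i in range(4)]
    )
    print(score_trajectory(trajectory, dummy_reward, context_len=2))
```

To use it with a real model, `dummy_reward` would be replaced by a wrapper around the pre-trained reward model, with frames preprocessed the same way as during RL training.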
Hello, when I try to run the RL script in the README on tasks from Metaworld (assembly-v0 or coffee-push-v0), I run into a dimension mismatch error in the evaluation. I was able to fix this by downsampling the image in two locations in the code. However, now when I run the coffee push task, after 1,760,000 training frames, I get a video like the one shown below:
https://github.com/TEA-Lab/diffusion_reward/assets/46321065/9a5437b4-88e7-4d17-97f6-7179c43e9b96
Is this the expected behavior? It seems as though it should be able to learn to push the cup upright in fewer training frames. Similarly, the assembly task after over 2 million training frames appears as below:
https://github.com/TEA-Lab/diffusion_reward/assets/46321065/a93df2a3-72f4-461d-a941-f96d259507df
Thank you!
Hi @eugeniafeng , apologies for the delayed response due to some urgent deadlines, and thank you for your patience.
As for the dimension mismatch issue you encountered, it seems that you need to set the image resolution to 64 x 64 when initializing. Based on the video you provided, the image resolution is far larger than 64. However, this may not be the main reason for your following two videos.
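If the environment cannot be initialized at 64 x 64 directly, frames can be downsampled before they reach the model, which is the kind of workaround described earlier in the thread. The following is a minimal sketch using block averaging in NumPy. `downsample_to` is a hypothetical helper, not a function from this repo or from MetaWorld, and it assumes square-divisible frames of shape (H, W, C) such as 256 x 256 renders.

```python
import numpy as np


def downsample_to(frame, size=64):
    """Downsample an (H, W, C) image to (size, size, C) by block averaging.

    Assumes H and W are integer multiples of `size`.
    """
    h, w, c = frame.shape
    if h % size or w % size:
        raise ValueError("frame height and width must be multiples of size")
    fh, fw = h // size, w // size
    # Split each spatial axis into (size, factor) and average over the factors.
    blocks = frame.reshape(size, fh, size, fw, c).astype(np.float64)
    out = blocks.mean(axis=(1, 3))
    if np.issubdtype(frame.dtype, np.integer):
        out = np.rint(out)  # round before casting back to an integer dtype
    return out.astype(frame.dtype)


if __name__ == "__main__":
    big = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
    small = downsample_to(big)
    print(small.shape, small.dtype)
```

Setting the resolution at initialization, as suggested above, is still preferable where possible, since it avoids rendering at a higher resolution only to discard the detail.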
For video 1, yes the robot arm may behave like this, and this actually is not what we expect. MetaWorld does not provide rewards for keeping the cup upright, so it is hard to control the behavior. Interestingly, I checked our video logs and found that with fewer training frames, the arm tends to knock the cup down, moving at high speed. In contrast, if you train it longer, the movement of the arm will slowly become gentle.
For video 2, it seems that you did not train for enough frames. We trained Assembly for 15M frames. I kindly refer you to Figure 7 in our paper.
If you have more questions, feel free to add a comment :)
Hello! I am looking to reproduce the results in the paper, but I do not see any scripts for evaluating or testing the model. There is a link to a Metaworld script (demo_sawyer.py) in the README, but the Metaworld repository appears to be missing the imports that this script requires. Could you advise me on how to test the model? Thank you.