gemcollector / RL-ViGen

This is the repo for "RL-ViGen: A Reinforcement Learning Benchmark for Visual Generalization".

Question about the Adroit environment? #12

Closed: coderlemon17 closed this issue 7 months ago

coderlemon17 commented 7 months ago

Hi, thanks for your great work. However, I'm having trouble finding the Dexterous Hand Manipulation experiments mentioned in Sec. 4.1.3 of the paper. I searched the whole repo for Adroit / color_hard, but I could not find any code related to those experimental results.

I noticed there's a separate branch called ViGen-adroit; do I have to check out that branch to use the generalization environment?

gemcollector commented 7 months ago

Hi there! Thanks for your question. Yes, you should use the other branch to run the dexterous hand environment. The dexterous hand environment is based on the VRL3 algorithm, and it uses demonstrations to learn these tasks. For simplicity and reproducibility, I created a separate branch for this environment. :)
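
For reference, switching over is just a branch checkout (assuming you have already cloned the repo and the branch exists on the remote under the name mentioned above):

```bash
git fetch origin
git checkout ViGen-adroit
```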

coderlemon17 commented 7 months ago

Hi, thanks for your reply, I have a few more questions.

  1. Since VRL3 is a 3-stage training framework, do all the algorithms reported in Fig. 6 (i.e., the aggregated generalization score of dexterous manipulation) utilize this framework (and the additional demonstration data), or only VRL3?
  2. I noticed that Robosuite also has a different-visual-appearance setting (Appendix D.1.1), but there are no experiments reported on it. Do you happen to have the generalization results of the different baselines in this setting too?
  3. For Appendix E.2 Wall Time, I'm not sure how the FPS is calculated. Is it total_frame_collected / total_training_time, or frame_in_a_batch / time_to_calculate_a_batch? Also, shouldn't this metric depend heavily on the hardware the algorithms are evaluated on? (i.e., different CPUs and GPUs might yield significantly different results.)

gemcollector commented 7 months ago

Thanks for your question!

  1. As mentioned in the Appendix, we only use stage 3 for training; you can check the code. Both the original paper and our experiments find that stage 3 alone is enough to achieve a good policy on our tasks.
  2. Yeah, we also show the results of the different baselines in the Appendix (Figure 19).
  3. You can check logger.py for the details of how the FPS is calculated, and the FPS indeed depends on your hardware. I tested everything on a Tesla A40.
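
Roughly, the FPS is the number of environment frames collected in a logging interval divided by the wall-clock time of that interval, so it is closer to total_frame_collected / total_training_time (measured per interval) than to per-batch compute time. Here is a minimal sketch of that idea, not the exact logger.py code; the FPSMeter name and the action_repeat default are illustrative assumptions:

```python
import time

class FPSMeter:
    """Minimal sketch of an interval-based FPS counter.

    FPS = environment frames collected since the last log,
    divided by the wall-clock time elapsed since the last log.
    """

    def __init__(self):
        self._last_time = time.time()
        self._frames = 0

    def record_steps(self, num_env_steps, action_repeat=2):
        # With frame skip, each env step consumes `action_repeat` frames.
        self._frames += num_env_steps * action_repeat

    def dump(self):
        # Return the FPS over the interval and reset the counters.
        now = time.time()
        fps = self._frames / max(now - self._last_time, 1e-8)
        self._last_time, self._frames = now, 0
        return fps
```
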
coderlemon17 commented 7 months ago

Thank you for your reply.

  1. Sorry, I missed Fig. 19; it's indeed what I was looking for. However, if possible, could you kindly provide the results for each separate task, and maybe the data used for plotting? We would like to cite it in our paper.

  2. Also, about the evaluation pipeline for Robosuite in Fig. 19: I haven't found a script for it. Based on my understanding (sketched in code after this list):

    • First, you train the agent in one fixed scene in train mode.
    • Then you evaluate the generalization performance over 100 trials (10 trials in each scene under eval-$DIFFICULTY).
  3. I think there's a small typo here, where eval_easy should be eval-easy.
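
In code, my understanding of the pipeline in point 2 is roughly the following. This is purely a hypothetical sketch: make_env, its mode/scene_id arguments, and the agent interface are my guesses, not the repo's actual API.

```python
# Hypothetical sketch of the evaluation protocol as I understand it;
# make_env and its arguments are placeholders, not the repo's real API.
NUM_SCENES = 10
TRIALS_PER_SCENE = 10  # 10 scenes x 10 trials = 100 evaluation trials


def evaluate(agent, make_env, difficulty="easy"):
    returns = []
    for scene_id in range(NUM_SCENES):
        # Each eval scene uses the eval-$DIFFICULTY visual setting.
        env = make_env(mode=f"eval-{difficulty}", scene_id=scene_id)
        for _ in range(TRIALS_PER_SCENE):
            obs, done, ep_return = env.reset(), False, 0.0
            while not done:
                action = agent.act(obs, eval_mode=True)
                obs, reward, done, info = env.step(action)
                ep_return += reward
            returns.append(ep_return)
    return sum(returns) / len(returns)  # aggregated generalization score
```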

gemcollector commented 7 months ago

Thanks for checking! Could you provide me with your email (or you can send me an email)? I can give you the results, models, data, etc.