real-stanford / diffusion_policy

[RSS 2023] Diffusion Policy Visuomotor Policy Learning via Action Diffusion
https://diffusion-policy.cs.columbia.edu/
MIT License

Questions about reproducing on a real UR5 robot #41

Open yolo01826 opened 6 months ago

yolo01826 commented 6 months ago

Hello @cheng-chi, thank you very much for your work. I currently want to reproduce Diffusion Policy on a real UR5 robot, but I still have a few questions. Could you give me some advice?

  1. I noticed that you recorded more than a hundred demonstrations for the Push-T task, and I tried training on them directly with your code. However, each epoch takes about 10 minutes on my machine, so the 600 epochs in the configuration file would take a very long time. Is this normal? What is the minimum number of epochs needed to train a task? I ask because you seem to be able to train a policy in only 12 hours. My GPU is an RTX 3060 12 GB.
  2. I also noticed that you do not provide demonstration data for the cup-righting and spilling tasks. If I want to train a brand-new task, is it enough to follow the same procedure with the scripts you provide on GitHub? Do these different tasks share the same action space? Thank you, and looking forward to your reply! 🥺
cheng-chi commented 6 months ago

Hi @yolo01826

  1. That training speed does sound a little slow to me, but it's likely due to the difference between your RTX 3060 and the 3090s we used. You usually only need ~200 epochs to reach peak performance (see the override sketch after this list). The experiments in the paper were significantly overtrained on purpose, to show that diffusion policy is not prone to overfitting even when trained for much longer.
  2. Unfortunately, the data for the cup and sauce-spreading tasks were collected at TRI with their proprietary software infrastructure and therefore can't be open sourced. The action space is a 6-DoF absolute end-effector pose, similar to what's used in the robomimic experiments; a small illustration of that format follows below.
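
On the epoch count: a minimal sketch of how one might cap training at ~200 epochs with a Hydra command-line override. The config name, the `training.num_epochs` key, and the output-dir pattern are assumptions based on the repo's training configs; verify them against your checkout:

```bash
# Assumed config name and keys; adjust to match your local configs.
python train.py --config-dir=. --config-name=image_pusht_diffusion_policy_cnn.yaml \
    training.seed=42 \
    training.device=cuda:0 \
    training.num_epochs=200 \
    hydra.run.dir='data/outputs/${now:%Y.%m.%d/%H.%M.%S}_${name}_${task_name}'
```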
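
On the action space: a hedged Python sketch of what a 6-DoF absolute end-effector-pose action vector could look like. The axis-angle rotation parameterization and the helper name `make_ee_pose_action` are illustrative assumptions, not the repo's exact convention (some configs use other rotation representations):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def make_ee_pose_action(position_xyz, rotation_matrix):
    """Pack an absolute EE pose into a flat 6-vector [x, y, z, rx, ry, rz].

    Rotation is encoded here as an axis-angle vector (radians); this
    parameterization is an assumption, not the repo's fixed convention.
    """
    rotvec = Rotation.from_matrix(rotation_matrix).as_rotvec()
    return np.concatenate([np.asarray(position_xyz, dtype=np.float64), rotvec])

# Example: a pose 0.4 m in front of the robot base with the gripper pointing down.
action = make_ee_pose_action(
    position_xyz=[0.4, 0.0, 0.2],
    rotation_matrix=Rotation.from_euler("xyz", [np.pi, 0.0, 0.0]).as_matrix(),
)
print(action)  # 6 numbers: xyz position followed by an axis-angle rotation
```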