Hi, thanks for your interest! Our graphics cards are 40GB A100s.
For CALVIN, we used 6 GPUs. Training took us 22 hours for 65,000 iterations.
For the PerAct setup on RLBench, we used 6-7 GPUs. Training took us 6.5 days for 600,000 iterations.
For the GNFactor setup on RLBench, we used 4 GPUs. Training took us 2.7 days for 600,000 iterations.
Hope this information helps.
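If it helps with budgeting your own runs, the numbers above work out to roughly the following per-iteration wall-clock times. This is just back-of-the-envelope arithmetic in Python over the reported totals, not values measured from our logs:

```python
# Rough per-iteration wall-clock estimates derived from the reported
# GPU counts, total training time, and iteration counts above.
setups = {
    "CALVIN (6x A100)":             (22 * 3600,        65_000),
    "RLBench, PerAct setup (6-7x A100)": (6.5 * 24 * 3600, 600_000),
    "RLBench, GNFactor setup (4x A100)": (2.7 * 24 * 3600, 600_000),
}

for name, (seconds, iters) in setups.items():
    print(f"{name}: ~{seconds / iters:.2f} s per training iteration")
# CALVIN ~1.22 s/iter, PerAct setup ~0.94 s/iter, GNFactor setup ~0.39 s/iter
```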
Thanks for your quick reply! It's genuinely helpful!
Hi,
Thank you for the quick response. Could you add some ablation experiments? For example, why is len(gripper_history) == 3? In the model design, why is there this particular attention mechanism between gripper, action, context, and intrinsics? In the trajectory, why are there 20 interpolated points between two key poses (i.e., a dense interpolation like the sketch below)? Some of these variables and parts of the model structure are not very clear to me.
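For context, by "interpolated points" I mean something like the following. This is my own illustrative NumPy sketch, not code from this repo, and it only interpolates positions (a real implementation would also handle rotations, e.g. with quaternion slerp, and the gripper open/close state):

```python
import numpy as np

def interpolate_positions(pose_a, pose_b, num_points=20):
    """Return `num_points` positions between pose_a and pose_b.

    alpha=0 (pose_a itself) is skipped; the last point coincides with pose_b.
    """
    pose_a, pose_b = np.asarray(pose_a), np.asarray(pose_b)
    alphas = np.linspace(0.0, 1.0, num_points + 1)[1:]
    return pose_a[None, :] + alphas[:, None] * (pose_b - pose_a)[None, :]

# Example with two arbitrary consecutive target gripper positions:
traj = interpolate_positions([0.32, -0.41, 0.97], [0.37, -0.18, 0.89])
print(traj.shape)  # (20, 3)
```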
Hi,
We found that on RLBench, including gripper history helps resolve ambiguity in predicting the target gripper pose. Take the stack_wine task as an example; the table below shows the proprioception and the target gripper position at certain key frames:
| Key frame | Target gripper pose | Proprio gripper pose |
|---|---|---|
| T=0 | [ 0.2668, -0.4018, 0.9723] | [ 0.2785, -0.0082, 1.4719] |
| T=1 | [ 0.3159, -0.4108, 0.9724] | [ 0.2668, -0.4018, 0.9723] |
| T=2 | [ 0.3160, -0.4108, 0.9961] | [ 0.3159, -0.4108, 0.9724] |
| T=3 | [ 0.3705, -0.1827, 0.8914] | [ 0.3160, -0.4108, 0.9961] |
| T=4 | [ 0.4064, 0.0127, 0.8920] | [ 0.3705, -0.1827, 0.8914] |
As you can see, for the second and third key frames the target gripper poses are different even though the two key frames have nearly the same proprioception.
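To make this concrete, here is a minimal sketch (hypothetical names, not our actual implementation) of how a length-3 gripper history gives such key frames distinct conditioning even when their current proprioception is similar:

```python
from collections import deque

# Keep the last 3 proprioceptive gripper poses as conditioning. Key frames
# with similar *current* proprioception still receive distinct inputs,
# because the earlier poses in their histories differ.
gripper_history = deque(maxlen=3)

proprio_stream = [
    [0.2785, -0.0082, 1.4719],  # T=0
    [0.2668, -0.4018, 0.9723],  # T=1
    [0.3159, -0.4108, 0.9724],  # T=2
    [0.3160, -0.4108, 0.9961],  # T=3
]

for t, proprio in enumerate(proprio_stream):
    gripper_history.append(proprio)
    # The policy would be conditioned on the full history, not just `proprio`.
    print(f"T={t}: history = {list(gripper_history)}")
```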
Due to limited computational resources, we have not been able to ablate every aspect or thoroughly search the hyper-parameters. Most of the architectural design choices and hyper-parameters were educated guesses on our part. That said, we believe the code base is well written and clearly documented, so we would encourage you to try different settings. You might get better performance!
Thanks. I tested the weights you released (https://huggingface.co/katefgroup/3d_diffuser_actor/blob/main/diffuser_actor_calvin.pth), and there are slight differences from the results in the paper. Is this normal?
| checkpoint | number | task1 | task2 | task3 | task4 | task5 | avg seq len |
|---|---|---|---|---|---|---|---|
| paper | 1000 | 92.2% | 78.7% | 63.9% | 51.2% | 41.2% | 3.27 |
| test1 | 1000 | 90.6% | 76.5% | 61.8% | 49.1% | 39.3% | 3.173 |
| test2 | 1000 | 91.3% | 76.9% | 61.3% | 49.1% | 38.1% | 3.167 |
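(For reference, the avg seq len column is just the sum of the five per-task success rates, since each task column reports the fraction of evaluation chains that complete at least that many tasks in a row:)

```python
# avg seq len = sum of per-step success rates: the task-i column is the
# fraction of chains that complete at least i consecutive tasks.
for name, rates in {
    "test1": [0.906, 0.765, 0.618, 0.491, 0.393],
    "test2": [0.913, 0.769, 0.613, 0.491, 0.381],
}.items():
    print(name, round(sum(rates), 3))  # test1 -> 3.173, test2 -> 3.167
```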
Hi, we have also observed this performance difference. Our conjecture is that the inverse kinematics (IK) solver introduces some noise, resulting in varying performance across runs. We reported the run with the highest performance in the paper.
Hi, did you encounter the following problem in RLBench multi-GPU training: `blosc_extension.error: Error 33423360 : 'not a Blosc buffer or header info is corrupted'`? If so, could you tell me how you solved it? This problem has been troubling me for a long time, thank you!
Hi, thanks for your great work! I'm curious about the specific training details for RLBench and CALVIN. Could you kindly share the number of GPUs used for these tasks, as well as the training duration for each? Your insights would be invaluable to me, thanks a lot!