nickgkan / 3d_diffuser_actor

Code for the paper "3D Diffuser Actor: Policy Diffusion with 3D Scene Representations"
https://3d-diffuser-actor.github.io/
MIT License

Update checkpoint for Calvin benchmark #48

Closed CeHao-NUS closed 1 month ago

CeHao-NUS commented 1 month ago

Thanks for the wonderful work! The Important note says that quaternion_format has been changed. Could you please also provide an updated checkpoint trained with the new quaternion_format?

We tested the current pre-trained weights, and the performance is slightly lower than that reported in the arXiv paper. It would be great if we could evaluate the correct checkpoint. Thanks a lot.

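This is the reproduction we did with 4 seeds. Columns 1 to 5 give the rate of tasks completed in a row; the last column is Avg. Len.

| | 1 | 2 | 3 | 4 | 5 | Avg. Len |
|---|---|---|---|---|---|---|
| seed0 | 89.8 | 75.2 | 60.7 | 48.6 | 38.0 | 3.123 |
| seed1 | 90.0 | 75.6 | 62.2 | 50.0 | 40.8 | 3.186 |
| seed2 | 89.5 | 76.9 | 62.6 | 48.9 | 39.0 | 3.169 |
| seed3 | 89.8 | 75.6 | 62.2 | 50.2 | 38.8 | 3.166 |
| mean±std | 89.8±0.2 | 75.8±0.6 | 61.9±0.7 | 49.4±0.7 | 39.2±1.0 | 3.161±0.023 |
| Paper reported | 93.8±0.01 | 80.3±0.0 | 66.2±0.01 | 53.3±0.02 | 41.2±0.01 | 3.348±0.04 |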
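
For anyone re-deriving the mean±std row, here is a minimal sketch using only the Python standard library. The input values are copied from the per-seed rows above; the helper name `summarize` is hypothetical. The reported spread appears to match the population standard deviation (dividing by N rather than N-1), which is an inference from the numbers and not something stated in this thread.

```python
from statistics import mean, pstdev

# Per-seed values copied from the rows above.
first_task = [89.8, 90.0, 89.5, 89.8]    # column "1"
avg_len = [3.123, 3.186, 3.169, 3.166]   # column "Avg. Len"


def summarize(values, digits):
    """Format mean±std using the population standard deviation (assumption)."""
    return f"{mean(values):.{digits}f}±{pstdev(values):.{digits}f}"


print(summarize(first_task, 1))  # 89.8±0.2
print(summarize(avg_len, 3))     # 3.161±0.023
```

With the sample standard deviation (`statistics.stdev`) the Avg. Len spread would come out near 0.027 instead, so the choice of estimator matters when comparing against the table.
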
nickgkan commented 1 month ago

Hi, thanks for your interest in our work and for trying to reproduce our results!

The quaternion format does not affect the CALVIN model. If you see the important note it says "Our released model weights of 3D Diffuser Actor assume input quaternions are in wxyz format. Yet, we didn't notice that CALVIN and RLBench simulation use different quaternion formats (wxyz and xyzw)." The quaternion format was always right on CALVIN. The above sentence meant that CALVIN and RLBench use different formats, yet we treated everything the same using wxyz, which is what CALVIN expects. We apologize for the lack of clarity of this sentence and we will update it.
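
To make the two conventions concrete: both store the same four numbers and differ only in where the scalar part w sits. Below is a minimal sketch of the reordering. The helper names `xyzw_to_wxyz` and `wxyz_to_xyzw` are hypothetical and are not taken from the repository.

```python
from typing import Sequence, Tuple

Quat = Tuple[float, float, float, float]


def xyzw_to_wxyz(q: Sequence[float]) -> Quat:
    """Move the scalar part from the last position to the first."""
    x, y, z, w = q
    return (w, x, y, z)


def wxyz_to_xyzw(q: Sequence[float]) -> Quat:
    """Move the scalar part from the first position to the last."""
    w, x, y, z = q
    return (x, y, z, w)


# The identity rotation has scalar part 1 and a zero vector part.
identity_xyzw = (0.0, 0.0, 0.0, 1.0)
identity_wxyz = xyzw_to_wxyz(identity_xyzw)
print(identity_wxyz)  # (1.0, 0.0, 0.0, 0.0)
```

Feeding a quaternion in one order to code that expects the other does not raise an error, since both are length-4 vectors; it silently yields a different rotation, which is why such mix-ups are easy to miss.
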

That said, we do have a better model which is what we report in the paper. We are going to release it in the next few hours.

CeHao-NUS commented 1 month ago

Thanks for the reply. I understand now that the quaternion formats are correct. I would really appreciate it if you could also release the best checkpoint. Your work is excellent and has almost solved the CALVIN challenge!

twke18 commented 1 month ago

Hi,

Thanks for your interest! Please try this script to test our new model, which does not condition on history end-effector poses. The released model weights can be downloaded from here.

CeHao-NUS commented 1 month ago

Yes. Great! I will test this one and report my results. Thanks a lot. But the script link does not exist.

nickgkan commented 1 month ago

The URL @twke18 pasted had a typo. Use this one instead: https://github.com/nickgkan/3d_diffuser_actor/blob/master/scripts/train_trajectory_calvin_nohistory.sh

CeHao-NUS commented 1 month ago
  1. I downloaded the weights from here: https://huggingface.co/katefgroup/3d_diffuser_actor/blob/main/diffuser_actor_calvin_nohistory.pth

  2. I want to run the pre-trained weights, and this is the file path of the weights in the script: https://github.com/nickgkan/3d_diffuser_actor/blob/f9a719e88f25542655c3db101f35e0a67713745e/scripts/train_trajectory_calvin_nohistory.sh#L93

     So I need to either rename the file or change the directory.

  3. I ran the 'test' part of the nohistory script: https://github.com/nickgkan/3d_diffuser_actor/blob/f9a719e88f25542655c3db101f35e0a67713745e/scripts/train_trajectory_calvin_nohistory.sh#L69

     But it raised an error. I am not sure whether the weights and the code are mismatched. Still checking.


```
task: take the blue block and rotate it to the right
  0%|                                                                                                                                   | 0/60 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "online_evaluation_calvin/evaluate_policy.py", line 321, in <module>
    main(args)
  File "online_evaluation_calvin/evaluate_policy.py", line 293, in main
    evaluate_policy(model, env,
  File "online_evaluation_calvin/evaluate_policy.py", line 120, in evaluate_policy
    result, videos = evaluate_sequence(
  File "online_evaluation_calvin/evaluate_policy.py", line 184, in evaluate_sequence
    success, video = rollout(env, model, task_checker,
  File "online_evaluation_calvin/evaluate_policy.py", line 226, in rollout
    trajectory = model.step(obs, lang_embeddings)
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/online_evaluation_calvin/evaluate_model.py", line 172, in step
    trajectory = self.policy(
  File "/home/labrynth/anaconda3/envs/3d_diffuser_actor/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/trajectory_optimization/diffuser_actor.py", line 332, in forward
    return self.compute_trajectory(
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/trajectory_optimization/diffuser_actor.py", line 205, in compute_trajectory
    fixed_inputs = self.encode_inputs(
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/trajectory_optimization/diffuser_actor.py", line 103, in encode_inputs
    adaln_gripper_feats, _ = self.encoder.encode_curr_gripper(
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/utils/encoder.py", line 105, in encode_curr_gripper
    return self._encode_gripper(curr_gripper, self.curr_gripper_embed,
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/utils/encoder.py", line 153, in _encode_gripper
    gripper_feats = self.gripper_context_head(
  File "/home/labrynth/anaconda3/envs/3d_diffuser_actor/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/utils/layers.py", line 402, in forward
    query = self.attn_layers[i](
  File "/home/labrynth/anaconda3/envs/3d_diffuser_actor/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/utils/layers.py", line 341, in forward
    attn_output, _ = self.multihead_attn(
  File "/home/labrynth/anaconda3/envs/3d_diffuser_actor/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/utils/multihead_custom_attention.py", line 148, in forward
    return multi_head_attention_forward(
  File "/home/labrynth/yg/github_space/3d_diffuser_actor_ce/diffuser_actor/utils/multihead_custom_attention.py", line 359, in multi_head_attention_forward
    q = q.contiguous().view(tgt_len, bsz * num_heads, head_dim).transpose(0, 1)

RuntimeError: shape '[1, 8, 24]' is invalid for input of size 576
```
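
A note on reading this error: a `view` succeeds only when the product of the requested dimensions equals the number of elements in the tensor. The sketch below does that arithmetic in plain Python (no PyTorch required) for the numbers in the traceback. The helper `view_is_valid` is hypothetical. The requested shape holds 192 values, while the tensor carries 576, exactly three times as many. One plausible reading, not confirmed in the thread, is that the query contained three tokens where the code expected one.

```python
from math import prod


def view_is_valid(num_elements: int, shape) -> bool:
    """A view keeps every element, so the total sizes must match exactly."""
    return prod(shape) == num_elements


# (tgt_len, bsz * num_heads, head_dim), as named in the failing line.
target_shape = (1, 8, 24)
# Input size reported in the RuntimeError.
num_elements = 576

capacity = prod(target_shape)
ratio = num_elements / capacity
print(view_is_valid(num_elements, target_shape), capacity, ratio)  # False 192 3.0
```
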
twke18 commented 1 month ago

Hi,

Could you verify that num_history is set to 1?

Also, have you pulled the latest commit?

CeHao-NUS commented 1 month ago

Yes. In the script, num_history=1 https://github.com/nickgkan/3d_diffuser_actor/blob/f9a719e88f25542655c3db101f35e0a67713745e/scripts/train_trajectory_calvin_nohistory.sh#L10

twke18 commented 1 month ago

Sorry, I messed up the git commit. Can you pull the repo again and check whether you still get the error message?

CeHao-NUS commented 1 month ago

Wow, the new checkpoint is so good. 1/5: 94.8% | 2/5: 81.2% | 3/5: 65.0% | 4/5: 51.9% | 5/5: 40.2%

In this case, I believe you have almost solved the CALVIN challenge.
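
For comparison with the Avg. Len column reported earlier in the thread, here is a minimal sketch of how that figure can be derived from the five chained success rates. It assumes Avg. Len is the expected number of tasks completed in a row, which equals the sum of the five rates divided by 100. That assumption reproduces the seed0 row (3.123) and the paper row (3.348); the helper name `average_length` is hypothetical.

```python
def average_length(success_rates_percent):
    """Expected number of consecutive tasks completed, from rates in percent."""
    return sum(success_rates_percent) / 100.0


seed0 = [89.8, 75.2, 60.7, 48.6, 38.0]           # earlier reproduction
paper = [93.8, 80.3, 66.2, 53.3, 41.2]           # paper-reported rates
new_checkpoint = [94.8, 81.2, 65.0, 51.9, 40.2]  # rates from the comment above

for name, rates in [("seed0", seed0), ("paper", paper), ("new", new_checkpoint)]:
    print(name, round(average_length(rates), 3))
```

Under that assumption the new checkpoint comes out at roughly 3.331, close to the 3.348 reported in the paper.
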