nickgkan / 3d_diffuser_actor

Code for the paper "3D Diffuser Actor: Policy Diffusion with 3D Scene Representations"
https://3d-diffuser-actor.github.io/
MIT License

Quaternion convention #3

rakhimovv closed this issue 3 months ago

rakhimovv commented 4 months ago

Hi, thank you for your excellent work!

I have a question regarding the quaternion convention. It appears that throughout the codebase, quaternions are expected to be in the xyzw format. For example, trajectories in packaged episodes, such as in RLBench, are in xyzw format.

However, I noticed that within the diffuser actor, when rotation_parametrization is set to 6D, the convert_rot function expects, and the unconvert_rot function returns, quaternions in the wxyz (rijk) format.

Could you please clarify whether this is intended behavior or a discrepancy? Thank you!
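To make the concern concrete, here is a minimal sketch (not code from the repository) of what happens when the same four numbers are read under the two layouts. The helper names `quat_wxyz_to_matrix` and `xyzw_to_wxyz` are hypothetical and stand in for whatever conversion the code base uses; only the standard unit-quaternion-to-matrix formula is assumed.

```python
import math

def quat_wxyz_to_matrix(q):
    """Rotation matrix from a unit quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def xyzw_to_wxyz(q):
    """Reorder (x, y, z, w) into (w, x, y, z). Hypothetical helper."""
    x, y, z, w = q
    return (w, x, y, z)

# A 90 degree rotation about the z axis, stored in xyzw order.
half_angle = math.pi / 4
q_xyzw = (0.0, 0.0, math.sin(half_angle), math.cos(half_angle))

# Reordered before conversion: the intended rotation about z.
correct = quat_wxyz_to_matrix(xyzw_to_wxyz(q_xyzw))

# Passed straight to a wxyz-expecting function: still a valid rotation
# matrix, but it describes a different rotation.
misread = quat_wxyz_to_matrix(q_xyzw)
```

In this example `correct` maps the x axis onto the y axis, while `misread` flips the x axis and swaps y and z, so the two layouts are not interchangeable without an explicit reordering step.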

rakhimovv commented 4 months ago

Also, RLBench, for example, expects the xyzw format, whereas Calvin seems to expect the wxyz format, judging by the convert_quaternion_to_euler function. I do not see the components being reordered anywhere in the code.

twke18 commented 4 months ago

Thanks for your interest and detailed investigation of our code base!

Yes, you are correct. We had not noticed that the quaternion formats of our code base and RLBench are inconsistent (wxyz vs. xyzw). It turns out that our released models on RLBench do not predict a geometrically correct rotation matrix. However, since we apply the convert_rot function during training and the unconvert_rot function during testing, our quaternion output is in the same format as the input.
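The round-trip argument can be checked with a small self-contained sketch. This is not the repository's implementation: `convert_6d` and `unconvert_6d` are hypothetical stand-ins for convert_rot and unconvert_rot, and the 6D representation is assumed to be the first two rows of the rotation matrix, recovered by Gram-Schmidt (one common convention).

```python
import math

def quat_wxyz_to_matrix(q):
    """Rotation matrix from a unit quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def matrix_to_quat_wxyz(m):
    """Unit quaternion (w, x, y, z) from a rotation matrix."""
    trace = m[0][0] + m[1][1] + m[2][2]
    if trace > 0:
        s = 2 * math.sqrt(trace + 1)
        return [0.25 * s, (m[2][1] - m[1][2]) / s,
                (m[0][2] - m[2][0]) / s, (m[1][0] - m[0][1]) / s]
    if m[0][0] > m[1][1] and m[0][0] > m[2][2]:
        s = 2 * math.sqrt(1 + m[0][0] - m[1][1] - m[2][2])
        return [(m[2][1] - m[1][2]) / s, 0.25 * s,
                (m[0][1] + m[1][0]) / s, (m[0][2] + m[2][0]) / s]
    if m[1][1] > m[2][2]:
        s = 2 * math.sqrt(1 + m[1][1] - m[0][0] - m[2][2])
        return [(m[0][2] - m[2][0]) / s, (m[0][1] + m[1][0]) / s,
                0.25 * s, (m[1][2] + m[2][1]) / s]
    s = 2 * math.sqrt(1 + m[2][2] - m[0][0] - m[1][1])
    return [(m[1][0] - m[0][1]) / s, (m[0][2] + m[2][0]) / s,
            (m[1][2] + m[2][1]) / s, 0.25 * s]

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def convert_6d(q):
    """Hypothetical stand-in for convert_rot: quaternion -> 6D (two rows)."""
    m = quat_wxyz_to_matrix(q)
    return list(m[0]) + list(m[1])

def unconvert_6d(d):
    """Hypothetical stand-in for unconvert_rot: 6D -> quaternion."""
    b1 = normalize(d[:3])
    dot = sum(a * b for a, b in zip(b1, d[3:]))
    b2 = normalize([a - dot * b for a, b in zip(d[3:], b1)])
    b3 = [b1[1] * b2[2] - b1[2] * b2[1],
          b1[2] * b2[0] - b1[0] * b2[2],
          b1[0] * b2[1] - b1[1] * b2[0]]
    return matrix_to_quat_wxyz([b1, b2, b3])

# An xyzw quaternion fed, unreordered, into functions that assume wxyz.
q_xyzw = normalize([0.1, -0.4, 0.2, 0.8])
q_out = unconvert_6d(convert_6d(q_xyzw))

# q and -q encode the same rotation, so compare up to a global sign.
sign = 1.0 if sum(a * b for a, b in zip(q_xyzw, q_out)) >= 0 else -1.0
max_diff = max(abs(a - sign * b) for a, b in zip(q_xyzw, q_out))
```

Because any permutation of a unit quaternion's components is still a unit quaternion, the misread input yields a valid (though different) rotation matrix, and the inverse path returns the same four numbers in the same order. Under these assumptions `max_diff` is at floating-point noise level, which is consistent with the explanation above.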

We are investigating this issue and will share our updates.

twke18 commented 4 months ago

Hi,

We have investigated this issue and updated the code base accordingly.

For verification, we replayed scenes from the demonstrations and computed the L2 error, L1 error, and accuracy of the predicted positions, quaternions, and gripper openness, respectively, against the values from the demonstrations. As shown in the following figure, the black curves denote the old model and the yellow curves denote the new model. Both models achieve similar performance.

[image]

[image]
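For readers who want to reproduce a comparison of this kind, the three metrics could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' evaluation script: the function names are hypothetical, the quaternion error is taken as the smaller of the L1 distances to q and -q (since both encode the same rotation), and gripper openness is assumed to be a probability thresholded at 0.5.

```python
import math

def position_l2(pred, target):
    """Euclidean distance between predicted and demonstrated positions."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))

def quaternion_l1(pred, target):
    """L1 distance between quaternions, taking the better of q and -q.

    The sign handling is an assumption; both inputs must use the same
    component layout (both xyzw or both wxyz).
    """
    direct = sum(abs(p - t) for p, t in zip(pred, target))
    flipped = sum(abs(p + t) for p, t in zip(pred, target))
    return min(direct, flipped)

def gripper_accuracy(pred_probs, target_open, threshold=0.5):
    """Fraction of steps where the thresholded prediction matches the label."""
    hits = sum((p > threshold) == bool(t)
               for p, t in zip(pred_probs, target_open))
    return hits / len(target_open)

# Toy values for illustration only, not results from the thread.
pos_err = position_l2((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
quat_err = quaternion_l1((0.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, -1.0))
grip_acc = gripper_accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 0])
```

Note that an elementwise quaternion error like this compares components directly, so it is unaffected by whether the four numbers are interpreted as xyzw or wxyz, as long as prediction and target share one layout.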

We have also tested both models in RLBench simulation. Likewise, both models have similar performance. We acknowledge that, due to an incorrect assumption about the quaternion format, our released models do not predict a geometrically correct rotation matrix on RLBench. However, since quaternions are converted consistently during training and testing, our released models can still predict precise rotations on RLBench. We hope this addresses your concern.

rakhimovv commented 3 months ago

@twke18 Thank you for the quick answer and update! It looks very cool that it works in both cases.