yzqin / s4g-release

S4G: Amodal Single-view Single-Shot SE(3) Grasp Detection in Cluttered Scene

The transform of the rotation matrices #9

Closed candace3698 closed 1 month ago

candace3698 commented 1 year ago

Hi @yzqin, I am interested in your project. However, I ran into several questions while studying the paper and implementing the code.

How should I understand the rotation matrices I get from the output of the algorithm? I know that they are the poses predicted by the network, but from which frame to the grasp pose is the transform defined?

Following the question above, if I want to execute a grasp, how many transforms do I need: base to ee, ee to gripper, camera to base? Does that mean I should transform the grasp pose I get from the network (in camera coordinates?) into the robot coordinate frame?

Thank you

yzqin commented 10 months ago

Hi @candace3698,

It is the transform from the world frame to the grasp frame. The xyz axis definition follows the same convention as previous work such as GPD and PointNetGPD.
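
For concreteness, here is a minimal sketch of how one predicted rotation and grasp point could be packed into a single homogeneous transform. The variable names are hypothetical placeholders, not identifiers from the s4g-release code:

```python
import numpy as np

# Hypothetical placeholders for one network prediction: a 3x3 rotation
# matrix and a 3D grasp point, both expressed in the input cloud's frame.
rotation = np.eye(3)                     # predicted grasp orientation
grasp_point = np.array([0.4, 0.0, 0.1])  # predicted grasp position

# 4x4 homogeneous transform T_world_grasp: maps points expressed in the
# grasp frame into the world frame (the frame the point cloud lives in).
T_world_grasp = np.eye(4)
T_world_grasp[:3, :3] = rotation
T_world_grasp[:3, 3] = grasp_point
```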

You will need the transforms you listed: base to ee, ee to gripper, and camera to base. The world frame is only used to pre-process the data, i.e., to remove points outside the table space so that the network will not generate grasp poses far away from the workspace.
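
To make the chain explicit, here is a minimal sketch assuming the predicted pose is expressed in the camera frame of the input cloud, and that `T_base_camera` and `T_ee_gripper` come from your own hand-eye calibration and gripper mounting geometry (all names here are hypothetical, not part of this repo):

```python
import numpy as np

# Hypothetical fixed transforms from your own setup:
#   T_base_camera : hand-eye calibration, camera frame -> robot base frame
#   T_ee_gripper  : mounting offset, gripper frame -> end-effector frame
T_base_camera = np.eye(4)
T_ee_gripper = np.eye(4)

def grasp_to_ee_target(T_camera_grasp, T_base_camera, T_ee_gripper):
    """Convert a grasp pose in the camera frame into an end-effector
    target pose in the robot base frame."""
    # Express the grasp pose in the robot base frame.
    T_base_grasp = T_base_camera @ T_camera_grasp
    # The gripper frame must coincide with the grasp frame, so the ee
    # target is the grasp pose composed with the inverse mounting offset.
    T_base_ee = T_base_grasp @ np.linalg.inv(T_ee_gripper)
    return T_base_ee
```

The resulting `T_base_ee` would then be the pose handed to your IK solver or Cartesian controller.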