yuxng / YCB_Video_toolbox

Toolbox for the YCB-Video dataset
https://rse-lab.cs.washington.edu/projects/posecnn/
MIT License

How is the rotation for the object coordinate system O defined? #11

Closed huckl3b3rry87 closed 4 years ago

huckl3b3rry87 commented 4 years ago

I am using the YCB-Video dataset and trying to visualize the pose estimates from my algorithm, but I am not sure what pose the object starts from.

For instance, in the case of the mustard bottle, if it is at the origin, where is the ROS TF set? E.g.:

[image: tf]

According to https://arxiv.org/pdf/1711.00199.pdf, it seems that O is the centroid of the object, but what about the rotation? Is there a place where I can find this information?

Thank you!

huckl3b3rry87 commented 4 years ago

As per https://github.com/j96w/DenseFusion/issues/88, the canonical frame is the mesh frame.
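In other words, the annotated pose should map the mesh-frame coordinates of the model points directly into the camera frame. A minimal sketch of that, assuming the standard dataset layout with per-object models/<class_name>/points.xyz files and 3x4 object-to-camera poses stored in the *-meta.mat annotations (paths, field names, and index mapping below are assumptions, check them against your copy):

    import numpy as np
    import scipy.io as sio

    # Example frame and model paths; layout assumed from the YCB-Video release.
    meta = sio.loadmat('data/0000/000001-meta.mat')
    poses = meta['poses']                      # 3 x 4 x num_objects (assumed: object -> camera)
    cls_indexes = meta['cls_indexes'].flatten()

    # The mesh frame is the canonical object frame O: points.xyz is expressed in it.
    points = np.loadtxt('models/006_mustard_bottle/points.xyz')   # N x 3

    # Pick the entry for the mustard bottle by matching cls_indexes against classes.txt
    # (that mapping is assumed here), then map mesh-frame points into the camera frame.
    idx = 0                                    # e.g. the first annotated object in this frame
    RT = poses[:, :, idx]                      # 3 x 4, [R | t]
    points_cam = points @ RT[:, :3].T + RT[:, 3]

If the transformed points line up with the object in the RGB-D frame, the mesh frame is indeed the canonical frame being annotated.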

taeyeopl commented 4 years ago

I have a question regarding pose annotation.

As I understand it, the *-meta.mat files contain the camera pose (rotation_translation_matrix: the RT of the camera motion in 3D).
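For reference, a minimal way to inspect those fields with scipy (the field names are the ones I see in the annotations; the path is just an example):

    import scipy.io as sio

    meta = sio.loadmat('data/0000/000001-meta.mat')    # example path
    print(meta['rotation_translation_matrix'].shape)   # (3, 4) camera pose
    print(meta['poses'].shape)                         # (3, 4, num_objects) object poses
    print(meta['cls_indexes'].flatten())               # class id per annotated object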

Suppose I have two frames, t and t+1. In most scenes, I understand the objects are fixed and only the camera moves. Therefore, I expected the relative camera pose (t, t+1) to be the same as each object's relative pose (t, t+1). But it wasn't, and each object has a different relative object pose.

Q1. Can you advise which part I am missing?

This is my pseudo code. cam_relative is the relative camera pose and obj_relative is the relative object pose.

    import numpy as np

    # cam_pose,   cam_pose2  : 3x4 rotation_translation_matrix of frames t and t+1
    # pose_label, pose_label2: 3x4 pose of the same object in frames t and t+1
    tmp = np.zeros((1, 4))
    tmp[..., -1] = 1  # homogeneous row [0, 0, 0, 1]

    # relative camera pose
    cam_pose_a = np.concatenate((cam_pose, tmp))
    cam_pose_b = np.concatenate((cam_pose2, tmp))
    inv_cam_pose = np.linalg.inv(cam_pose_a)
    cam_relative = np.matmul(inv_cam_pose, cam_pose_b)

    np.set_printoptions(suppress=True)

    # relative object pose
    obj_pose_a = np.concatenate((pose_label, tmp))
    obj_pose_b = np.concatenate((pose_label2, tmp))
    inv_obj_pose_a = np.linalg.inv(obj_pose_a)
    obj_relative = np.matmul(inv_obj_pose_a, obj_pose_b)

    # expected: pose_new_from_cam == obj_pose_b
    pose_new_from_cam = np.concatenate((pose_label, tmp))
    pose_new_from_cam = np.matmul(pose_new_from_cam, cam_relative)[:3, :]
    # expected: pose_new_from_obj == obj_pose_b
    pose_new_from_obj = np.concatenate((pose_label, tmp))
    pose_new_from_obj = np.matmul(pose_new_from_obj, obj_relative)[:3, :]
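
One possible reading, not confirmed anywhere in this thread: if rotation_translation_matrix maps world coordinates into the camera frame and each pose maps object (mesh) coordinates into the camera frame, then for a static object the relative camera motion relates the two object poses by multiplication on the left (camera side), while inv(obj_pose_a) @ obj_pose_b is that same motion conjugated into the object's own frame, so it would legitimately differ per object. A sketch of that check, reusing the same variables as above, with the conventions explicitly assumed:

    import numpy as np

    def to_4x4(p):
        """Pad a 3x4 [R | t] matrix to a homogeneous 4x4 transform."""
        return np.vstack([p, [0.0, 0.0, 0.0, 1.0]])

    # Assumed conventions (not confirmed in this thread):
    #   cam_pose, cam_pose2    : world  -> camera_a / camera_b
    #   pose_label, pose_label2: object -> camera_a / camera_b
    C_a, C_b = to_4x4(cam_pose), to_4x4(cam_pose2)
    P_a, P_b = to_4x4(pose_label), to_4x4(pose_label2)

    # Relative camera motion: camera_a -> camera_b.
    cam_relative = C_b @ np.linalg.inv(C_a)

    # For a static object, applying that motion on the left should recover P_b.
    P_b_predicted = cam_relative @ P_a
    print(np.allclose(P_b_predicted, P_b, atol=1e-3))

    # inv(P_a) @ P_b equals inv(P_a) @ cam_relative @ P_a, i.e. the camera motion
    # conjugated into this object's frame, which is different for every object.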