zgojcic / Rigid3DSceneFlow

[CVPR 2021, Oral] "Weakly Supervised Learning of Rigid 3D Scene Flow"
137 stars 18 forks source link

The coordinate system of "pose" in semanticKITTI npz file #7

Closed MaxChanger closed 3 years ago

MaxChanger commented 3 years ago

Hi, thanks for the nice work. I currently want to perform a test on semanticKITTI. How is the 'pose' in the npz file defined and obtained? For example, I downloaded the complete tar archive, and load the data in 11/000000_000001.npz, for the frame_000000, the data['pose_s'] is an identity matrix, corresponding to the global coordinate system, and data['pose_t'] is same as the line-1 of pose.txt provided by semantickitti. Is this pose in the Camera coordinate system or the LiDAR coordinate system? Because I saw in the supplemental material that the LIDAR data should be projected to the Camera coordinate system. Is the pose here also in the camera coordinate system?

I have tried several ways, but I can’t convert the two frames of point clouds to one coordinate system through data['pose_t'] (In the former example, data['pose_s'] is the identity matrix). Can you give me some hints?

zgojcic commented 3 years ago

Hi Max, sorry for the extremely late reply. Pose_s and pose_t bring the source and target point cloud to a common coordinate system. Note that pose_s is not an identity matrix, pose_s[0,0] and pose_s[1,1] are -1.

For example, pc_s can be transformed to a common reference frame as:

pc_s_t = (np.matmul(data['pose_s'][:3,:3], data['pc1'].transpose()) + data['pose_s'][:3,3:4]).transpose()

The relative transformation matrix that brings the target pc to the source pc's coordinate system can be computed as:

T_t_s = np.linalg.inv(data['pose_s']) @ data['pose_t']

Please reopen the issue if something else is unclear.