According to the paper, the depth video is used to obtain ground truth, avoiding manual labeling. My guess is that this involves scan matching plus global optimization.
I tried to reproduce the scenario in a 3D visualization tool (Rviz2) using ground truth data.
1. Transform the camera frame using the rotation_translation_matrix.
2. Re-project the 3D point cloud into the camera frame.
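A minimal NumPy sketch of the two steps above, assuming `rotation_translation_matrix` is a 4x4 homogeneous camera-to-world pose (the function name and the pose convention are my assumptions, not from the paper):

```python
import numpy as np

def world_to_camera(points_world, T_wc):
    """Re-project Nx3 world points into the camera frame.

    T_wc: 4x4 rotation_translation_matrix giving the camera pose
    in world coordinates; its inverse maps world -> camera.
    """
    T_cw = np.linalg.inv(T_wc)
    # Homogeneous coordinates: append a column of ones.
    ones = np.ones((points_world.shape[0], 1))
    pts_h = np.hstack([points_world, ones])   # N x 4
    pts_cam = (T_cw @ pts_h.T).T[:, :3]       # back to N x 3
    return pts_cam

# Example: camera translated 1 m along world X, no rotation;
# a point at the camera origin should map to (0, 0, 0).
T = np.eye(4)
T[0, 3] = 1.0
p = np.array([[1.0, 0.0, 0.0]])
print(world_to_camera(p, T))
```

If the matrix is instead already world-to-camera, the `np.linalg.inv` call must be dropped; getting this inverse convention wrong is a common source of exactly the kind of jitter described below.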
The objects in the 3D scene jump around at the centimeter level. The axis length in the visualization is 1 meter; blue is the Z axis.
In theory, the ground truth itself may contain noise. Is this jumping caused by noise in the ground truth, or by my re-projection method?