shichaoy / cube_slam

CubeSLAM: Monocular 3D Object Detection and SLAM

In the package object_slam, why does the 3D cuboid detection use a constant transToWolrd? #14

Closed drscopus closed 5 years ago

drscopus commented 5 years ago

I am learning the object_slam package by reading the source code. I found that cuboid detection is called with a constant variable "transToWolrd", as this line shows: https://github.com/shichaoy/cube_slam/blob/23304a9c25dc33d9ec16cdfbb2494800ec7dcda0/object_slam/src/main_obj.cpp#L449. The call "detect_cuboid_obj.detect_cuboid(raw_rgb_img,transToWolrd,raw_2d_objs,all_lines_raw, frames_cuboids);" detects the cuboid with a constant "transToWolrd", but since the camera is moving around the target cuboid, I would expect transToWolrd to change. So how does "detect_3d_cuboid::detect_cuboid" work with a constant "transToWolrd" input? Thanks!

shichaoy commented 5 years ago

Hi, transToWolrd is only used as an initialization for detect_cuboid(). Inside the detection function, the camera roll/pitch is sampled (whether_sample_cam_roll_pitch is set to true). Note that only the camera roll/pitch/height affects the detection result. As stated at line 457, the detection output is a cuboid pose in the local ground frame, which is the frame obtained by projecting the camera pose onto the ground. Therefore, the absolute x/y/yaw doesn't matter, as long as the fixed transToWolrd's roll/pitch/height provides a good initialization.
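To make the "local ground frame" idea concrete, here is a minimal sketch of what projecting a camera pose onto the ground amounts to. It is not code from cube_slam: the CameraPose struct and toLocalGroundFrame function are hypothetical names, and the sketch assumes a Z-Y-X (yaw, pitch, roll) Euler convention with world z pointing up and the ground plane at z = 0.

```cpp
// Hypothetical 6-DoF pose container (not a cube_slam type).
// Assumptions: angles in radians, Z-Y-X (yaw, pitch, roll) Euler convention,
// world z axis pointing up, ground plane at z = 0.
struct CameraPose {
    double x, y, z;           // position in the world frame; z is the camera height
    double roll, pitch, yaw;  // orientation
};

// Express the camera in its own local ground frame: the frame located
// directly below the camera on the ground plane and rotated about the
// vertical axis so that the camera has zero yaw.
// Only roll, pitch and height are kept; x, y and yaw become zero.
CameraPose toLocalGroundFrame(const CameraPose& cam) {
    return CameraPose{0.0, 0.0, cam.z, cam.roll, cam.pitch, 0.0};
}
```

Under these assumptions, two cameras that differ only in x, y and yaw map to the same local-ground pose, which is one way to read the statement that only roll/pitch/height affect the detection.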

drscopus commented 5 years ago

Thank you very much for your reply! Is the "local ground frame" the current camera pose in the "world" frame (world coordinate system)? Is the origin of the local ground frame at the camera? Are the roll/pitch/height of the camera poses unchanged in this cuboid data? If so, you can use a constant roll/pitch/height input (i.e. the first truth camera pose) in the detect_cuboid function for all input images, right? Is the output of cuboid detection (https://github.com/shichaoy/cube_slam/blob/23304a9c25dc33d9ec16cdfbb2494800ec7dcda0/object_slam/src/main_obj.cpp#L449) the 3D pose of the cuboid in the coordinate system of the current camera pose? Thanks!

drscopus commented 5 years ago

Hi, I think the roll/pitch/height are almost the same, so I tested the cuboid detection output with different constant "transToWolrd" values, by changing the row index in this code: g2o::SE3Quat fixed_init_cam_pose_Twc(truth_frame_poses.row(0).tail<7>());. I changed "0" to "10" and "20". The results are shown below.

When the value is set to 10: [image: rviz_screenshot_2019_11_13-11_50_09_row10]

When the value is set to 20: [image: rviz_screenshot_2019_11_13-11_53_30_row20]

Both are quite different from the result with the value 0, mainly in the orientation of the detected cuboid: [image: rviz_screenshot_2019_11_13-11_51_36_row0]

shichaoy commented 5 years ago

@drscopus I think I didn't make it clear. The output of detect_cuboid_obj.detect_cuboid is the cuboid pose in the ground world frame, the same frame as the input transToWolrd. Namely, if the camera position is far away from the cuboid, then the cuboid position output by detect_cuboid() is also far away. Then we transform the cube pose to the camera frame to get the cuboid-camera measurement.

Therefore, the xy position of the ground world frame doesn't matter, because we will finally convert to the local camera frame anyway. We can even use the local ground frame as the world frame at every detection call. The local ground frame is the top-down projection of the camera onto the ground plane, so transToWolrd has x=y=yaw=0.
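The argument above can be checked numerically with a small sketch. It does not run the detector or use any cube_slam code: it only shows that when the camera pose and the cuboid pose are expressed in the same ground frame, moving that frame by an arbitrary x/y/yaw leaves the cuboid-in-camera transform unchanged. All function names and pose values here are hypothetical, and the Z-Y-X Euler convention is an assumption made for illustration.

```cpp
#include <array>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;

// Homogeneous rigid transform from a translation and Z-Y-X Euler angles,
// i.e. R = Rz(yaw) * Ry(pitch) * Rx(roll). The convention is an assumption.
Mat4 makePose(double x, double y, double z, double roll, double pitch, double yaw) {
    const double cr = std::cos(roll),  sr = std::sin(roll);
    const double cp = std::cos(pitch), sp = std::sin(pitch);
    const double cy = std::cos(yaw),   sy = std::sin(yaw);
    Mat4 T{};
    T[0] = {cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, x};
    T[1] = {sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, y};
    T[2] = {-sp,     cp * sr,                cp * cr,                z};
    T[3] = {0.0, 0.0, 0.0, 1.0};
    return T;
}

Mat4 multiply(const Mat4& A, const Mat4& B) {
    Mat4 C{};  // value-initialized to zeros
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// Inverse of a rigid transform: [R t]^-1 = [R^T  -R^T t].
Mat4 inverseRigid(const Mat4& T) {
    Mat4 inv{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            inv[i][j] = T[j][i];
    for (int i = 0; i < 3; ++i)
        inv[i][3] = -(inv[i][0] * T[0][3] + inv[i][1] * T[1][3] + inv[i][2] * T[2][3]);
    inv[3][3] = 1.0;
    return inv;
}

// Cuboid pose seen from the camera: T_cam_cube = T_world_cam^-1 * T_world_cube.
Mat4 cuboidInCamera(const Mat4& T_world_cam, const Mat4& T_world_cube) {
    return multiply(inverseRigid(T_world_cam), T_world_cube);
}

double maxAbsDiff(const Mat4& A, const Mat4& B) {
    double m = 0.0;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            m = std::fmax(m, std::fabs(A[i][j] - B[i][j]));
    return m;
}

// Moves the ground frame by (gx, gy, gyaw) and returns how much the
// cuboid-in-camera transform changes. Pose values are made up.
double measurementChangeUnderGroundShift(double gx, double gy, double gyaw) {
    // Camera in its local ground frame: x = y = yaw = 0, height 1.5, some roll/pitch.
    const Mat4 T_ground_cam  = makePose(0.0, 0.0, 1.5, 0.05, -0.3, 0.0);
    // A cuboid resting on the ground, expressed in the same frame.
    const Mat4 T_ground_cube = makePose(3.0, 0.5, 0.25, 0.0, 0.0, 0.4);
    // Pose of the local ground frame inside some other world frame.
    const Mat4 T_world_ground = makePose(gx, gy, 0.0, 0.0, 0.0, gyaw);

    const Mat4 before = cuboidInCamera(T_ground_cam, T_ground_cube);
    const Mat4 after  = cuboidInCamera(multiply(T_world_ground, T_ground_cam),
                                       multiply(T_world_ground, T_ground_cube));
    return maxAbsDiff(before, after);
}
```

Calling measurementChangeUnderGroundShift(10.0, -4.0, 1.2) should return a value at floating-point round-off level, because the ground-frame shift cancels in T_world_cam^-1 * T_world_cube. This only covers the final frame conversion; how the detector itself responds to a different roll/pitch/height initialization is a separate question that the sketch does not model.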