shichaoy / cube_slam

CubeSLAM: Monocular 3D Object Detection and SLAM

About the detected pose of the 3D object #7

Closed wuxiaolang closed 5 years ago

wuxiaolang commented 5 years ago

Hi shichaoy, thanks for your code. After a period of exploration, I have some questions.

  1. I added your code to ORB-SLAM2. Drawing a cube (8 2D points) on the 2D image gives good results, but the calculated pose of the cube is wrong. As shown in the figure below, in the Pangolin drawing window, the larger red points are the object center positions (over all frames) output by your code. I think the camera pose I pass in should be correct. For the first frame I take transToWorld as transToWorld_initial = [1 0 0 0; 0 0 1 0; 0 -1 0 1.7; 0 0 0 1]; subsequent camera poses are taken as transToWorld = transToWorld_initial * (mCurrentFrame.mTcw)^(-1), where mCurrentFrame.mTcw comes from ORB-SLAM2. In addition, to draw the position of the object I use detected_cube->pos (and cube_ground_value.pose). Where could the problem be?

[Figure: 190731kitti3issue]

  2. Although the cubes drawn on the 2D image look reasonable, there is still a big gap between my results and what you showed in the video. Is there anything I need to optimize? Thank you again; I look forward to your answer.

shichaoy commented 5 years ago

@wuxiaolang I think the transformations are correct. The problem is due to inaccurate cuboid detection. As I mentioned in the paper, on KITTI our 3D detector is actually worse than deep networks, so a quick fix for you might be to use another off-the-shelf car detector as the 3D object measurement. For me, the C++ 3D cube detector is actually not feature-complete compared to the Matlab one (no config3), so I use Matlab to process the data offline, save the results as txt, and then read them into orb_slam as measurements. You also need to tune some parameters and apply a few small tricks.
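
To illustrate the "save as txt, then read in" step, here is a minimal sketch of loading per-frame cuboid measurements from a text stream. The file layout is not specified in this thread, so the column order (frame id, center x y z, yaw, length, width, height) and the names CuboidMeasurement and loadCuboids are purely hypothetical and would need to match whatever the offline detector actually writes.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record layout: one cuboid per line, whitespace separated.
// Assumed columns: frame_id x y z yaw length width height
struct CuboidMeasurement {
    int frameId;
    double x, y, z;                 // cuboid center
    double yaw;                     // rotation about the vertical axis
    double length, width, height;   // cuboid dimensions
};

// Read all well-formed lines from the stream; lines that do not parse
// (blank lines, comments, truncated rows) are skipped.
std::vector<CuboidMeasurement> loadCuboids(std::istream& in) {
    std::vector<CuboidMeasurement> cuboids;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        CuboidMeasurement m{};
        if (fields >> m.frameId >> m.x >> m.y >> m.z >> m.yaw
                   >> m.length >> m.width >> m.height) {
            cuboids.push_back(m);
        }
    }
    return cuboids;
}
```

Taking a std::istream rather than a file name means the same function works with a std::ifstream opened on the saved txt file or with an in-memory string for testing.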

One small note on the transformation: I modified the ORB-SLAM initialization function so that ORB-SLAM's world frame is the ground frame, not the first camera pose. It will make life easier.