shichaoy / cube_slam

CubeSLAM: Monocular 3D Object Detection and SLAM

Questions on 3d cuboid detection. #2

Open marrblue opened 5 years ago

marrblue commented 5 years ago

Hi shichao! Thanks for your impressive work; I learnt a lot from your paper. However, I have a problem when I try to run the detect_3d_cuboid module on the ICL_NUIM dataset, which your paper mentions.

I used the first image of "Living Room 'lr kt2'" to test the detect_3d_cuboid module. Here is my input:

I failed to detect any cuboids and got output like this:

Configuration fails at corner 2, outside segment
Configuration fails at corner 2, outside segment
... // many more identical messages

detail image: details

Here are my questions (they may seem stupid, but this is kind of hard for me as a beginner XD):

But I have no idea how to solve this problem in detail. For example, how do I determine the camera height with respect to the world ground coordinate frame? BTW, I also tested with the other ground truth pose format they provide, "R|t". It doesn't work either.

//Rotation Matrix| t
-0.999762 0.000000 -0.021799 0.790932
0.000000 1.000000 0.000000 1.300000
0.021799 0.000000 -0.999762 1.462270
//transform to TUM format
0.790932    1.3    1.46227          0    0.99994          0 -0.0109001

So did you have the same problems when you ran CubeSLAM with the ICL_NUIM dataset, and how did you make it work at that time?

Hoping for your reply. Thanks again :)
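For reference, the R|t to TUM conversion quoted above can be checked with a small standalone sketch. It assumes the TUM line stores the quaternion as qx qy qz qw after the translation, and the names `Mat3`, `Quat`, `rotToQuat` and `postedRotation` are made up for this illustration (they are not part of cube_slam). It only checks the arithmetic of the conversion; it does not address the camera axis convention of the dataset.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Quaternion in TUM order: qx, qy, qz, qw.
struct Quat { double x, y, z, w; };

// Convert a proper rotation matrix to a unit quaternion. The branch is chosen
// from the trace and the largest diagonal entry so the square root argument
// stays well away from zero.
Quat rotToQuat(const Mat3& R) {
    Quat q{};
    const double trace = R[0][0] + R[1][1] + R[2][2];
    if (trace > 0.0) {
        const double s = 2.0 * std::sqrt(trace + 1.0);  // s = 4 * qw
        q.w = 0.25 * s;
        q.x = (R[2][1] - R[1][2]) / s;
        q.y = (R[0][2] - R[2][0]) / s;
        q.z = (R[1][0] - R[0][1]) / s;
    } else if (R[0][0] > R[1][1] && R[0][0] > R[2][2]) {
        const double s = 2.0 * std::sqrt(1.0 + R[0][0] - R[1][1] - R[2][2]);  // s = 4 * qx
        q.w = (R[2][1] - R[1][2]) / s;
        q.x = 0.25 * s;
        q.y = (R[0][1] + R[1][0]) / s;
        q.z = (R[0][2] + R[2][0]) / s;
    } else if (R[1][1] > R[2][2]) {
        const double s = 2.0 * std::sqrt(1.0 + R[1][1] - R[0][0] - R[2][2]);  // s = 4 * qy
        q.w = (R[0][2] - R[2][0]) / s;
        q.x = (R[0][1] + R[1][0]) / s;
        q.y = 0.25 * s;
        q.z = (R[1][2] + R[2][1]) / s;
    } else {
        const double s = 2.0 * std::sqrt(1.0 + R[2][2] - R[0][0] - R[1][1]);  // s = 4 * qz
        q.w = (R[1][0] - R[0][1]) / s;
        q.x = (R[0][2] + R[2][0]) / s;
        q.y = (R[1][2] + R[2][1]) / s;
        q.z = 0.25 * s;
    }
    return q;
}

// Rotation part of the R|t pose quoted in the question.
Mat3 postedRotation() {
    return {{{-0.999762, 0.0, -0.021799},
             {0.0, 1.0, 0.0},
             {0.021799, 0.0, -0.999762}}};
}
```

Calling `rotToQuat(postedRotation())` should agree with the quaternion in the TUM line to the printed precision, which suggests the format conversion itself is not where the detection fails.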

shichaoy commented 5 years ago

Hi @UranusSong, thanks for the interest. The camera of the ICL data is quite different from the normal setup: their f_y in K is negative. Usually the camera axes are x right, y down, z forward, but theirs are x right, y up, z forward, a left-handed coordinate frame (see https://www.doc.ic.ac.uk/~ahanda/VaFRIC/codes.html). So I transform them to the normal camera setup by flipping the y axis, and then also change the ground truth poses accordingly. For the first-frame camera pose relative to the ground, I use the file at https://www.doc.ic.ac.uk/~ahanda/VaFRIC/livingRoom1.gt.freiburg (basically, the first frame is parallel to the ground, at a height of 2.25 m).

If you need, I can provide more help with transforming the camera axes.
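A minimal sketch of why flipping the y axis amounts to flipping the sign of f_y: a point seen in the raw y-up frame with negative f_y lands on the same pixel as the same point expressed in a y-down frame with positive f_y. The struct and function names here are hypothetical, and the intrinsics are the ICL values discussed in this thread.

```cpp
#include <cassert>
#include <cmath>

struct Intrinsics { double fx, fy, cx, cy; };
struct Pixel { double u, v; };

// Pinhole projection of a camera-frame point (X, Y, Z) with Z > 0.
Pixel project(const Intrinsics& K, double X, double Y, double Z) {
    assert(Z > 0.0);
    return {K.fx * X / Z + K.cx, K.fy * Y / Z + K.cy};
}

// Raw ICL calibration: the camera y axis points up, so f_y is negative.
Intrinsics rawIclIntrinsics() { return {481.20, -480.00, 319.50, 239.50}; }

// The same camera expressed in the usual y-down frame: only the sign of f_y changes.
Intrinsics flipYAxis(const Intrinsics& K) { return {K.fx, -K.fy, K.cx, K.cy}; }
```

For example, `project(rawIclIntrinsics(), X, Y, Z)` and `project(flipYAxis(rawIclIntrinsics()), X, -Y, Z)` give the same pixel, so the images stay untouched and only the calibration and the poses need to change.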

marrblue commented 5 years ago

Hi @shichaoy! Thanks for your reply, it helps me a lot! I think I get your point about the camera axis setup. I should transform it and try again. BTW, I ran the cuboid detection code on the KITTI odometry dataset successfully and it worked well. Good luck with your research~ Thanks!

iunknown10 commented 5 years ago

Hi UranusSong, shichaoy: how should the transToWolrd parameter be set for the KITTI odometry dataset? Many thanks!

shichaoy commented 5 years ago

@iunknown10 answered separately.

MinaHenein commented 4 years ago

@UranusSong I'm trying to run detect_cuboid on KITTI data and failed to detect any cuboids. I've seen that you managed to do this successfully; can you please share your configuration?

MinaHenein commented 4 years ago

@UranusSong I found my answer, I was using a wrong transform for the camera wrt the ground plane.

N-G17 commented 4 years ago

Hi @shichaoy. I am running detect_3d_cuboid on a SUN RGBD dataset image that is also shown in Figure 8 of your CubeSLAM paper (the middle image in the top row, where a 3D cuboid is detected for a cycle). Here are my inputs:

I am getting the following output, and no cuboid is detected: 0008_sample_cam_roll_pitch_no

However, when I use your hardcoded transToWolrd matrix in cube_slam/detect_3d_cuboid/src/main.cpp:

    transToWolrd << 1, 0.0011, 0.0004, 0,
                    0, -0.3376, 0.9413, 0,
                    0.0011, -0.9413, -0.3376, 1.35,
                    0, 0, 0, 1;

I am able to get the output: 0008_OriginalTranstoWorld

I am unable to figure out a reason as to why this is happening. I would be grateful if you could provide a brief explanation.

Hoping for your response. Thank you.

shichaoy commented 4 years ago

Hi @N-G17, I think your extrinsic is wrong. I checked my old data and found that the extrinsic for this image is: 0.9746 0.2221 0.0304 0.0304 -0.2650 0.9638 0.2221 -0.9383 -0.2650 1.47 0 0 0 1

Since most images are taken with the camera pointing forward/upward, all R_ground_camera should be close to [1 0 0; 0 0 1; 0 -1 0]. See the coordinate frame explanation at https://github.com/shichaoy/matlab_cuboid_detect/blob/master/illustrations.pdf

R_ground_camera is very important because it determines the vanishing points. I remember that the raw SUN RGBD extrinsic Rtilt is close to identity, so we need to swap some axes, something like this: Rtilt * [1 0 0; 0 0 1; 0 -1 0];
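A rough sketch of that axis swap, assuming Rtilt is exactly identity (the reply only says it is close to identity) and that the ground frame has z pointing up, as the height entry of transToWolrd suggests. The helper names are made up for illustration and are not cube_slam or SUN RGBD API.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;
using Vec3 = std::array<double, 3>;

// Plain 3x3 matrix product A * B.
Mat3 multiply(const Mat3& A, const Mat3& B) {
    Mat3 C{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// Matrix-vector product R * v.
Vec3 apply(const Mat3& R, const Vec3& v) {
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        for (int k = 0; k < 3; ++k)
            out[i] += R[i][k] * v[k];
    return out;
}

// Axis swap quoted in the reply: [1 0 0; 0 0 1; 0 -1 0].
Mat3 axisSwap() {
    return {{{1.0, 0.0, 0.0}, {0.0, 0.0, 1.0}, {0.0, -1.0, 0.0}}};
}

// R_ground_camera = Rtilt * axisSwap, following the expression in the reply.
Mat3 groundFromCamera(const Mat3& Rtilt) { return multiply(Rtilt, axisSwap()); }
```

With an identity Rtilt, the camera's forward axis (0, 0, 1) maps to the ground-frame direction (0, 1, 0) and the camera's down axis (0, 1, 0) maps to (0, 0, -1), i.e. a level camera looking along the ground with image-down pointing toward the floor.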

benchun123 commented 3 years ago

Hi @UranusSong @shichaoy, I also ran into this problem when I applied CubeSLAM to the ICL_NUIM dataset. Can you provide more help with transforming the camera axes, or with how to change the ground truth camera poses? Here is my experiment, which failed:

original

Kalib<< fx,  0,  cx,  
        0,  -fy,  cy,
        0,  0,   1;  // all elements > 0
    Eigen::MatrixXd cam_pose_Twc = truth_frame_poses.row(frame_index).tail<7>(); // xyz, q1234
    Matrix<double,4,4> transToWolrd;
    transToWolrd.setIdentity();
    transToWolrd.block(0,0,3,3) = Quaterniond(cam_pose_Twc(6),cam_pose_Twc(3),cam_pose_Twc(4),cam_pose_Twc(5)).toRotationMatrix();
    transToWolrd.col(3).head(3) = Eigen::Vector3d(cam_pose_Twc(0), cam_pose_Twc(1), -cam_pose_Twc(2));
    std::cout << "transToWolrd: \n" << transToWolrd << std::endl;
    Eigen::Vector3d orientation;
    rot_to_euler_zyx<double>(transToWolrd.block(0,0,3,3), orientation(0), orientation(1), orientation(2));
    std::cout << "camera orientation: " << orientation.transpose() << std::endl;

camera orientation: 0 -0 0
transToWolrd:
1 0 0 0
0 1 0 0
0 0 1 2.25
0 0 0 1

However, I can get a cuboid if I change the camera roll for the first frame as follows, but I do not know the correct way to change it.

    Eigen::Vector3d orientation;
    orientation << -1.5708, 0, 0;
    transToWolrd.block(0,0,3,3) = euler_zyx_to_rot(orientation(0), orientation(1), orientation(2));
    transToWolrd.col(3).head(3) = Eigen::Vector3d(0,-2.25,0);
    std::cout << "transToWolrd: \n" << transToWolrd << std::endl; 

where the output is:
camera orientation: -1.57079 0 0
transToWolrd:
1 -0 0 0
0 1.32679e-06 1 0
-0 -1 1.32679e-06 2.25
0 0 0 1
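To see where the second matrix comes from, here is a small standalone sketch of a ZYX Euler composition, R = Rz(yaw) * Ry(pitch) * Rx(roll). It assumes the argument order (roll, pitch, yaw) and may not match the repository's euler_zyx_to_rot in every detail; `eulerZyxToRot` is a name invented for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// ZYX Euler angles to rotation matrix: R = Rz(yaw) * Ry(pitch) * Rx(roll).
Mat3 eulerZyxToRot(double roll, double pitch, double yaw) {
    const double cr = std::cos(roll), sr = std::sin(roll);
    const double cp = std::cos(pitch), sp = std::sin(pitch);
    const double cy = std::cos(yaw), sy = std::sin(yaw);
    return {{{cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr},
             {sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr},
             {-sp, cp * sr, cp * cr}}};
}
```

Under this convention a roll of about -pi/2 with zero pitch and yaw gives approximately [1 0 0; 0 0 1; 0 -1 0], the same rotation block as in the output above. Zero angles give the identity, which would align the camera's forward z axis with the world's up axis.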

shichaoy commented 3 years ago

@benchun123 thanks for trying it out. Yeah, the ICL-NUIM dataset is special due to its different calibration. As mentioned on the ICL website (https://www.doc.ic.ac.uk/~ahanda/VaFRIC/codes.html), the camera frame is x right, y up, z forward. However, usually, and also in my system (https://github.com/shichaoy/matlab_cuboid_detect/blob/master/illustrations.pdf), the camera frame is x right, y down, z forward. My code has to use this assumption, because the camera frame affects the ground direction and vanishing points.

So to run CubeSLAM on ICL-NUIM data, you just need to feed in a corrected calibration:

    [ 481.20, 0, 319.50
      0, 480.00, 239.50    // second number becomes positive now!
      0, 0, 1 ]

Then provide the initial transform, horizontal to the ground with height ~2, similar to KITTI: T_world_cam= [1 0 0 0 0 0 1 0 0 -1 1 2.25 0 0 0 1 ]

Note that if we later want to compare with the provided ground truth poses, we need to do some extra work, because the ground truth poses are defined in the old camera frame. We just need to apply the following transform to the raw ground truth poses: T_rawCam_newCam = [1 0 0 0; 0 -1 0 0; 0 0 1 0; 0 0 0 1];

Let me know if it's not clear.

benchun123 commented 3 years ago

@shichaoy Thanks for your reply. The dataset part is clear, but I still have problems when trying it out. Firstly, the initial transform, horizontal to the ground with some height, is (-1.57079, 0, 0) in roll, pitch, and yaw. Why is it not (0, 0, 0)? Is it just a setting from KITTI? I can get the cuboid; I just don't understand why.

Secondly, when using offline mode in orb-cube-slam, I think that in every frame we just need to feed the initial transform to detect the cuboid, and then transform the detected cuboid into the current camera pose. In online mode, we use the SLAM pose to detect the cuboid. Is my understanding correct?

Thanks so much for your help.

bhargavi-git-hub commented 3 years ago

@shichaoy I have been trying to use the 3D cuboid detection on the KITTI dataset. With the available intrinsic and extrinsic matrices for the KITTI dataset, no cuboids were detected. Could you please give more insight into the rotation matrix used in the code? Is it the extrinsic rotation matrix or not? If not, how do we get those values?

Thanks!

bhargavi-git-hub commented 3 years ago

> @UranusSong I found my answer, I was using a wrong transform for the camera wrt the ground plane.

@MinaHenein Can you please tell me which parameters of the KITTI dataset you used as the rotation of the camera wrt the ground?

zhixun25 commented 1 year ago

> @UranusSong I found my answer, I was using a wrong transform for the camera wrt the ground plane.

@MinaHenein I'm also trying to run detect_cuboid.m on the KITTI data, but it fails to detect any cuboids. I see you used the wrong transform for the camera wrt the ground plane. Can you share your configuration? Thanks!

zhixun25 commented 1 year ago

> Hi UranusSong, shichaoy: how should the transToWolrd parameter be set for the KITTI odometry dataset? Many thanks!

Hello, have you figured out how to set the transToWolrd parameter for the KITTI odometry dataset? Many thanks!

ZhaoJunnNie commented 1 year ago

> Then providing the initial transform: horizontal to ground with height ~2, similar to KITTI: T_world_cam= [1 0 0 0 0 0 1 0 0 -1 1 2.25 0 0 0 1 ]

Hello, I think there is a typo in T_world_cam here. Should it be [1 0 0 0 0 0 1 0 0 -1 0 2.25 0 0 0 1]?
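One quick way to tell the two candidates apart is to test whether the upper-left 3x3 block is a proper rotation (orthonormal rows, determinant +1). The sketch below is only an illustration with invented helper names; the two matrices are the rotation blocks of the quoted T_world_cam and of the version suggested in this comment.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
double determinant(const Mat3& R) {
    return R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
         - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
         + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]);
}

// True if R * R^T is the identity and det(R) is +1, within tol.
bool isRotation(const Mat3& R, double tol = 1e-9) {
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            double dot = 0.0;
            for (int k = 0; k < 3; ++k) dot += R[i][k] * R[j][k];
            const double expected = (i == j) ? 1.0 : 0.0;
            if (std::fabs(dot - expected) > tol) return false;
        }
    }
    return std::fabs(determinant(R) - 1.0) <= tol;
}

// Rotation block of the T_world_cam as originally posted.
Mat3 postedRotation() {
    return {{{1.0, 0.0, 0.0}, {0.0, 0.0, 1.0}, {0.0, -1.0, 1.0}}};
}

// Rotation block of the version suggested in this comment.
Mat3 suggestedRotation() {
    return {{{1.0, 0.0, 0.0}, {0.0, 0.0, 1.0}, {0.0, -1.0, 0.0}}};
}
```

Here `isRotation(postedRotation())` is false because the third row is not unit length, while `isRotation(suggestedRotation())` is true, and the suggested block also matches the nominal R_ground_camera mentioned earlier in the thread.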