tum-vision / lsd_slam

Calculating the absolute pose of the camera relative to the world frame #190

Closed: ank700 closed this issue 8 years ago

ank700 commented 8 years ago

Hello, In the lsd-slam paper, it is mentioned that - "The tracking component continuously tracks new camera images. That is, it estimates their rigid body pose ξ ∈ se(3) with respect to the current keyframe, using the pose of the previous frame as initialization."
Considering this, I save the pose values of two consecutive keyframes (in one run of the loop) and compose them using the tf library to get the absolute pose of the camera relative to the world frame. Then I compose that result with the pose of the next keyframe, and so on. But when the result is visualized in rviz, it does not show the correct motion as I move the camera (I only move it sideways or forward, to keep the test simple). A sketch of the composition I have in mind follows.
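
To make the composition explicit, here is a minimal sketch of what I mean, written with Sophus types (the library lsd_slam uses internally); the function and variable names are just illustrative, and the same composition works with tf::Transform:

```cpp
#include <sophus/se3.hpp>

// T_world_kf : pose of the current keyframe in the world frame
// T_kf_cam   : pose of the new frame, relative to that keyframe
// Composition order matters: world <- keyframe <- camera.
Sophus::SE3f absolutePose(const Sophus::SE3f& T_world_kf,
                          const Sophus::SE3f& T_kf_cam)
{
    return T_world_kf * T_kf_cam;
}
```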

Can anyone suggest a way to correctly obtain the absolute pose values? I also saw #131, but it does not answer my question.

Please ask me if my problem is unclear. Hoping to get a solution soon. Thanks

johnny871227 commented 8 years ago

Hi, may I know how you saved the pose? From camToWorld.data()?

ank700 commented 8 years ago

I am subscribing to the pose values (published by lsd_slam) in my own node. I also added a publisher for the keyframe id, so that my node knows which pose value is associated with which keyframe. A minimal version of the subscriber is sketched below.
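
For completeness, here is a stripped-down sketch of the subscriber; I am assuming the wrapper publishes geometry_msgs/PoseStamped on /lsd_slam/pose (as ROSOutput3DWrapper does in my copy), so adapt the topic and type if yours differ:

```cpp
#include <ros/ros.h>
#include <geometry_msgs/PoseStamped.h>

// Print each incoming pose: translation, then quaternion (x y z w).
void poseCallback(const geometry_msgs::PoseStamped::ConstPtr& msg)
{
    ROS_INFO("t=(%.3f, %.3f, %.3f) q=(%.3f, %.3f, %.3f, %.3f)",
             msg->pose.position.x, msg->pose.position.y, msg->pose.position.z,
             msg->pose.orientation.x, msg->pose.orientation.y,
             msg->pose.orientation.z, msg->pose.orientation.w);
}

int main(int argc, char** argv)
{
    ros::init(argc, argv, "pose_listener");
    ros::NodeHandle nh;
    ros::Subscriber sub = nh.subscribe("/lsd_slam/pose", 10, poseCallback);
    ros::spin();
    return 0;
}
```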

johnny871227 commented 8 years ago

So I suppose you saved the values from camToWorld.data() along with the corresponding keyframe? I did that too, but as I describe in my question #192, I found those poses can be really confusing. Anyway, here are my two guesses:

  1. What we saved (or at least what I saved) from camToWorld.data() should be the absolute pose, not the relative pose, because it's named camera-to-world.
  2. "rostopic echo -p /lsd_slam/pose > my_pose" gives me results closer to the groundtruth, while camToWorld.data() is way worse.
ank700 commented 8 years ago

Hi @johnny871227, I also logged the pose values. I observed that they are almost correct along the z-axis, but not so correct in x and y. I moved the camera first ~60 cm in y, then ~40 cm in z, and finally ~10 cm in x. The last few logged poses show that the movement in z is very close to the "groundtruth", x is OK, and y is just 6 cm.

I tried moving it along just the z-axis again, but it gave wrong values: I moved 1.2 metres in z, but the last pose logged is only 35 cm. Also, the orientation is never constant.

In the lsd_slam_viewer, however, it seems to reconstruct an almost exact trajectory. Is there more computation happening in the viewer code?

johnny871227 commented 8 years ago

Hi @ank700, one thing I noticed is that there is an optimize function at line 122 of KeyFrameGraph.h, documented as: "Optimizes the graph. Does not update the keyframe poses, only the vertex poses. You must call updateKeyFramePoses() afterwards." This optimization calls into the g2o library, so I guess the answer to your question is yes.

However, I can't find this updateKeyFramePoses() that the comment mentions. I'm also trying to log the poses AFTER optimization. Please let me know if you get any clues.

ank700 commented 8 years ago

Hi @johnny871227, optimize is called in SlamSystem.cpp at line 1608. Then, I think it updates the keyframe poses at line 1642.

One important thing: according to the report at http://wp.doc.ic.ac.uk/lnardi/wp-content/uploads/sites/68/2014/06/2015_06_16_AndrewJackFinalReport.pdf and the lsd_slam paper itself, tracking is performed at a local level, providing pose estimates between incoming frames and the current keyframe. Keyframes are discarded based on a certain distance criterion. Read the above report, especially the lsd_slam algorithm explanation starting at page 41.

This means the pose values are incremental, relative to the current keyframe. So when a new keyframe is selected, the pose values of subsequent frames will be relative to that keyframe. Let me know what you think after reading the paper.

ank700 commented 8 years ago

Hi @johnny871227, have you made any progress? Can you share it with me?

The way I am calculating the absolute pose using transforms gives somewhat good values, but they are very non-deterministic. Also, with my calculation, even if I move the camera backwards it adds up the pose values (giving something like the total distance travelled). I am very confused now.

johnny871227 commented 8 years ago

Hi @ank700, sorry for the late reply; I was on vacation. I don't have any successful progress yet. None of the poses I recorded were correct when I used them to project the point clouds into 3D space; the accumulated point cloud looks quite messy. However, when I run lsd-slam, the point cloud shown in the lsd_slam_viewer looks way better than my cloud, so the pose values there must be correct. I guess this is because of their optimization process. I'm looking into this part now and will let you know if I find anything useful. Please keep me updated as well. Actually, I have a deadline this week, so I'd better figure it out asap :(

ank700 commented 8 years ago

Hi @johnny871227, what I have observed is that the trajectory reconstruction and mapping in the lsd_slam_viewer is also not consistent from run to run on the same path. The author mentions this in the README. I did a lot of experiments with various paths, but every time I got very different results. I hope you reach a conclusion. You might be busy this week, but please update me whenever possible.

johnny871227 commented 8 years ago

Hi @ank700, from what I have tested so far, the keyframe poses I save from camToWorld should be correct. They are used for the 2D->3D projection as Sophus::Vector3f pt = camToWorld * (Sophus::Vector3f((x*fxi + cxi), (y*fyi + cyi), 1) * depth) around line 330 of KeyFrameDisplay.cpp. After saving the individual point cloud of each keyframe, they look consistent when I put them together. Those poses are different from what I saved using "rostopic echo -p /lsd_slam/pose > my_pose", which might be the reason my previous results were all wrong.

I'm not 100% sure yet, so maybe you can test on your data and let me know the result? I simply logged the pose values using for (int j=0; j<7; j++) { std::cout << camToWorld.data()[j] << " "; }. This gives me tx ty tz qw qx qy qz.
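
In case it helps, this is roughly how I read that projection when expanded into a full loop; it is my own paraphrase of KeyFrameDisplay.cpp (the signature is mine), where fxi = 1/fx, cxi = -cx/fx and so on are the inverse intrinsics:

```cpp
#include <sophus/sim3.hpp>
#include <vector>

// Back-project every pixel with a valid inverse depth into world space.
std::vector<Sophus::Vector3f> backproject(const Sophus::Sim3f& camToWorld,
                                          const float* idepth,  // width*height
                                          int width, int height,
                                          float fxi, float fyi,
                                          float cxi, float cyi)
{
    std::vector<Sophus::Vector3f> cloud;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
        {
            float id = idepth[x + y * width];
            if (id <= 0.0f) continue;           // no valid depth estimate
            float depth = 1.0f / id;
            // Pixel -> normalized ray, scaled by depth, moved to world frame.
            cloud.push_back(camToWorld *
                (Sophus::Vector3f(x * fxi + cxi, y * fyi + cyi, 1.0f) * depth));
        }
    return cloud;
}
```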

ank700 commented 8 years ago

@johnny871227, the pose values published on /lsd_slam/pose are not absolute poses; they are relative to the current keyframe. So when a new keyframe is selected, subsequent pose values are relative to that keyframe. Hence, a transform is required if we want to use these pose values as world poses.

Where exactly do you log the camToWorld values? I will do the same and compare with my results.

johnny871227 commented 8 years ago

I added for (int j=0; j<7; j++) { std::cout << camToWorld.data()[j] << " "; } around line 360 of KeyFrameDisplay.cpp, right before the loop for (int i=0; i<num; i++). When I run lsd-slam, I press "p" in the point cloud viewer window and the values are shown in the terminal. I believe those poses are global, unless there is some relative->global trick inside Sophus::Vector3f pt = camToWorld * (Sophus::Vector3f((x*fxi + cxi), (y*fyi + cyi), 1) * depth);

ank700 commented 8 years ago

I printed the camToWorld values in the void KeyFrameDisplay::setFrom(...) function in KeyFrameDisplay.cpp. This function is called continuously from the main loop in main_viewer.cpp. The values in camToWorld are exactly the same as those published on /lsd_slam/pose. I moved the camera along a measured path, and the values in camToWorld are not correct at all (considering I need the camera position relative to the world, not the current keyframe).

Also, the trajectory shown in the viewer itself does not match the exact path (or even come close to it).

But I am able to get almost correct values along the z-axis by composing the transforms.

johnny871227 commented 8 years ago

What do you mean by "same as published in /lsd_slam/pose"? May I know how you saved or viewed this /lsd_slam/pose? Btw, the quaternion here is in the order where qw comes first; you accounted for that, right?

ank700 commented 8 years ago

I am using lsd_slam with the ROS wrapper, which publishes the pose data associated with each frame (you can check the wrapper at https://github.com/icoderaven/lsd_slam). I printed the values in the terminal using: cout << camToWorld.translation()[0] << " " << camToWorld.translation()[1] << " " << camToWorld.translation()[2] << endl; (note the " " separators, otherwise the numbers run together). I wrote the above after the memcpy(...); in void KeyFrameDisplay::setFrom(...), then just compared the values printed in the terminal with the ones published on /lsd_slam/pose.

The order does not matter here, because camToWorld is an SE3 type with named accessors: camToWorld.translation()[0/1/2] and camToWorld.so3().unit_quaternion().x()/y()/z()/w(). You can check this in the ROSOutput3DWrapper.cpp file in the link I mentioned above.
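
In other words, printing through the named accessors avoids having to know the raw data() layout at all; a small snippet of my own (not from the repo):

```cpp
#include <sophus/se3.hpp>
#include <cstdio>

// Print a pose via named accessors, so the data() ordering is irrelevant.
void printPose(const Sophus::SE3f& camToWorld)
{
    const Eigen::Vector3f t = camToWorld.translation();
    const Eigen::Quaternionf q = camToWorld.so3().unit_quaternion();
    std::printf("t: %f %f %f  q(xyzw): %f %f %f %f\n",
                t[0], t[1], t[2], q.x(), q.y(), q.z(), q.w());
}
```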

johnny871227 commented 8 years ago

Hi, I don't know the details of your experiment, but for me at least the first 10 keyframe poses are correct in terms of point cloud fusion. My goal is to fuse the keyframe point clouds together just like the viewer does, but using my own code. Now I'm able to reproduce that point cloud fusion, so the poses are good enough for me.

I guess for your problem, if your groundtruth pose values are different from lsd-slam's, it's possible that the two trajectories still have the same shape but differ by some mapping between them (e.g. a similarity transform, since monocular SLAM only recovers pose up to scale).
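
If you want to test that hypothesis, one standard trick (not part of lsd_slam; this uses Eigen's umeyama(), the same alignment common SLAM evaluation scripts rely on) is to estimate the best similarity transform between the two trajectories and inspect the residual:

```cpp
#include <Eigen/Core>
#include <Eigen/Geometry>

// src, dst: 3xN matrices whose columns are corresponding camera positions.
// Returns the 4x4 similarity transform that best maps src onto dst
// in the least-squares sense.
Eigen::Matrix4f alignTrajectories(const Eigen::Matrix3Xf& src,
                                  const Eigen::Matrix3Xf& dst)
{
    // The third argument enables scale estimation, which matters for
    // monocular SLAM since its trajectory scale is arbitrary.
    return Eigen::umeyama(src, dst, true);
}
```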

ank700 commented 8 years ago

You are right about the pose values; they are relative to the world. When moving in the real world, though, there is a huge error between the values given by lsd_slam and the ground truth. I moved the camera forward 10 metres, and the result was 1.14 metres, which is terrible.

TristanWalsh commented 8 years ago

Hey guys, I am also trying to extract the pointclouds associated with each individual keyframe. Could either of you explain how you managed this?

I have extracted poses using rostopic echo, but it seems you found these to be inaccurate. I have also been unable to extract the pointcloud data from any topic.

From what I understand from this thread, if I do the following I will obtain pose data.

I added for (int j=0; j<7; j++) { std::cout << camToWorld.data()[j] << " "; } around line 360 of KeyFrameDisplay.cpp, right before the loop for (int i=0; i<num; i++). When I run lsd-slam, I press "p" in the point cloud viewer window and the values are shown in the terminal.

But how can I obtain the pointcloud/ depth map data?

ank700 commented 8 years ago

@TristanWalsh I did not use the pointclouds. Regarding the pose values, I was missing a very important concept about monocular VO algorithms: we cannot get pose values in metric scale unless we have some external reference, such as ground truth, a known object size, or a marker. So the 1.14 from my last comment could be correct (if not accurate) in the algorithm's internal scale. Isn't there some rostopic for the pointcloud data?

TristanWalsh commented 8 years ago

Yeah we are mapping a feature of known dimensions, so we wish to extract all the data from lsd-slam and scale it to the 'real world' values.

Within the keyframes topic there are field.camToWorld0 - field.camToWorld6 (the pose values), field.fx, fy, cx, cy (calibration values), field.height, width (dimensions of the frame), and field.pointcloudX (where X runs from 0 up to an unknown amount).

The CSV file I saved this data to is over 500 MB, however, and normal spreadsheet software cannot display all the columns. We assumed there would be one field.pointcloud value per pixel in the image, corresponding to the depth of that pixel, which would mean 432 x 688 = around 300,000 pointcloud values. However, using a Python script to extract as many columns of pointcloud values as possible yielded over 1 million, so that assumption was incorrect. I remember reading somewhere (but can't find it now) that the pointcloud data recorded in this topic is some kind of serialized byte stream, and thus useless as-is.
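
From a quick look at KeyFrameDisplay.h in lsd_slam_viewer, the pointcloud bytes seem to be reinterpreted as one small struct per pixel, which would explain seeing roughly three float-sized columns per pixel rather than one. A decoding sketch, assuming that layout holds (please verify the struct against your checkout):

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Per-pixel payload as declared in lsd_slam_viewer's KeyFrameDisplay.h:
// inverse depth, its variance, and a color value.
struct InputPointDense
{
    float idepth;
    float idepth_var;
    uint8_t color[4];
};

// Reinterpret the raw keyframe message bytes as one struct per pixel.
// 'raw' must hold width * height * sizeof(InputPointDense) bytes.
std::vector<InputPointDense> decodeCloud(const uint8_t* raw,
                                         int width, int height)
{
    std::vector<InputPointDense> pts(static_cast<size_t>(width) * height);
    std::memcpy(pts.data(), raw, pts.size() * sizeof(InputPointDense));
    return pts;
}
```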

@johnny871227 did mention he could fuse each keyframe's pointcloud together, so hopefully he figured out a method of extracting them.

hashim19 commented 8 years ago

@ank700 @johnny871227 Can I use lsd_slam for odometry purposes? Is there any way I can get the pose in metric scale?

ank700 commented 8 years ago

@hashim19, I did not test with metric values. You will need to measure the relation between the pose given by lsd_slam and the actual movement (in metres); with this information you can compute the scale. Generally, GPS, an IMU, or markers are used for this purpose. A toy version of this calculation is sketched below.
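
As a toy example (my own helper, nothing from lsd_slam), a single externally measured straight-line move gives you a first estimate; in practice you should average over many measurements, since monocular scale also drifts over time:

```cpp
#include <Eigen/Core>

// metricDistance: distance actually travelled, in metres (measured externally).
// slamStart, slamEnd: lsd_slam camera positions at the two endpoints.
// Multiply subsequent SLAM translations by the returned factor.
// Assumes a non-trivial movement (slamStart != slamEnd).
float estimateScale(float metricDistance,
                    const Eigen::Vector3f& slamStart,
                    const Eigen::Vector3f& slamEnd)
{
    const float slamDistance = (slamEnd - slamStart).norm();
    return metricDistance / slamDistance;
}
```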

hashim19 commented 8 years ago

@ank700 You mentioned above that you wanted to use lsd_slam for visual odometry. Did you end up using it?

ank700 commented 8 years ago

@hashim19 I did not use it because the memory consumption is very high and the pose results are not really good for my use.

Toumi0812 commented 7 years ago

Hi everyone, @johnny871227, how do you compare poses with ground truth? For example, where can I find the ground truth for the desk and machine sequences, and how do you compare them?

Thanks