AlejandroSilvestri / ORB_SLAM2-documented

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

keyFrameTrajectory and ground-truth data not overlapping #3

Open sazak1 opened 4 years ago

sazak1 commented 4 years ago

Dear @AlejandroSilvestri, first of all, thank you very much for the documentation of ORB-SLAM2. I have installed and built ORB-SLAM2 on my computer. I wanted to check the algorithm (monocular) with the EuRoC dataset, and I used the V1_02_medium sequence. I ran it with the command specified by the author. The algorithm runs, and I can see the point cloud and keyframe positions during the run. At the end of the run, ORB-SLAM2 creates a KeyFrameTrajectory.txt file that holds the position and orientation information.

I want to compare the ORB-SLAM2 output with the ground-truth data, but I couldn't draw overlapping trajectories. I tried it myself, and I also tried EVO; neither worked. By the way, the author of EVO already says the tool is not as good on the EuRoC dataset and works better on the TUM dataset. How can I see the overlapping trajectories?

Yes, a monocular camera doesn't have depth information. Before A. Davison we didn't see a monocular SLAM application, if I am not wrong. But the camera is moving, so it can use two sequential images for triangulation. If monocular SLAM cannot recover the exact position without manual scaling, or drifts after the trajectory is completed, then how can we talk about the success of monocular SLAM algorithms? Could you please clarify this for me?
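In case it helps to reproduce the comparison, this is a minimal sketch of how I load the two files (assuming KeyFrameTrajectory.txt is in the usual TUM format with timestamps in seconds, and the ground truth is EuRoC's state_groundtruth_estimate0/data.csv with timestamps in nanoseconds):

```python
import numpy as np

def load_keyframe_trajectory(path):
    # TUM format: timestamp [s] tx ty tz qx qy qz qw, space-separated
    data = np.loadtxt(path)
    return data[:, 0], data[:, 1:4]            # timestamps, positions

def load_euroc_ground_truth(path):
    # EuRoC CSV: timestamp [ns], px, py, pz, qw, qx, qy, qz, ...
    # The header line starts with '#' and is skipped by comments="#".
    data = np.loadtxt(path, delimiter=",", comments="#")
    return data[:, 0] * 1e-9, data[:, 1:4]     # nanoseconds -> seconds

est_t, est_xyz = load_keyframe_trajectory("KeyFrameTrajectory.txt")
gt_t, gt_xyz = load_euroc_ground_truth("data.csv")
```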

AlejandroSilvestri commented 4 years ago

@didnotwork

Monocular can't get scale. That's final. No one asks monocular SLAM to get scale, because it's impossible: the information is not there.

Monocular SLAM can get the structure up to scale. Usually you can get scale from an external source, like knowing the real distance between a pair of map points or such, as sketched below.
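For illustration only (these arrays and values are hypothetical, not ORB-SLAM2's API), fixing the scale from one known distance is a single ratio:

```python
import numpy as np

# Hypothetical example: p1, p2 are the estimated positions of two map points
# whose real-world separation known_distance_m was measured externally.
p1 = np.array([0.12, 0.03, 1.40])
p2 = np.array([0.47, 0.05, 1.38])
known_distance_m = 0.70

scale = known_distance_m / np.linalg.norm(p1 - p2)   # map units -> meters
# Multiplying every map point and camera translation by `scale`
# brings the whole map into metric units.
```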

That's what visual-inertial SLAM does: it combines IMU readings (accelerometers and gyroscopes) to grab the scale and fuses them with monocular SLAM. IMU readings bring scale to the map and avoid scale drift. The ORB-SLAM3 project adds this IMU capability.

To compare ORB-SLAM2's trajectory with the ground truth you must align and scale the trajectory.
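For illustration, a minimal sketch of that similarity alignment with the Umeyama method (assuming the two trajectories have already been associated by timestamp into equal-length N x 3 position arrays, like est_xyz and gt_xyz above):

```python
import numpy as np

def umeyama_alignment(est, gt):
    """Find s, R, t minimizing || gt - (s * R @ est + t) || (Umeyama, 1991)."""
    mu_est, mu_gt = est.mean(axis=0), gt.mean(axis=0)
    x, y = est - mu_est, gt - mu_gt
    cov = y.T @ x / len(est)                  # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                        # keep R a proper rotation
    R = U @ S @ Vt
    s = (D * np.diag(S)).sum() / ((x ** 2).sum() / len(est))
    t = mu_gt - s * R @ mu_est
    return s, R, t

# Apply the estimated similarity transform to the whole trajectory:
# s, R, t = umeyama_alignment(est_xyz, gt_xyz)
# est_aligned = s * est_xyz @ R.T + t
```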

sazak1 commented 4 years ago

Thank you very much for your reply. The monocular camera cannot achieve accurate SLAM without help from other sensors; that's OK. But what about the initialization method? What exactly is it used for? Is it used only to increase the accuracy of the "first localization"? For example, in MonoSLAM, starting from a target of known size helps with the scale issue. Is that enough? Is getting a reference only at the beginning enough? If yes, ORB-SLAM also has an initialization method (though a different one). What is the effect of ORB-SLAM's initialization method on scaling?

AlejandroSilvestri commented 4 years ago

@didnotwork

Scaling the initial map is not enough, because monocular SLAM suffers from scale drift. The system needs a scale hint at least periodically. IMU readings do this. Loop closure does too, sort of.

Some old monocular SLAM systems asked the user to initialize the map by moving the camera exactly 1 m. Because ORB-SLAM2's initialization is automatic, it is hard to control where it begins and ends, and on top of that a 1 m displacement performed by hand is never accurate. ORB-SLAM2 could have used the same initialization process, but that's not really relevant: that scaled initialization has been dropped in modern monocular SLAM.

"Metric SLAM" initialize on a known object, so it gets the world scale, but only the initial scale.

sazak1 commented 4 years ago

I ran the ORB-SLAM2 algorithm with a stereo sequence from the EuRoC dataset. The trajectories did not overlap either. I thought that it would produce an accurate trajectory with stereo vision. When I plotted the data, I saw that there are drift, scale differences, and also a direction problem (I think some columns need to be multiplied by -1). Below you can find the ground truth and camera trajectory files. Shouldn't I get an overlapping result when I work with stereo vision? CameraTrajectory.txt data.xlsx

AlejandroSilvestri commented 4 years ago

@didnotwork

Stereo gets the scale, but not the direction. Usually you must translate and rotate the trajectory to fit the ground truth.
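By the way, evo can do this alignment for you. Assuming a TUM-format estimate and the EuRoC ground-truth CSV, something like the following should work; -a aligns rotation and translation, and -s (needed only for monocular) also corrects the scale:

```
# stereo: rigid-body alignment (rotation + translation)
evo_ape euroc data.csv CameraTrajectory.txt -va --plot

# monocular: additionally correct the scale
evo_ape euroc data.csv KeyFrameTrajectory.txt -vas --plot
```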

sazak1 commented 4 years ago

Sorry, I misstated the direction issue. I mean, for example, the X-axis in the ground-truth file is similar to minus the Z-axis in the trajectory file.

  1. I changed the sign of some axes, set all the initial displacement values to zero, and shifted the data of some individual axes back and forth along the time axis to make them overlap. The result is not good; actually it is similar to the monocular camera result.
  2. There are 1600 rows in the camera trajectory file and 16000 rows in the ground-truth data, so the resolution is very low (see the association sketch below). I share the results as PDFs in the attachment. axes.pdf traject.pdf
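To cope with the different resolutions, here is a minimal sketch of how I match each trajectory pose to its nearest ground-truth row by timestamp (assuming est_t, gt_t, est_xyz, gt_xyz as loaded above, with all timestamps converted to seconds):

```python
import numpy as np

def associate(est_t, gt_t, max_dt=0.02):
    """For each estimated pose, find the nearest ground-truth row in time."""
    idx = np.searchsorted(gt_t, est_t)            # gt_t must be sorted
    idx = np.clip(idx, 1, len(gt_t) - 1)
    # pick the closer of the two neighboring ground-truth timestamps
    left_closer = np.abs(gt_t[idx - 1] - est_t) < np.abs(gt_t[idx] - est_t)
    idx = idx - left_closer.astype(int)
    valid = np.abs(gt_t[idx] - est_t) <= max_dt   # reject matches too far apart
    return np.nonzero(valid)[0], idx[valid]

est_ids, gt_ids = associate(est_t, gt_t)
est_m, gt_m = est_xyz[est_ids], gt_xyz[gt_ids]    # ready for the Umeyama alignment above
```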
AlejandroSilvestri commented 4 years ago

@didnotwork

I believe this can help: A Tutorial on Quantitative Trajectory Evaluation for Visual(-Inertial) Odometry

As you can see, the alignment problem is complex enough that a paper addressing it was published just two years ago. It addresses visual odometry, but it works for visual SLAM too.

sazak1 commented 4 years ago

Thank you very much for your suggestion! It seems it can answer many of the questions in my mind. Actually, I have all of Scaramuzza's papers, but I hadn't read this one yet, and I have just started reading it! Thanks again.