how to calculate actual distance of each point from depth map

Santa-Maria-Shithil commented 4 years ago

Hi! I am working on a robotics project where I need to reconstruct a 3D image from monocular vision. I plan to reconstruct the 3D image from a depth map. For this purpose, I installed lsd-slam and ran the examples successfully. Now my question is: is it possible to calculate the actual distance of each point of an image from its depth map as generated by lsd-slam?

bespoke-code commented 4 years ago

Hello @Santa-Maria-Shithil ,

if by "to calculate actual distance of each point of an image" you mean to calculate the real-world, metric distance of a point to a specific camera pose in 3D, then I'm afraid the answer is no.

LSD-SLAM is a monocular SLAM system built on direct visual odometry (VO). One of its drawbacks is that it cannot recover the real-world scale of a mapped scene from the camera images alone, which are its only input. To get that scale, you would either have to specify a known length between two mapped points in the resulting pointcloud, or modify the algorithm to make use of IMU, GPS or other sensor data.

This scale ambiguity holds true for all purely monocular visual odometry systems.
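To make the "known length between two mapped points" idea concrete, here is a minimal sketch. It assumes you have already exported the pointcloud as an N x 3 NumPy array and picked the two reference points by hand. The function name `rescale_point_cloud`, the indices and the 0.10 m length are made up for illustration and are not part of LSD-SLAM.

```python
import numpy as np

def rescale_point_cloud(points, idx_a, idx_b, known_length_m):
    """Scale an up-to-scale point cloud so that the distance between
    points[idx_a] and points[idx_b] equals known_length_m (in metres)."""
    points = np.asarray(points, dtype=float)
    measured = np.linalg.norm(points[idx_a] - points[idx_b])
    if measured == 0.0:
        raise ValueError("reference points coincide; scale is undefined")
    scale = known_length_m / measured
    # Scaling is done about the origin of the map frame.
    return points * scale, scale

# Hypothetical cloud in arbitrary SLAM units.
cloud = np.array([[0.0, 0.0, 1.0],
                  [0.5, 0.0, 1.0],
                  [0.0, 0.2, 2.0]])

# Assume points 0 and 1 are the ends of an object known to be 0.10 m long.
metric_cloud, scale = rescale_point_cloud(cloud, 0, 1, known_length_m=0.10)

# Metric distance of every point from the map origin.
distances_m = np.linalg.norm(metric_cloud, axis=1)
```

The same factor has to be applied to the camera translations if you want distances relative to a camera pose rather than the map origin.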

To achieve what you set out to do in your project, you could try to get a depth estimate from a deep-learning-based monocular depth estimation (MDE) system like fast-depth or DORN. Note that the outputs of all MDE systems are inherently biased by the training dataset used (distance ranges, scenes included, etc.). Some only predict the distance range in which a point lies relative to the camera, rather than an exact value. It's up to you to check and decide if such a system works for you.
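Once you do have a depth map in metric units, turning it into a per-pixel distance from the camera centre is straightforward. The sketch below assumes a pinhole camera without lens distortion and a depth map that stores z-depth along the optical axis. Check what your network actually outputs, since some predict inverse or relative depth. The intrinsics `fx`, `fy`, `cx`, `cy` are placeholder values.

```python
import numpy as np

def depth_to_distance(depth, fx, fy, cx, cy):
    """Convert a z-depth map (H x W) into Euclidean distance from the
    camera centre, assuming an undistorted pinhole camera model."""
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Normalised ray direction (x, y, 1) for every pixel.
    x = (u - cx) / fx
    y = (v - cy) / fy
    # The 3D point is depth * (x, y, 1), so its norm is depth * |(x, y, 1)|.
    return depth * np.sqrt(x**2 + y**2 + 1.0)

# Toy example: a flat wall 2 m in front of a 3 x 3 pixel camera.
depth = np.full((3, 3), 2.0)
distance = depth_to_distance(depth, fx=1.0, fy=1.0, cx=1.0, cy=1.0)
```

The centre pixel keeps its depth as its distance, while pixels away from the principal point are farther from the camera than their z-depth suggests.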

I hope this helps! Best of luck with your project!

Henke1983 commented 4 years ago

Would it be possible to put a reference object into the scene, for example a 10x10x10 mm cube? Could this be used as a reference to then measure distances?

Santa-Maria-Shithil commented 4 years ago

@bespoke-code Thanks a lot for such a helpful reply. I am planning to use a deep-learning-based MDE system for calculating depth. Using GPS sounds interesting, and if I get enough time I will try it.

bespoke-code commented 4 years ago

@Henke1983 any object with known dimensions which can be easily and reliably detected in the images and/or in the output pointcloud can act as a reference. Multiple known objects across a scene, with known distances between them, can help a lot as well. It's up to the engineer to implement the detection (e.g. corner detection), find the corresponding points in 3D, calculate the scale and rescale the pointcloud accordingly. Keep in mind that real-world maps may also be skewed due to errors in 3D reconstruction, perspective, lens distortion, shutter effects etc. Professional aerial photogrammetry engineers actually place markers on site (at multiple locations) to get better scale estimates during 3D reconstruction when producing orthophotos.
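With several reference lengths you can also get a feel for how consistent the map is. The following is only a sketch: it assumes the reference points have already been located in the pointcloud (the hard part, not shown here), and `estimate_scale` and the sample numbers are hypothetical. It takes the median of the per-reference scale factors and reports their spread. A large spread hints at a skewed reconstruction.

```python
import numpy as np

def estimate_scale(points, reference_pairs):
    """Estimate one global scale factor from several known lengths.

    reference_pairs: iterable of (idx_a, idx_b, known_length_m).
    Returns (median scale, max - min of the per-reference scales).
    """
    points = np.asarray(points, dtype=float)
    ratios = []
    for idx_a, idx_b, known_length_m in reference_pairs:
        measured = np.linalg.norm(points[idx_a] - points[idx_b])
        if measured > 0.0:  # skip degenerate references
            ratios.append(known_length_m / measured)
    if not ratios:
        raise ValueError("no usable reference lengths")
    ratios = np.array(ratios)
    return float(np.median(ratios)), float(ratios.max() - ratios.min())

# Hypothetical cloud in SLAM units, with three measured reference lengths.
cloud = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 4.0]])
references = [(0, 1, 0.05), (0, 2, 0.10), (0, 3, 0.22)]

scale, spread = estimate_scale(cloud, references)
metric_cloud = cloud * scale
```

The median is used here instead of the mean so that a single badly detected marker does not drag the estimate. That is a design choice, not the only option.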

@Santa-Maria-Shithil, go for it. If you have enough time, try modifying LSD-SLAM for use with stereo cameras. The known baseline between the cameras and the ability to get disparity between the image pairs should be sufficient to recover the true scale, although I can't comment on the accuracy of such a method. Check out this video on YouTube for more details :)
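For reference, the relation behind the stereo suggestion is Z = f * B / d: depth equals the focal length in pixels times the baseline in metres, divided by the disparity in pixels. The sketch below assumes a rectified stereo pair and a disparity map you computed elsewhere. It is not taken from LSD-SLAM, and the numbers are invented.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Metric depth from a rectified stereo disparity map: Z = f * B / d.
    Pixels with non-positive disparity have no valid match and become NaN."""
    disparity_px = np.asarray(disparity_px, dtype=float)
    depth = np.full(disparity_px.shape, np.nan)
    valid = disparity_px > 0
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth

# Hypothetical 2 x 2 disparity map, 500 px focal length, 10 cm baseline.
disparity = np.array([[50.0, 25.0],
                      [0.0, 10.0]])
depth_m = depth_from_disparity(disparity, focal_px=500.0, baseline_m=0.1)
```

Because depth is inversely proportional to disparity, a one-pixel disparity error costs much more accuracy for far points than for near ones, so a short baseline limits the useful range.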

tum-vision / lsd_slam

how to calculate actual distance of each point from depth map #337