geturin / OAFD_Monocular

Master Thesis
MIT License
49 stars · 7 forks

MiDaS and ORB-SLAM calibration #16

Open yeop-giraffe opened 1 month ago

yeop-giraffe commented 1 month ago

Hi, thank you for sharing this great work! I am working on a similar project, so I have some questions regarding depth calibration and control:

  1. Since ORB-SLAM can recognize obstacles, what is the reason for additionally using monocular depth estimation? As far as I know, EGO-Planner needs a point cloud topic, which can be generated with VSLAM.

  2. MiDaS lacks scale, and monocular-camera-based VSLAM also lacks scale. Did you control the Tello drone using relative position or velocity control rather than global position control?

  3. I'm curious about how you calibrated MiDaS and VSLAM. If there is a paper about this project, I would appreciate it if you could share it.

Thank you for your time!

geturin commented 1 month ago

1. The point cloud obtained from ORB-SLAM is sparse. In a single frame, you can typically track around 800 feature points and convert them into a sparse point cloud. Using such a sparse point cloud as input to the ego-planner for obstacle avoidance is challenging and does not yield satisfactory results.

2. The entire system operates in the ORB-SLAM3 coordinate system. After computing the path, the drone's velocity is determined using PD control. We obtain altitude data once during takeoff and compare it with the altitude in ORB-SLAM3 coordinates to estimate an approximate scale, which is then incorporated into the PD control calculations. Since the scale between VSLAM and the real world cannot be obtained accurately, precise position control is not possible.
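A minimal sketch of the takeoff-based scale estimate and the PD velocity control described above. The function names, gains, and the single-axis altitude comparison are my assumptions for illustration, not the author's actual code:

```python
import numpy as np

def estimate_scale(takeoff_altitude_m, slam_altitude):
    """Approximate metric scale: real takeoff altitude (e.g. from the
    Tello's onboard height sensor) divided by the altitude reported in
    ORB-SLAM3's unscaled coordinate system."""
    return takeoff_altitude_m / slam_altitude

def pd_velocity(target, current, prev_error, dt, scale, kp=0.8, kd=0.2):
    """PD controller: position error in SLAM coordinates -> approximate
    metric velocity command. Gains kp/kd are illustrative placeholders."""
    error = np.asarray(target, dtype=float) - np.asarray(current, dtype=float)
    derivative = (error - prev_error) / dt
    cmd = scale * (kp * error + kd * derivative)  # rescale to ~metric units
    return cmd, error
```

Because the scale is only approximate, the velocity command is roughly metric, which is why velocity control works here while precise global position control does not.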

3. Unfortunately, our university does not want us to publish our thesis online. However, the calibration process is not complicated: first, filter out the far-depth regions in the MiDaS depth map; then apply a least-squares fit to the two depth maps from the same frame to calculate the scale; finally, scale the MiDaS depth to fit the VSLAM coordinate system. You can find the presentation slides I used for the project in the README, or check the code at the following link: https://colab.research.google.com/drive/1DBsfeidmBSJjPWljxOosr8kUmDdkgCca?usp=sharing
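The least-squares step above can be sketched as follows. This is my reading of the procedure, not the author's code: I assume both inputs are per-pixel depth maps already in the same image frame, and the far-region threshold (here a quantile) is a placeholder. Note that raw MiDaS output is relative inverse depth, so in practice you would convert it (or fit against inverse depth) before applying this:

```python
import numpy as np

def fit_depth_scale(midas_depth, slam_depth, far_quantile=0.9):
    """Least-squares scale s minimizing ||s * midas - slam||^2 over
    pixels where the SLAM depth is valid and MiDaS is not 'far'."""
    far_threshold = np.quantile(midas_depth, far_quantile)
    mask = (midas_depth < far_threshold) & (slam_depth > 0)
    x, y = midas_depth[mask], slam_depth[mask]
    # closed-form solution of the one-parameter least-squares problem
    return float(np.dot(x, y) / np.dot(x, x))
```

Multiplying the MiDaS map by the returned `s` puts it in the VSLAM coordinate system's (unscaled) units, so the dense depth can feed the planner alongside the sparse SLAM points.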

yeop-giraffe commented 1 month ago

Thank you so much for your kind response! 👍 I understand now. 😊