zhangganlin / GlORIE-SLAM

GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM
https://ganlinzhang.xyz/GlORIE-SLAM/
Apache License 2.0

Real Time Inference? #6

Closed Mio-Atse closed 2 months ago

Mio-Atse commented 2 months ago

Hello, great work.

I'm wondering whether you plan to publish a real-time inference method for custom datasets or direct RGB input. If not, what path would you recommend for building a real-time SLAM algorithm from an RGB camera?

As an extra question, have you considered using the Depth Anything V2 model for depth estimation instead of MiDaS? It can give much better point cloud results, especially in indoor environments.

zhangganlin commented 2 months ago

Hi,

Currently I have no plan for a real-time implementation. In GlORIE-SLAM, the main bottlenecks are the mapping optimization and the point cloud deformation as the map grows larger. For mapping optimization, we currently use a NeRF-like occupancy network to encode the map; one way to speed it up is to replace our map representation with another one, such as Gaussian Splatting. You can check https://github.com/eriksandstroem/Splat-SLAM when it is released. For point cloud deformation, we currently just re-anchor the points whenever the camera poses update and send the updated point cloud to faiss to find the neighboring points. Speeding up this part may require a different neighbor search, one that does not need much computation once the point cloud is deformed.
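
To make the deformation step concrete, here is a minimal sketch of the idea, not the project's actual code. It assumes each point is stored in the coordinate frame of the keyframe it is anchored to, so a pose update only requires re-applying the new camera-to-world transform. The helper names (transform, reanchor, knn_bruteforce) are hypothetical, and the brute-force search is only a pure-Python stand-in for the faiss lookup mentioned above.

```python
from math import dist

def transform(pose, p):
    """Apply a 4x4 camera-to-world pose (nested lists) to a 3D point."""
    return tuple(
        sum(pose[r][c] * p[c] for c in range(3)) + pose[r][3]
        for r in range(3)
    )

def reanchor(points_cam, kf_ids, poses):
    """Recompute world positions from camera-frame points and current poses.

    points_cam: list of (x, y, z) in the frame of the anchoring keyframe
    kf_ids:     keyframe id for each point
    poses:      dict mapping keyframe id -> 4x4 camera-to-world matrix
    """
    return [transform(poses[k], p) for p, k in zip(points_cam, kf_ids)]

def knn_bruteforce(points, query, k):
    """Indices of the k nearest points to `query` (stand-in for a faiss index)."""
    order = sorted(range(len(points)), key=lambda i: dist(points[i], query))
    return order[:k]

def identity():
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Two keyframes; keyframe 1 starts 10 units away along x.
poses = {0: identity(), 1: identity()}
poses[1][0][3] = 10.0
points_cam = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (0.0, 0.0, 1.0)]
kf_ids = [0, 0, 1]

world_before = reanchor(points_cam, kf_ids, poses)
neighbors_before = knn_bruteforce(world_before, (0.0, 0.0, 0.0), 2)

# A pose update moves keyframe 1 to the origin: every point anchored to it
# moves too, so the neighbor structure has to be recomputed.
poses[1] = identity()
world_after = reanchor(points_cam, kf_ids, poses)
neighbors_after = knn_bruteforce(world_after, (0.0, 0.0, 0.0), 2)
```

The sketch shows why this step gets expensive: every pose update can change which points are neighbors, so a naive approach rebuilds the whole search structure each time. The cost of that rebuild grows with the map.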

For Depth Anything V2, I have not tried it myself, but it should be very easy to swap in. See https://github.com/zhangganlin/GlORIE-SLAM/blob/main/src/mono_estimators.py, just changing get_mono_depth_estimator and predict_mono_depth a bit should work.
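
As a rough illustration of what such a swap could look like, the sketch below separates building the estimator from running it, so a new backend only needs one extra factory. Everything here is hypothetical: the function names, the "depth_model" config key, and the placeholder models do not come from the GlORIE-SLAM code base, and the real functions in mono_estimators.py may have different signatures.

```python
# Hypothetical registry of depth-estimator factories, keyed by a config name.
ESTIMATORS = {}

def register(name):
    """Decorator that records a factory under `name`."""
    def wrap(factory):
        ESTIMATORS[name] = factory
        return factory
    return wrap

@register("midas")
def _make_midas(cfg):
    # Placeholder: a real factory would load the MiDaS weights here.
    return lambda image: [[1.0 for _ in row] for row in image]

@register("depth_anything_v2")
def _make_depth_anything_v2(cfg):
    # Placeholder: a real factory would load Depth Anything V2 here.
    return lambda image: [[2.0 for _ in row] for row in image]

def make_depth_estimator(cfg):
    """Build the estimator selected by the (assumed) 'depth_model' key."""
    name = cfg["depth_model"]
    if name not in ESTIMATORS:
        raise ValueError(f"unknown depth model: {name!r}")
    return ESTIMATORS[name](cfg)

def run_depth_estimator(model, image):
    """Run the model and check the depth map matches the image resolution."""
    depth = model(image)
    same_shape = len(depth) == len(image) and all(
        len(d) == len(r) for d, r in zip(depth, image)
    )
    if not same_shape:
        raise ValueError("depth map must match the image resolution")
    return depth

# Usage: switching backends is a one-line config change.
image = [[0, 0, 0], [0, 0, 0]]  # stand-in for a 2x3 image
model = make_depth_estimator({"depth_model": "depth_anything_v2"})
depth = run_depth_estimator(model, image)
```

When doing the swap in practice, it is worth checking that the new model returns values in the same convention as the one it replaces (depth versus inverse depth, output resolution), since the rest of the pipeline presumably relies on it.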