HengyiWang / Co-SLAM

[CVPR'23] Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM
https://hengyiwang.github.io/projects/CoSLAM.html
Apache License 2.0

[Question] How to add support for other stereo cameras #45

Closed sumitanilkshirsagar closed 5 months ago

sumitanilkshirsagar commented 6 months ago

Hi, first of all, amazing work! I am trying to do underwater 3D reconstruction and was wondering if I could use your codebase. Our setup is a ZED2i camera in an underwater housing. We do OpenCV checkerboard calibration, but do not get proper tracking with the ZED SDK's spatial mapping. I think a NeRF-like approach is essential, since it can properly model water, which traditional approaches might fail to do. Please share your thoughts if you find this interesting :)

PS. I think I need to create a new dataset class in https://github.com/HengyiWang/Co-SLAM/blob/main/datasets/dataset.py. Can you explain what that involves?

HengyiWang commented 6 months ago

Hi @sumitanilkshirsagar, thank you for your interest in our work. I am not very familiar with underwater 3D reconstruction, but Co-SLAM may not work in that case, as we model a TSDF instead of a volume density. I would suggest starting with some recent Gaussian-splatting SLAM methods (e.g. https://github.com/spla-tam/SplaTAM) if you want to use neural-field SLAM for underwater tracking and reconstruction. But again, I think you may still need to make a lot of modifications to get it working. And that would be a cool piece of work!
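To make the TSDF-vs-density point concrete, here is a rough sketch of the two ray-weighting schemes. This is illustrative only, not code from this repo; the SDF-to-weight form below follows the style of Azinović et al.'s Neural RGB-D formulation, which Co-SLAM's rendering is based on, and the function names and truncation value are assumptions:

```python
import torch


def sdf_weights(sdf, truncation=0.05):
    """Bell-shaped weights that peak at the zero crossing of a (T)SDF.

    All rendering mass concentrates in a thin band around the surface,
    so a participating medium like water, which scatters light along
    the whole ray, has no way to be represented.
    """
    return torch.sigmoid(sdf / truncation) * torch.sigmoid(-sdf / truncation)


def density_weights(sigma, deltas):
    """Standard NeRF volume-rendering weights from a density field.

    sigma: per-sample densities along a ray; deltas: distances between
    consecutive samples. Any sample along the ray can absorb and emit,
    so semi-transparent media such as water can in principle be modeled.
    """
    alpha = 1.0 - torch.exp(-sigma * deltas)
    transmittance = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:1]), 1.0 - alpha[:-1]]), dim=0)
    return alpha * transmittance
```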

Please feel free to ask if you have any further questions. Good luck!

sumitanilkshirsagar commented 6 months ago

Hi @HengyiWang, thanks for your reply. I will try out the Gaussian-splatting SLAM methods. I just needed an (awesome and recent) 3D reconstruction codebase to start with; from there I want to observe the output of each stage and hopefully figure out which steps of the pipeline need modification for underwater data.

Also, I need a little help: I am having trouble understanding what extra input I need to provide to get the code running. I was under the impression that it is enough to give images and the corresponding depth images, but it seems another file (such as odometry.csv in the case of the iPhone capture in Co-SLAM) is also needed. I am a bit confused about its contents. Does it contain the camera poses? (I have a ZED camera with a built-in IMU whose output I can extract.)

HengyiWang commented 6 months ago

Hi @sumitanilkshirsagar, I see your point. For the dataset, we load the ground-truth (GT) poses for evaluation purposes. However, you do not need them to run Co-SLAM.

If you do not have GT poses, you can simply set them to identity to make it run. For instance, see this part: https://github.com/HengyiWang/Co-SLAM/blob/2329d09cc27133e710cf5e9ea1d9aaa4fa2b6662/datasets/dataset.py#L452
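As a rough illustration, a custom dataset class that returns identity poses might look like the sketch below. The class name, folder layout, depth scale, and returned dict keys are illustrative assumptions, not the actual interface in datasets/dataset.py:

```python
import glob
import os

import cv2
import numpy as np
import torch
from torch.utils.data import Dataset


class CustomRGBDDataset(Dataset):
    """Sketch: load RGB-D frames from a folder, with identity c2w poses."""

    def __init__(self, data_root, depth_scale=1000.0):
        # Assumed layout: data_root/rgb/*.png and data_root/depth/*.png
        self.rgb_paths = sorted(glob.glob(os.path.join(data_root, 'rgb', '*.png')))
        self.depth_paths = sorted(glob.glob(os.path.join(data_root, 'depth', '*.png')))
        self.depth_scale = depth_scale  # e.g. millimetres -> metres

    def __len__(self):
        return len(self.rgb_paths)

    def __getitem__(self, idx):
        rgb = cv2.cvtColor(cv2.imread(self.rgb_paths[idx]), cv2.COLOR_BGR2RGB)
        rgb = torch.from_numpy(rgb.astype(np.float32) / 255.0)
        depth = cv2.imread(self.depth_paths[idx], cv2.IMREAD_UNCHANGED)
        depth = torch.from_numpy(depth.astype(np.float32) / self.depth_scale)
        # No GT trajectory available: hand back an identity pose and let
        # the camera tracking estimate the trajectory from there.
        c2w = torch.eye(4, dtype=torch.float32)
        return {'frame_id': idx, 'rgb': rgb, 'depth': depth, 'c2w': c2w}
```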

For the iPhone dataset, the capture app provides poses obtained by an off-the-shelf SLAM method, and we simply use those as GT for evaluation.