HKUST-3DV / DIM-SLAM

This is the official repo for the ICLR 2023 paper "DENSE RGB SLAM WITH NEURAL IMPLICIT MAPS"

Getting scaleless initialization #16

Closed: drone-teddy closed this issue 8 months ago

drone-teddy commented 8 months ago

Hello, thank you for sharing your great work! I'm interested in using the code for my project, but I don't have two ground truth poses. I've tried running the code without the ground truth poses (setting all the poses to identity), but the SfM doesn't work so well. Is there a way for me to initialize without any prior knowledge of the poses? (I don't need the absolute scale.)

poptree commented 8 months ago

Hi, could you provide the figure of the poses and the rendered image?

drone-teddy commented 8 months ago

Thank you for the prompt response! I've attached the final keyframe poses and the rendered image

With GT poses: [images: 1485, keyframe_ape]

Without GT poses: [images: 1485, keyframe_ape]

I've modified the code to use only the first GT pose, and set the second pose equal to the first pose too (i.e., all the poses are assumed to be at pose 1).

poptree commented 8 months ago

Hi,

Did you change the fix_num parameter during init? The attached figure is the result of setting all camera poses to identity and fix_num=0 (without scale correction). The observation is similar to that on LLFF: the pose converges within the first 1000 iterations, which is much faster than BARF, as I mentioned in the README.

For a correct visualization, it would be best to decrease the lr for the high-resolution grid, since the default lr is too large for the high-resolution grid to be optimized well. Another solution is to reset the grid after 500 iterations and re-optimize the depth from the fake depth.

[image: keyframe_ape]
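
For readers following along, the two suggestions in this comment (a smaller lr for the high-resolution grid, and a grid reset partway through initialization) could be sketched roughly as below. This is a toy, standard-library-only illustration and not the DIM-SLAM code: `FeatureGrid`, `run_initialization`, `grad_fn`, and every lr value are hypothetical names and numbers chosen for the example; only the 500-iteration reset point and the 1000-iteration horizon come from the thread.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FeatureGrid:
    """Toy stand-in for one resolution level of a feature grid (hypothetical)."""
    name: str
    lr: float            # per-grid learning rate
    values: List[float]  # flattened features

    def reset(self) -> None:
        # Discard whatever was learned so far and start again from zeros.
        self.values = [0.0] * len(self.values)


def run_initialization(
    grids: List[FeatureGrid],
    grad_fn: Callable[[FeatureGrid, int], List[float]],
    num_iters: int = 1000,
    reset_at: Optional[int] = 500,
) -> List[FeatureGrid]:
    """Plain gradient descent with one lr per grid and an optional one-time reset."""
    for it in range(num_iters):
        if reset_at is not None and it == reset_at:
            for grid in grids:
                grid.reset()
        for grid in grids:
            grads = grad_fn(grid, it)  # assumed: gradient of the loss w.r.t. grid.values
            grid.values = [v - grid.lr * g for v, g in zip(grid.values, grads)]
    return grids


# Hypothetical learning rates: the high-resolution grid gets a smaller lr
# than the coarse one, as suggested in the comment above.
coarse = FeatureGrid(name="coarse", lr=1e-2, values=[0.0] * 8)
fine = FeatureGrid(name="fine", lr=1e-4, values=[0.0] * 64)
```

In a real pipeline, `grad_fn` would be replaced by the actual loss gradient, and the camera poses would be optimized in the same loop; both are left out here to keep the schedule itself visible.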

poptree commented 8 months ago

Hi,

I ran the whole sequence without the first two camera poses. Figures for images from the middle of the sequence are attached; the depth is recovered only up to scale.

[images: 0221_165, 0316_55]
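
Since the result is only defined up to scale, comparing such a trajectory against GT positions usually needs a scale estimate first. A minimal sketch of one way to do this is the closed-form least-squares scale between centered camera positions. It assumes the two trajectories are already expressed in the same orientation (no rotation alignment is performed), and the function names are hypothetical rather than taken from the repo.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def estimate_scale(est: Sequence[Point], gt: Sequence[Point]) -> float:
    """Scale s minimizing sum ||(gt - mean(gt)) - s * (est - mean(est))||^2."""
    if len(est) != len(gt) or len(est) < 2:
        raise ValueError("need two trajectories of equal length >= 2")
    c_est, c_gt = centroid(est), centroid(gt)
    num = 0.0
    den = 0.0
    for e, g in zip(est, gt):
        for i in range(3):
            de = e[i] - c_est[i]
            dg = g[i] - c_gt[i]
            num += de * dg
            den += de * de
    if den == 0.0:
        raise ValueError("estimated trajectory has no spatial extent")
    return num / den


def align(est: Sequence[Point], gt: Sequence[Point]) -> List[Point]:
    """Apply the estimated scale and move the centroid onto the GT centroid."""
    s = estimate_scale(est, gt)
    c_est, c_gt = centroid(est), centroid(gt)
    return [tuple(s * (e[i] - c_est[i]) + c_gt[i] for i in range(3)) for e in est]


def position_rmse(est: Sequence[Point], gt: Sequence[Point]) -> float:
    """Root-mean-square position error between corresponding points."""
    sq = [sum((a - b) ** 2 for a, b in zip(e, g)) for e, g in zip(est, gt)]
    return (sum(sq) / len(sq)) ** 0.5
```

A typical use would be `position_rmse(align(est, gt), gt)`. When the estimated trajectory lives in an arbitrarily rotated frame, a full similarity alignment (rotation, translation, and scale) would be needed instead of this scale-only version.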

drone-teddy commented 8 months ago

Thank you! I was able to get a better result (I probably need to tune the parameters more). For the paper, did you use the first two ground truth poses, or did you assume the poses to be at identity? I'm curious to know what the trajectory performance would be if I were to try to reimplement your work using NICE-SLAM.

[images: keyframe_ape, 1485]

poptree commented 8 months ago

Hi, as mentioned in the paper, I set the first two camera poses to GT. If you want to set the first two cameras to identity, you should reset/release the feature grid during initialization to avoid overfitting, as I mentioned above.

I will release a version of the whole system, implemented by Xunzhi Zheng@HKUST based on NICE-SLAM, this weekend.

poptree commented 8 months ago

Hi, you can find the system implementation on the "system" branch.