HKUST-3DV / DIM-SLAM

This is the official repo for the ICLR 2023 paper "DENSE RGB SLAM WITH NEURAL IMPLICIT MAPS"

about window optimization #15

Closed llianxu closed 10 months ago

llianxu commented 10 months ago

Thanks for your impressive work! In the initialization phase, only fifteen frames are used, and only the first frame is added to the global keyframe set. However, window optimization requires twenty-one frames. So how should the subsequent sixteenth frame be optimized within the window?

poptree commented 10 months ago

Hi,

Specifically, the first 15 frames will all be added to the global keyframe set, but only one at a time. Frame numbering starts from 0. During initialization, only the first two frames are fixed. After initialization, the first four frames are added to the global frameset, and frames 5-14 plus the 15th frame are added to the 11-frame local window.

During tracking and mapping, you add each new frame to the local window and pop the oldest frame. The popped frame is treated as a new keyframe and inserted into the global frameset if frame index <= 14 or flow >= 20.

At the beginning, since the global keyframe set contains no more than 10 frames, the total number of frames within the optimization window will be less than 21.
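
The bookkeeping described above can be sketched roughly as follows. This is not the repository's code: the names `add_frame`, `optimization_window`, and `flow_of` are hypothetical, the post-initialization state assumes a 0-based reading (frames 0-3 global, frames 4-14 local), and taking the most recent 10 global keyframes is only a placeholder for whatever selection the real implementation uses.

```python
from collections import deque

LOCAL_WINDOW_SIZE = 11      # local window length mentioned in the thread
MAX_GLOBAL_IN_WINDOW = 10   # assumption: 21 total frames minus 11 local frames
LAST_INIT_INDEX = 14        # frames 0..14 are the 15 initialization frames
FLOW_THRESHOLD = 20.0       # flow threshold mentioned in the thread


def add_frame(local_window, global_keyframes, new_idx, flow_of):
    """Push a new frame; if the window overflows, pop and maybe promote the oldest."""
    local_window.append(new_idx)
    if len(local_window) <= LOCAL_WINDOW_SIZE:
        return None
    oldest = local_window.popleft()
    # Assumption: both the index test and the flow test apply to the popped frame.
    if oldest <= LAST_INIT_INDEX or flow_of(oldest) >= FLOW_THRESHOLD:
        global_keyframes.append(oldest)
    return oldest


def optimization_window(local_window, global_keyframes):
    """Frames optimized together: up to 10 global keyframes plus the local window."""
    # Placeholder selection: the most recent global keyframes.
    return global_keyframes[-MAX_GLOBAL_IN_WINDOW:] + list(local_window)


# Assumed state right after initialization (0-based indices).
global_keyframes = [0, 1, 2, 3]
local_window = deque(range(4, 15))

sizes = [len(optimization_window(local_window, global_keyframes))]
for idx in range(15, 27):
    # Hypothetical flow measure that never reaches the threshold.
    add_frame(local_window, global_keyframes, idx, flow_of=lambda i: 0.0)
    sizes.append(len(optimization_window(local_window, global_keyframes)))

print(sizes)
```

Under these assumptions the window starts below 21 frames and grows by one per new frame until the global set holds 10 keyframes, after which it stays at 21.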

llianxu commented 10 months ago

@poptree Hello, thank you for your detailed explanation! I understand now. You mean that the local window only considers frames from k-10 to k, rather than k-5 to k+5, right? I would also like to know whether, in a single-threaded implementation, tracking and mapping are actually combined and optimized within the same function.

poptree commented 10 months ago

Hi,

k-10 to k and k-5 to k+5 are the same implementation. Say there are eleven consecutive frames in the local window set. If you define k as the latest frame in the set, the formulation is k-10 to k. If you use the middle frame as the "k" frame, the frame indices in the local window set are k-5 to k+5. The important thing is that we keep a run of consecutive frames in the local window set, not how we index the frames within it.
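
To make the equivalence concrete, here is a tiny sketch (the helper names `window_latest` and `window_centered` are made up for illustration and do not come from the repository). Both conventions pick the same 11 consecutive frames; only the frame that is called "k" differs.

```python
def window_latest(k, size=11):
    """Indices k-10..k when k denotes the latest frame in the window."""
    return list(range(k - size + 1, k + 1))


def window_centered(k, half=5):
    """Indices k-5..k+5 when k denotes the middle frame in the window."""
    return list(range(k - half, k + half + 1))


# The same 11 frames, described with two different reference frames.
same_frames = window_latest(20) == window_centered(15)
print(same_frames, window_latest(20))
```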

> And I would like to know that in a single-threaded implementation, our tracking and mapping are actually combined and optimized within the same function.

Yes. In the single-thread implementation, the "sfm" function optimizes both the camera poses and the scene, like "photometric bundle adjustment". In the tracking thread of the two-thread implementation, the NeRF parameters are fixed, which makes "sfm" act like "motion-only bundle adjustment".
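
A toy illustration of that difference, assuming a made-up 1D residual `pose + scene - observation` and plain gradient descent. The function `toy_sfm_step` and its `optimize_scene` flag are hypothetical and are not the repository's "sfm" function; the only point is that the same step either updates both parameter groups or leaves the scene frozen.

```python
def loss(poses, scene, observations):
    """Sum of squared toy residuals r_i = pose_i + scene - observation_i."""
    return sum((p + scene - o) ** 2 for p, o in zip(poses, observations))


def toy_sfm_step(poses, scene, observations, lr=0.1, optimize_scene=True):
    """One gradient-descent step; the scene is updated only if optimize_scene is True."""
    residuals = [p + scene - o for p, o in zip(poses, observations)]
    # Gradient of the loss w.r.t. pose_i is 2 * r_i.
    new_poses = [p - lr * 2.0 * r for p, r in zip(poses, residuals)]
    if optimize_scene:
        # Gradient of the loss w.r.t. the shared scene parameter is 2 * sum(r_i).
        scene = scene - lr * 2.0 * sum(residuals)
    return new_poses, scene


poses, scene, obs = [0.0, 0.0], 0.0, [1.0, 2.0]

# Single-thread style: poses and scene are optimized together.
joint_poses, joint_scene = toy_sfm_step(poses, scene, obs, optimize_scene=True)

# Tracking-thread style: the scene is held fixed, only poses move.
track_poses, track_scene = toy_sfm_step(poses, scene, obs, optimize_scene=False)

print(loss(poses, scene, obs),
      loss(joint_poses, joint_scene, obs),
      loss(track_poses, track_scene, obs))
```

In a PyTorch codebase the same switch is usually expressed by excluding the scene parameters from the optimizer or disabling their gradients, but how this repository does it is not stated in the thread.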

llianxu commented 10 months ago

I see, thank you very much for your meticulous and patient reply! I wish you all the best.