JakobEngel / dso

Direct Sparse Odometry
GNU General Public License v3.0
2.27k stars 906 forks source link

idepth_zero and idepth of PointHessian are set at the same time #225

Open amberwood31 opened 3 years ago

amberwood31 commented 3 years ago

Hey, dear maintainer

I noticed that the idepth_zero and idepth of PointHessian are set at the same time in doStepFromBackup, which caused the deltaF of EFPoint to be always Zero. This would make the residuals of point idepth to be always Zero, which is puzzling to me.

Can you kindly let me know what's the logic behind this? I also noticed the specific line that sets the idepth_zero in doStepFromBackup comes from a later commit named "added a whole lot of dead code & unused variables". Maybe that line was a mistake?

Looking forward to your answer and your help is highly appreciated!

tanjunyao7 commented 3 years ago

idepth_zero corresponds to prior idepth information which we usually don't have. idepth - idepth_zero(deltaF) quantifies how far away from the prior knownledge(ground truth) we are. To make deltaF 0, idepth_zero should be set equal to idepth.

amberwood31 commented 3 years ago

Hi, thanks for your insight!

What you were saying was that since the prior is unknown, so deltaF should be zero to disable this part of information, right? This makes sense.

But the issue is, in PointFrameResidual::linearize function, the idepth_zero_scaled ( which is equal to idepth_zero since the scale setting is 1 by default) was used to compute the geometric terms of the Jacobian. This means the evaluation point of these Jacobians is changed at every optimization iteration. However, the DSO paper claimed that the geometric terms of the Jacobians are evaluated at x=0 to prevent adding linearization around different evaluation points.

In my opinion, setting idepth_zero in doStepFromBackup contradicted this claim...

tanjunyao7 commented 3 years ago

Hi, thanks for your insight!

What you were saying was that since the prior is unknown, so deltaF should be zero to disable this part of information, right? This makes sense.

But the issue is, in PointFrameResidual::linearize function, the idepth_zero_scaled ( which is equal to idepth_zero since the scale setting is 1 by default) was used to compute the geometric terms of the Jacobian. This means the evaluation point of these Jacobians is changed at every optimization iteration. However, the DSO paper claimed that the geometric terms of the Jacobians are evaluated at x=0 to prevent adding linearization around different evaluation points.

In my opinion, setting idepth_zero in doStepFromBackup contradicted this claim...

Hi, you are argueing that updating the linearization point of point depth at every optimization iteration violates the First Estimates Jacobean. That is indeed how it should be. As far as I understand, the FEJ principle arises from the observation that the overall objective function composes of two parts namely the factors in the current sliding window and the factors from the marginalized states. These two kinds of factors share common states which are those that are connected to the marginalized states at last step, i.e. all the FrameHessians in the active sliding window except for the most recent one. Because of this, their linearization points should always stay the same with what they were in the marginalized factors, otherwise inconsistancy is introduced into the system. Therefore the linearization point of FrameHessian is not changed at iterations. As for PointHessian, note that a point is connected to only one frame i.e. the frame that it resides. Once a PointHessian is marginalized, it never occurs further in subsequent sliding window: it dosen't introduce prior information for subsequent poses, nor it introduces prior information for subsequent points. Therefore it should be iteratively updated.

amberwood31 commented 3 years ago

Hi, thanks for your insights!

I agree with your statement on why fixing the linearization point is necessary after marginalization, however, I'm not sure whether that's the whole picture of FEJ used here, for these reasons:

  1. If FEJ is only about making sure the two parts in the objective function share the same linearization point after marginalization, why fix it from the very beginning? Shouldn't it only be fixed after the marginalization and the related Jacobian evaluation point fixed to the marginalization state? If I understood the code correctly, the linearization point of FrameHessian isn't updated any more after being added as a key frame, from https://github.com/JakobEngel/dso/blob/ae1d0b3d972367b0a384e587852bedcf5f414e69/src/FullSystem/FullSystemOptimize.cpp#L553 . This specific frame most probably won't be marginalized immediately, at least not until the sliding window contained enough keyframes or there's a sudden camera turn or brightness changes. So why fix the linearization point here?

  2. this is more about the way the code was written. If the idepth_zero was supposed to be set at every optimization iteration, I don't see why this variable existed in the first place, since it's exactly the same with the idepth variable. Also, in the PointFrameResidual::linearize function, projectPoint function were called twice, one using the frameHessian's fixed linearization point and pointHessians's idepth_zero, the other using the frameHessian's current state and pointHessian's idepth. The former was used to compute Jacobian of the pose, idepth and camera parameters, and the latter of the affine parameters and JI. In my opinion, this implementation is in conflict with the claims in the paper, "both $J{photo}$ and $J{geo}$ are evaluated at x = 0.", with $J{geo}$ including the point idepth and $J{photo}$ including the affine parameters. The specific way this function was structured gave me the impression that point idepth's linearization point were supposed to be handled in the same way with frameHessian, despite agreeing with you that yes, the points that are still left in the sliding window won't be part of the previously marginalized terms.

  3. In the DSO paper, two references were mentioned when discussing FEJ, "Keyframe-based visual-inertial odometry using nonlinear optimization" and "A First-Estimates Jacobian EKF for Improving SLAM Consistency". First ref advocated using the estimates at the marginalization point to fix the linearization point, which is the scenario we've been discussing so far. However, the second ref advocated "using the first-ever available estimates". This philosophy seems to be more aligned with the implementation I mentioned in my 1st argument, i.e., fixing the frame's linearization point immediately after adding it as a key frame.

In summary, there seems to be two different interpretation for FEJ: fixing the linearization point of variables that are related to marginalized variables after marginalization, or fixing the linearization point of some variables after literally their first estimate. I can't decide which one was applied in DSO...

tanjunyao7 commented 3 years ago
  1. Right after the new keyframe is optimized in the backend, some old key frame will be marginalized in general case, after which all the rest keyfame's(including the new one) linearization point should be fixed. If we calculate the marginalization factor using the current estimate of the new frame and after that set the linearization point equal to the current estimate, then it's the same as setting the linearization point before marginalization.

This specific frame most probably won't be marginalized immediately, at least not until the sliding window contained enough keyframes or there's a sudden camera turn or brightness changes.

Yes, but it is connected to the one which will be marginalized immediately. Its state estimate will be immediately introduced in the marginalization factor. So it's needed to fix it here.

  1. I guess the author keeps unused variables in the code to remind readers how the optimization should be performed , especially the rules of FEJ. In DSO's implementation, all the points are marginalized before their host frame is marginalized. What if one wants to marginalize only the frame while keeps its points(even though nobody does this since it destroys the sparsity pattern of the hessian structure)? In this case, the estimate and linearization point of point depth will be different.

The former was used to compute Jacobian of the pose, idepth and camera parameters, and the latter of the affine parameters > > and JI.

No, the latter was used to compute JI and residual. $J_{photo}$ is calculated by float drdA = (color[idx]-b0); where b0 is the linearization point of affine parameters.

  1. I don't know how filter-based slams do marginalization. In fact in DSO a keyframe's linearization point is set twice, once after the coarse tracking and once after the backend optimization. Your statement "fixing the frame's linearization point immediately after adding it as a key frame" is the same as "fixing the frame's linearization point immediately before it's included in the marginalization factor". I have explained it in 1.

by the way, my first comments that idepth_zero corresponds to prior idepth information is wrong.

amberwood31 commented 3 years ago

Thanks a lot! That was very helpful!

  1. For the frame marginalization, I was speaking from my observation of ICL dataset, where frames don't get marginalized until the sliding window had 7 key frames. Also the setting_minFrames was set to 5 by default, so no frame would be marginalized until that threshold was reached. But I guess you're right that for other dataset, if the setting was changed, it's possible that some old frame could be marginalized right away. It depended on a bunch of heuristic conditions after all.

  2. You're right. I obviously missed the b0 definition. Thanks for pointing that out.

  3. I'm also not familiar with the filter-based SLAM, just came across that paper from DSO's references.

I have another question related to the mode 1 of `AccumulatedTopHessianSSE::addPoint'. Hopefully you can also share your insight https://github.com/JakobEngel/dso/issues/226