JakobEngel / dso

Direct Sparse Odometry
GNU General Public License v3.0

strange marginalization factor added to windowed optimization #224

Open tanjunyao7 opened 4 years ago

tanjunyao7 commented 4 years ago

Hi, according to the paper, the marginalized factor can be represented by

E = 2x^(T)b + x^(T)Hx

where b and H correspond to `bM` and `HM` in the code, respectively, and x is the current state delta (the so-called `state_minus_state_zero`). This can be seen in the member function `calcMEnergyF()` of class `EnergyFunctional`, which computes the energy of the marginalized states:

```cpp
double EnergyFunctional::calcMEnergyF()
{
    assert(EFDeltaValid);
    assert(EFAdjointsValid);
    assert(EFIndicesValid);

    VecX delta = getStitchedDeltaF();
    return delta.dot(2*bM + HM*delta);
}
```

This implies that the Gauss-Newton system of the marginalized energy can be represented by :

HM*delta = bM

Therefore, when solving the total system, we should add HM to HFinal_top and bM to bFinal_top, shouldn't we? However, what the code actually does is

```cpp
HFinal_top = HT_act + HM;
bFinal_top = bT_act + bM_top;
```

with

```cpp
bM_top = (bM + HM * getStitchedDeltaF());
```

instead of `bM_top = bM;`.

It's not clear to me what `HM * getStitchedDeltaF()` describes. Does anyone have an idea? @NikolausDemmel @JakobEngel

studennis911988 commented 4 years ago

Since DSO applies FEJ (First Estimate Jacobians) to the system, the marginalized Hessian (HM) is kept at the same linearization point, but we can still use a first-order Taylor expansion to keep updating bM via `bM_top = (bM + HM * getStitchedDeltaF());`

NikolausDemmel commented 4 years ago

Yes, see also the paragraph just before "V. VISUAL-INERTIAL MAPPING" in this paper, which describes the same idea w.r.t. marginalization: https://arxiv.org/pdf/1904.06504.pdf

tanjunyao7 commented 4 years ago

Thanks @studennis911988 and @NikolausDemmel. It's a good idea to update the vector b using H.

biggiantpigeon commented 2 years ago

I still have a question. It looks like `bM_top = (bM + HM * getStitchedDeltaF());` updates only the b part of b' (see the equation below), but what about the H*x0 part? And what is x0? I think x0 is the increment in this optimization (in the paper: "where x0 denotes the current value (evaluation point for r) of x"), so it should not be zero, and it will change in every optimization iteration.

[image: equation defining the marginalization prior with b' and c' terms]

However, looking at how res_toZeroF is calculated in fixLinearizationF(), I guess the -H*x0 part is handled in res_toZeroF, during the last optimization. But how can you know the increment of this optimization at the time of the last optimization? This has puzzled me for many days! Can someone help me with it? @tanjunyao7 @NikolausDemmel @studennis911988

tanjunyao7 commented 2 years ago


x0 is the fixed linearization point, so it's a constant. H is computed at x0, so it is also constant. b is Jᵀ*r, where the Jacobian J is constant but the residual r is not. That's why the terms containing only x0 and H are grouped together into the c' term.

biggiantpigeon commented 2 years ago


Thanks for replying so soon!

I agree that J and H are constant and r is not, but I'm not sure x0 is constant.

Whether something is constant depends on which part of the process you look at. H and J are constant over a whole optimization for FEJ parameters, but the c' term is "constant" only in the sense that it does not take part in the derivative of E', so you can neglect it when optimizing E' within one iteration. c' is not constant over time, otherwise how could b be part of it? And if c' is not constant, x0 may not be either.

Let's suppose you are right that x0 is the fixed linearization point. Could you kindly explain the calculation of res_toZeroF in fixLinearizationF()? I can see that res_toZeroF = resF - J*delta, where J and resF are the new Jacobian and residual after this optimization and delta is the accumulated increment of this optimization, and that bM = Jᵀ*res_toZeroF. So the residual is rolled back to the beginning of this optimization, and then this bM is used for the next optimization? I can't think this through.

I'm still a beginner in SLAM, please correct me if I'm wrong. Thanks again.

NikolausDemmel commented 2 years ago

I don't know the details of the DSO code, but just a few general comments: "delta" should be the difference between the linearization point and the current state. This is constant within one GN iteration (same as H and b). That is why we can ignore the c / c' terms when solving the linear system: they do not affect the minimizer. delta does change across iterations, so we do need to take it into account when computing b' in every iteration, but we do not need to keep track of the constant energy terms. (One place where you have to be careful about this is when comparing the cost in one LM iteration before and after applying the GN update, to check whether the cost decreased in order to update the LM damping.) Note that dropping c means the computed marginalization energy could even be negative (but that's fine).

Maybe these code comments in Basalt can help to clarify (it may use slightly different notation, but otherwise the principle of the marginalization prior and FEJ should be the same as in DSO). This part of the code computes the contribution of the marg. prior (stored as H and b at the linearization point, i.e. delta == 0) to the current normal equations (H and b) and also to the current cost. Consider only the branch `mld.is_sqrt == false`: https://gitlab.com/VladyslavUsenko/basalt/-/blob/6d55fdab9ff2e6901dbcab634cf5025c58ad7371/src/vi_estimator/ba_base.cpp#L424-453

I hope this helps a bit.