rpng / open_vins

An open source platform for visual-inertial navigation research.
https://docs.openvins.com
GNU General Public License v3.0

Questions about Mahalanobis distance check #215

Closed lewisjiang closed 2 years ago

lewisjiang commented 2 years ago

Dear authors,

I have some questions about the Mahalanobis distance check (or chi square check) for feature rejection in OpenVINS. If I understand it correctly, the process is something like this in your code:

```cpp
// 1) residual: measured pixel location minus estimated pixel location
res = r_meas - r_est;

// 2) normalize by the innovation covariance S = H_x * P_x * H_x^T + R and square
chi2 = res.transpose() * (H_x * P_x * H_x.transpose() + R).inverse() * res;

// 3) compare against the chi-square table, etc.
...
```
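To make the steps above concrete, here is a minimal runnable sketch of the gating check (in Python for illustration, not the actual OpenVINS C++; the threshold 5.991 is the standard 95th-percentile chi-square value for 2 degrees of freedom, matching a 2-D pixel residual):

```python
def mahalanobis_gate(res, S, chi2_thresh):
    """Return (chi2, accepted) for a 2-D residual res and 2x2 innovation cov S."""
    # invert the 2x2 innovation covariance S = H_x * P_x * H_x^T + R
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    inv = [[ S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det,  S[0][0] / det]]
    # chi2 = res^T * S^-1 * res
    chi2 = sum(res[i] * inv[i][j] * res[j] for i in range(2) for j in range(2))
    return chi2, chi2 < chi2_thresh

# 95th percentile of chi-square with 2 dof
CHI2_95_2DOF = 5.991

# a 2-pixel residual with unit pixel noise passes the gate
print(mahalanobis_gate([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], CHI2_95_2DOF))
# a 5-pixel residual with the same covariance is rejected
print(mahalanobis_gate([5.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], CHI2_95_2DOF))
```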

My question is this. On the one hand, in a typical VINS system without global measurements or loop closures to revisit old places, the covariance of the state grows without bound, i.e. P_x in 2) keeps getting "larger". On the other hand, we know from practice that the visual measurement residual res in 1) stays at a few pixels no matter how the state covariance grows. As a result, chi2 in 2) becomes smaller and smaller, making it increasingly unlikely that the system rejects any features as time passes.
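The effect described above is easy to verify numerically. A scalar sketch with made-up numbers (Python for illustration): for a fixed residual, inflating the state term in S = HPH^T + R drives the normalized distance toward zero.

```python
def scalar_chi2(res, hph, R):
    """Scalar Mahalanobis distance squared: res^2 / (HPH^T + R)."""
    return res ** 2 / (hph + R)

res = 2.0  # fixed 2-pixel residual
R = 1.0    # pixel noise variance
# hypothetical HPH^T values growing over time
for hph in (0.0, 9.0, 99.0):
    print(hph, scalar_chi2(res, hph, R))
```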

In my humble opinion, the x-related terms should be dropped from 2). The theory I can think of to back this up is Bayesian inference: when we take a measurement, we do not draw a sample from $p(y)$ but from $p(y|x)$, so the residual (or, similarly, the innovation) should follow a Gaussian distribution with covariance R. I hope to get your comment on this explanation.

I also read about this distance check in several papers from your group (e.g. LIC-Fusion, MINS), but I didn't find an explanation of why the middle term on the rhs of 2) should take that form. I guess it is directly propagated from $z = H_x \delta x + n$.

Hope to get your reply. Thanks!

goldbattle commented 2 years ago

Hi. This comes from the traditional Kalman filter "gating" test. The way I like to think of it (I might be wrong) is in terms of the normalized estimation error squared (NEES). Your current best guess of the "error" is the difference between the measured and predicted values. This error is expected to lie within the combined uncertainty of your state (i.e. your HPH^T) and of the measurement itself (i.e. the measurement noise R). These two combined should capture the error, and if they do not, you should reject the measurement as an outlier.

The HPH^T should be there, as your predicted measurement can be extremely wrong if your state is bad.

Additionally, remember that even if your state uncertainty (pose in the global frame) is very large, this does not mean that your HPH^T will be large. For example, if the measurement is a camera bearing between two frames, the global uncertainty should not matter; only the relative uncertainty between the two frames does. This is exactly what HPH^T computes: it propagates the global-state uncertainty into the measurement space.
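This point can be illustrated with a toy example (Python, made-up numbers): two scalar states with huge but strongly correlated marginal variances, and a measurement Jacobian H = [1, -1] that only observes their difference. The projected covariance HPH^T stays small.

```python
def project_cov(H, P):
    """Scalar H * P * H^T for a 1x2 Jacobian H and 2x2 covariance P."""
    return sum(H[i] * P[i][j] * H[j] for i in range(2) for j in range(2))

H = [1.0, -1.0]                 # relative measurement between the two states
P = [[100.0, 99.9],             # each state is very uncertain globally,
     [99.9, 100.0]]             # but the two are highly correlated

# 100 - 99.9 - 99.9 + 100 = 0.2: tiny despite variances of 100
print(project_cov(H, P))
```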

lewisjiang commented 2 years ago

Thank you for the reply; I see where my problem is.