mobeets closed this issue 7 years ago
well, as far as dealing with rank-deficiency goes, in practice I run:

```matlab
es = eig(nancov(D1), nancov(D2), 'qz');
err = real(sqrt(sum(log(es).^2)));
```

where the `qz` and `real` parts make it numerically stable.
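For reference, here is a Python sketch of the same computation (my own translation, not the author's code; `D1`/`D2` are assumed to be samples-by-features arrays, and unlike `nancov`, `np.cov` does not skip NaNs, so complete data is assumed here):

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(0)
D1 = rng.standard_normal((200, 5))  # hypothetical data, samples x features
D2 = rng.standard_normal((200, 5))

# MATLAB's nancov treats columns as variables; np.cov needs rowvar=False
# for the same convention. (np.cov does NOT ignore NaNs like nancov does.)
C1 = np.cov(D1, rowvar=False)
C2 = np.cov(D2, rowvar=False)

# scipy.linalg.eig solves the generalized problem C1 v = lambda C2 v using
# the QZ algorithm, matching MATLAB's eig(A, B, 'qz'). With right=False it
# returns only the eigenvalues.
es = eig(C1, C2, right=False)

# Taking the real part at the end guards against tiny imaginary components
# from numerical noise, as in the MATLAB version.
err = np.real(np.sqrt(np.sum(np.log(es) ** 2)))
print(err)
```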
to make this new version more intuitive: consider comparing a covariance matrix C to the identity covariance I (the unit circle).
you might be tempted at first to use trace or det. but think about all matrices with the same trace, or det, as I: they can be really long and skinny, as long as the sum (trace) or product (det) of their eigenvalues matches that of I.
instead, we'd like the ratios of the eigenvalues to be equal, so we take the log of the eigenvalues, and then square so that deviations in either direction are treated equally: sum_i ln(λ_i)^2
and this is the metric, except we use the generalized eigenvalues of the two matrices A and B, and we take the sqrt at the very end.
From this paper [section 5 here], it seems that the Riemannian metric for covariance matrices (from here) is ideal for many reasons:
The current version I'm using (from here, by Garcia) only hits the first point. It is not invariant to scaling, which probably explains why some sessions are outliers in terms of their covariance error. It's also not commonly used.
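A quick numeric check of the invariance property of the Riemannian metric (my own sketch, with hypothetical SPD matrices): applying the same invertible transform M to both covariances, which includes plain rescaling, leaves the distance unchanged, since det(M A Mᵀ − λ M B Mᵀ) = det(M)² det(A − λB) has the same roots λ.

```python
import numpy as np
from scipy.linalg import eig

def riemannian_dist(A, B):
    # sqrt(sum_i ln(lambda_i)^2) over the generalized eigenvalues of (A, B)
    es = np.real(eig(A, B, right=False))
    return np.sqrt(np.sum(np.log(es) ** 2))

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5)); A = X @ X.T + 5 * np.eye(5)  # SPD
Y = rng.standard_normal((5, 5)); B = Y @ Y.T + 5 * np.eye(5)  # SPD

M = rng.standard_normal((5, 5)) + 5 * np.eye(5)  # invertible transform

d0 = riemannian_dist(A, B)
d1 = riemannian_dist(M @ A @ M.T, M @ B @ M.T)
print(d0, d1)  # equal up to numerical error
```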
Two problems with the Riemannian metric, though, which the Garcia version doesn't share, are: