mobeets closed this issue 7 years ago
well, as far as dealing with rank-deficiency goes, in practice I run:

```matlab
es = eig(nancov(D1), nancov(D2), 'qz');
err = real(sqrt(sum(log(es).^2)));
```

where the `qz` and `real` parts make it numerically stable.
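For reference, here is a Python sketch of the same computation (my own translation, not the author's code; `D1`/`D2` are assumed to be samples-by-features arrays, and unlike `nancov`, `np.cov` does not skip NaNs, so complete data is assumed here):

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(0)
D1 = rng.standard_normal((200, 5))  # hypothetical data, samples x features
D2 = rng.standard_normal((200, 5))

# MATLAB's nancov treats columns as variables; np.cov needs rowvar=False
# for the same convention. (np.cov does NOT ignore NaNs like nancov does.)
C1 = np.cov(D1, rowvar=False)
C2 = np.cov(D2, rowvar=False)

# scipy.linalg.eig solves the generalized problem C1 v = lambda C2 v using
# the QZ algorithm, matching MATLAB's eig(A, B, 'qz'). With right=False it
# returns only the eigenvalues.
es = eig(C1, C2, right=False)

# Taking the real part at the end guards against tiny imaginary components
# from numerical noise, as in the MATLAB version.
err = np.real(np.sqrt(np.sum(np.log(es) ** 2)))
print(err)
```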
to make this new version more intuitive: consider comparing a covariance matrix C to the identity covariance I (the unit circle).
you might be tempted at first to use trace or det. but think about all matrices with the same trace, or det, as I: they can be really long and skinny, as long as the sum (trace) or product (det) of their eigenvalues matches that of I.
instead, we'd like the ratios of the eigenvalues to be equal, so we take the log of the eigenvalues, and then square so that deviations in either direction are treated equally: sum_i ln(λ_i)^2
and this is the metric, except we use the generalized eigenvalues of the two matrices A and B, and we take the sqrt at the very end.
From this paper [section 5 here], it seems that the Riemannian metric for covariance matrices (from here) is ideal for many reasons:
The current version I'm using (from here, by Garcia) only hits the first point. It is not invariant to scaling, which probably explains why some sessions are outliers in terms of their covariance error. It's also not commonly used.
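A quick numeric check of the invariance property of the Riemannian metric (my own sketch, with hypothetical SPD matrices): applying the same invertible transform M to both covariances, which includes plain rescaling, leaves the distance unchanged, since det(M A Mᵀ − λ M B Mᵀ) = det(M)² det(A − λB) has the same roots λ.

```python
import numpy as np
from scipy.linalg import eig

def riemannian_dist(A, B):
    # sqrt(sum_i ln(lambda_i)^2) over the generalized eigenvalues of (A, B)
    es = np.real(eig(A, B, right=False))
    return np.sqrt(np.sum(np.log(es) ** 2))

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5)); A = X @ X.T + 5 * np.eye(5)  # SPD
Y = rng.standard_normal((5, 5)); B = Y @ Y.T + 5 * np.eye(5)  # SPD

M = rng.standard_normal((5, 5)) + 5 * np.eye(5)  # invertible transform

d0 = riemannian_dist(A, B)
d1 = riemannian_dist(M @ A @ M.T, M @ B @ M.T)
print(d0, d1)  # equal up to numerical error
```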
Two problems with the Riemannian metric, though, which the Garcia version doesn't share, are: