MathMarEcol / pdyer_aus_bio

GNU General Public License v3.0
0 stars 0 forks source link

Fix numerical issues in matrix determinants #25

Open PhDyellow opened 1 year ago

PhDyellow commented 1 year ago

predict_gfbootstrap.R will occasionally fail when all determinants come back as 0.

Mathematically that would happen if

  1. The matrix is not full rank: some rows/cols are linear combinations of other rows/cols.
  2. Some columns or rows are entirely 0.
  3. The matrix is not invertible for some reason.

It can also happen for numerical reasons.

I have examined a few matricies, and the matrices were full rank, and no rows or cols were all 0.

I suspect numerical issues. I increased the numerical resolution to doubles, but the problem remained.

Because I know the matrix is supposed to be symmetric, I can use linalg_eigh which did succeed in returning the eigenvectors and eigenvalues.

mat_eig <- torch::linalg_eigh(mat)
inv <- mat_eig[[2]]$multiply(mat_eig[[1]])$multiply(mat_eig[[2]]$transpose(1,2))
det <- mat_eig[[1]]$prod()

However, this is much slower, and not well integrated into the rest of the code, so for now surveys that have this problem will just return NA, not fitted.

PhDyellow commented 9 months ago

This issue was most likely caused by only bootstrapping 10 times for testing.

trying to estimate a covariance matrix among 29 dimensions with only 10 samples is underdetermined.

Numerical errors were probably the only reason some matrices worked at all.

Further testing suggested that around 300 bootstraps were needed before the number of bioregions became uncorrelated with the number of bootstraps.