PhilBoileau / cvCovEst

An R package for assumption-lean covariance matrix estimation in high dimensions
https://philboileau.github.io/cvCovEst/

Nonlinear shrinkage estimator breaks in high n and p #23

Closed nhejazi closed 4 years ago

nhejazi commented 4 years ago

With larger n and p, some estimators become unstable. For example, on a simple 400 x 400 dataset, the nonlinear shrinkage estimator fails with:

r$> nlShrinkLWEst(data_in)
Error in u %*% diag(d_tilde) : non-conformable arguments

This is not a big problem in itself, but it exposes a design safety issue: when this estimator is included in the selector library, the whole cross-validated selector fails. We should add safeguards so that one or more failing estimators do not bring the whole selector down; that is, the CV selection procedure should return the optimal choice among the estimators that can be fit.
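One possible shape for such a guard, as a minimal sketch only: wrap each estimator call in tryCatch() and drop the ones that error before selecting among the rest. The helper safe_fit and the toy estimator list below are hypothetical names for illustration and are not part of cvCovEst.

```r
# Hypothetical helper (not part of cvCovEst): run one estimator and
# return NULL with a warning instead of propagating the error.
safe_fit <- function(estimator, dat) {
  tryCatch(
    estimator(dat),
    error = function(e) {
      warning("Estimator failed and was skipped: ", conditionMessage(e),
              call. = FALSE)
      NULL
    }
  )
}

# Toy library: the sample covariance plus a stand-in that always fails.
estimators <- list(
  sample_cov = function(dat) stats::cov(dat),
  always_fails = function(dat) stop("non-conformable arguments")
)

set.seed(1)
dat <- matrix(rnorm(200), nrow = 20, ncol = 10)

# Fit everything, then keep only the estimators that could be fit.
fits <- suppressWarnings(lapply(estimators, safe_fit, dat = dat))
fits <- Filter(Negate(is.null), fits)
names(fits)
```

Only the surviving fits would then be passed on to the loss computation, so a single broken estimator costs one candidate rather than the whole selection run.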

PhilBoileau commented 4 years ago

Agh, that’s my bad. I must have messed up a control statement somewhere in nlShrinkLWEst. I’ll look into it.

That’s a good point regarding error handling. Do you have suggestions on how to best implement this? Any examples we can pull from one of your other packages?

PhilBoileau commented 4 years ago

It looks like the issue is on line 354 of ‘R/estimators.R’. Instead of if (p<= n) {, I think the line should read if (p <= eig_nonzero_tol) {. I can’t confirm that this is the cause without testing, though.
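For context on the error message itself: "non-conformable arguments" is what %*% reports when the number of columns on the left does not match the number of rows on the right. The toy snippet below reproduces that message with made-up dimensions; u, d_ok and d_short are illustrative stand-ins and are not taken from the package code.

```r
# Toy dimensions, not the package's actual variables.
p <- 4
u <- diag(p)              # stand-in for a p x p matrix of eigenvectors
d_ok <- rep(1, p)         # one value per column of u
d_short <- rep(1, p - 1)  # one value too few

# Conformable: (p x p) %*% (p x p)
ok <- u %*% diag(d_ok)
dim(ok)

# Not conformable: (p x p) %*% ((p - 1) x (p - 1))
msg <- tryCatch(
  u %*% diag(d_short),
  error = function(e) conditionMessage(e)
)
msg
```

A branch that builds d_tilde at a different length than the number of columns in u would fail in this way, which is consistent with a faulty control statement being the cause.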

nhejazi commented 4 years ago

Confirmed that this fix works and patched in https://github.com/PhilBoileau/cvCovEst/pull/24. Will merge as soon as CI checks are complete.

nhejazi commented 4 years ago

Resolved by #24