Closed travc closed 8 years ago
The error is saying that the parametric covariance matrix can't be inverted. This has happened in a few runs I've done. Basically the geogenetic distance between a pair (or more) of your samples become too small, with very small nugget parameters as well, and the parametric covariance comes close to being degenerate. At that point, the calculation of the likelihood becomes numerically unstable, and the train kind of goes off the tracks (e.g., see how large the likelihood values are leading up to the MCMC breaking).
There are a few possible fixes: 1) Because this is a stochastic algorithm, it's possible that if you run the analysis again it won't happen again (fingers crossed!).
2) If the offending samples are, in fact, very genetically similar you could consider dropping on or the other, or lumping them into a single sample by summing their rows in the 'ac' matrix and the 'ss' matrix.
3) You could consider running with the "round.earth" option set to FALSE. Your samples are all from quite close to one another, so the curvature of the Earth probably isn't that big a deal, and it's possible estimating locations on a sphere is contributing to this issue.
Whatever option you choose, I'd recommend doing something to try to catch the output if the run fails (e.g., tryCatch, options(error=recover) if it's an interactive session, etc.) to help diagnose the problem.
Suggestion: when this happens, catch the error and save out the relevant information (current state, seed?).
On Wed, Jun 1, 2016 at 1:02 PM, gbradburd notifications@github.com wrote:
The error is saying that the parametric covariance matrix can't be inverted. This has happened in a few runs I've done. Basically the geogenetic distance between a pair (or more) of your samples become too small, with very small nugget parameters as well, and the parametric covariance comes close to being degenerate. At that point, the calculation of the likelihood becomes numerically unstable, and the train kind of goes off the tracks (e.g., see how large the likelihood values are leading up to the MCMC breaking).
There are a few possible fixes: 1) Because this is a stochastic algorithm, it's possible that if you run the analysis again it won't happen again (fingers crossed!).
2) If the offending samples are, in fact, very genetically similar you could consider dropping on or the other, or lumping them into a single sample by summing their rows in the 'ac' matrix and the 'ss' matrix.
3) You could consider running with the "round.earth" option set to FALSE. Your samples are all from quite close to one another, so the curvature of the Earth probably isn't that big a deal, and it's possible estimating locations on a sphere is contributing to this issue.
Whatever option you choose, I'd recommend doing something to try to catch the output if the run fails (e.g., tryCatch, options(error=recover) if it's an interactive session, etc.) to help diagnose the problem.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/gbradburd/SpaceMix/issues/10#issuecomment-223107723, or mute the thread https://github.com/notifications/unsubscribe/AA_26cIORzrSzWBRkEMo9gxCqLRsAVC7ks5qHeVGgaJpZM4Ir1AL .
Thanks for the explanation and suggestions. Makes sense. Feel free to close unless you want to add in some more explicit error catching for the condition.
I've got no clue what caused this error. It came up rather late into the LongRun after successfully doing 10 fast runs. The data set is 100K SNPs in something like 45 samples.
The command was:
data.gzip.zip