gbradburd / SpaceMix

29 stars 10 forks source link

"system computationally singular" crash #10

Closed travc closed 8 years ago

travc commented 8 years ago

I've got no clue what caused this error. It came up rather late into the LongRun after successfully doing 10 fast runs. The data set is 100K SNPs in something like 45 samples.

...
652800  ----  8.926476e+14 
652900  ----  8.926476e+14 
653000  ----  8.926476e+14 
653100  ----  8.926476e+14 
Error in solve.default(par.cov) : 
  system is computationally singular: reciprocal condition number = 9.67127e-17

The command was:

require(SpaceMix)

# load data
# defines ac (allele_counts), ss (sample_size), and loc (lng, lat) matricies
load('data.gzip')
# Need to transpose
ac <- t(ac)
ss <- t(ss)

run.spacemix.analysis(n.fast.reps = 10,
                        fast.MCMC.ngen = 1e5,
                        fast.model.option = "target",
                        long.model.option = "target",
                        data.type = "counts",
                        sample.frequencies=NULL,
                        mean.sample.sizes=NULL,
                        counts = ac, # our data
                        sample.sizes = ss, # our data
                        sample.covariance=NULL,
                        target.spatial.prior.scale=NULL,
                        source.spatial.prior.scale=NULL,
                        spatial.prior.X.coordinates = loc[,1], # our data
                        spatial.prior.Y.coordinates = loc[,2], # our data
                        round.earth = TRUE, # changed
                        long.run.initial.parameters=NULL,
                        k = nrow(ac), # our data
                        loci = ncol(ss), # our data
                        ngen = 1e6,
                        printfreq = 1e2,
                        samplefreq = 1e3,
                        mixing.diagn.freq = 50,
                        savefreq = 1e5,
                        directory=NULL,
                        prefix = outprefix)

data.gzip.zip

gbradburd commented 8 years ago

The error is saying that the parametric covariance matrix can't be inverted. This has happened in a few runs I've done. Basically the geogenetic distance between a pair (or more) of your samples become too small, with very small nugget parameters as well, and the parametric covariance comes close to being degenerate. At that point, the calculation of the likelihood becomes numerically unstable, and the train kind of goes off the tracks (e.g., see how large the likelihood values are leading up to the MCMC breaking).

There are a few possible fixes: 1) Because this is a stochastic algorithm, it's possible that if you run the analysis again it won't happen again (fingers crossed!).

2) If the offending samples are, in fact, very genetically similar you could consider dropping on or the other, or lumping them into a single sample by summing their rows in the 'ac' matrix and the 'ss' matrix.

3) You could consider running with the "round.earth" option set to FALSE. Your samples are all from quite close to one another, so the curvature of the Earth probably isn't that big a deal, and it's possible estimating locations on a sphere is contributing to this issue.

Whatever option you choose, I'd recommend doing something to try to catch the output if the run fails (e.g., tryCatch, options(error=recover) if it's an interactive session, etc.) to help diagnose the problem.

petrelharp commented 8 years ago

Suggestion: when this happens, catch the error and save out the relevant information (current state, seed?).

On Wed, Jun 1, 2016 at 1:02 PM, gbradburd notifications@github.com wrote:

The error is saying that the parametric covariance matrix can't be inverted. This has happened in a few runs I've done. Basically the geogenetic distance between a pair (or more) of your samples become too small, with very small nugget parameters as well, and the parametric covariance comes close to being degenerate. At that point, the calculation of the likelihood becomes numerically unstable, and the train kind of goes off the tracks (e.g., see how large the likelihood values are leading up to the MCMC breaking).

There are a few possible fixes: 1) Because this is a stochastic algorithm, it's possible that if you run the analysis again it won't happen again (fingers crossed!).

2) If the offending samples are, in fact, very genetically similar you could consider dropping on or the other, or lumping them into a single sample by summing their rows in the 'ac' matrix and the 'ss' matrix.

3) You could consider running with the "round.earth" option set to FALSE. Your samples are all from quite close to one another, so the curvature of the Earth probably isn't that big a deal, and it's possible estimating locations on a sphere is contributing to this issue.

Whatever option you choose, I'd recommend doing something to try to catch the output if the run fails (e.g., tryCatch, options(error=recover) if it's an interactive session, etc.) to help diagnose the problem.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/gbradburd/SpaceMix/issues/10#issuecomment-223107723, or mute the thread https://github.com/notifications/unsubscribe/AA_26cIORzrSzWBRkEMo9gxCqLRsAVC7ks5qHeVGgaJpZM4Ir1AL .

travc commented 8 years ago

Thanks for the explanation and suggestions. Makes sense. Feel free to close unless you want to add in some more explicit error catching for the condition.