Closed carden24 closed 9 years ago
The model parameters are adjusted depending on the coverage, and -L 25 is the least reliable parameter set (although in some cases it is the only option). Can I have a copy of the .npo file? I would like to take a look at the curve.
Thanks! Miguel.
Here is one of the files: https://www.dropbox.com/s/52dtnanhi1tipc2/A8-OM0C0-O3.npo?dl=0
I believe the problem is insufficient sampling at lower sequencing effort. It can be solved by re-running nonpareil with "-d 0.7". This parameter turns on "logarithmic sampling", which subsamples uniformly in logarithmic space (as opposed to linear space, the default). The documentation indicates that this is experimental code, but I've tested it on a large array of datasets and I'm confident it's a stable feature. This subsampling should be preferred, and I intend to make it the default in the next release. Please let me know if this solves the convergence failure.
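To illustrate why logarithmic sampling helps at low sequencing effort, here is a minimal Python sketch comparing where the subsample points fall under the two schemes. It is not Nonpareil's code: the function names, the 0.01 linear step, the 0.001 lower cutoff, and the reading of "0.7" as a factor applied repeatedly to the previous fraction are all assumptions made for illustration.

```python
def linear_fractions(step=0.01):
    """Evenly spaced fractions of the dataset (hypothetical step size)."""
    n = round(1 / step)
    return [round(i * step, 6) for i in range(1, n + 1)]


def log_fractions(factor=0.7, min_fraction=1e-3):
    """Fractions evenly spaced in log space.

    Assumption: each fraction is `factor` times the previous one,
    starting from the full dataset (1.0) and stopping at `min_fraction`.
    """
    fractions = []
    f = 1.0
    while f >= min_fraction:
        fractions.append(f)
        f *= factor
    return sorted(fractions)


def count_below(fractions, threshold=0.1):
    """Number of sampling points strictly below `threshold`."""
    return sum(1 for f in fractions if f < threshold)


linear = linear_fractions()
logarithmic = log_fractions()

print(f"linear: {count_below(linear)} of {len(linear)} points below 10%")
print(f"log:    {count_below(logarithmic)} of {len(logarithmic)} points below 10%")
```

Under these assumptions, the log-spaced scheme places most of its points below 10% of the reads, while the linear scheme places only a small share there, which is the region where the curve was undersampled.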
Running the sampling in logarithmic mode solved the convergence issue. Here is a comparison of the results. Performance (resource usage) was similar and the results are comparable (I do not expect exactly the same results because of the sampling). Thanks,
Erick
Sampling method | Kappa | Coverage | LRstar | LR | ModelR | Diversity |
---|---|---|---|---|---|---|
Log sampling | 0.37363 | 0.4352244 | 1.13E+11 | 8470238989 | 0.9994858 | 23.00952 |
Linear sampling | 0.38015 | 0.4416335 | 0 | 8470238989 | 0 | 0 |
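As a quick check of how close the two runs are, the following Python sketch computes the relative difference for the two columns that both runs report (Kappa and Coverage). The values are copied from the table above; the helper name and the choice of normalizing by the larger value are illustrative assumptions.

```python
# Values copied from the comparison table; the other columns are zero
# for linear sampling because the model did not converge.
log_run = {"kappa": 0.37363, "coverage": 0.4352244}
linear_run = {"kappa": 0.38015, "coverage": 0.4416335}


def relative_difference(a, b):
    """Absolute difference relative to the larger of the two values."""
    return abs(a - b) / max(abs(a), abs(b))


diffs = {
    name: relative_difference(log_run[name], linear_run[name])
    for name in log_run
}

for name, value in diffs.items():
    print(f"{name}: {value:.2%}")
```

Both differences come out under 2%, consistent with the variation expected from random subsampling.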
I have a similar problem: even when I use -d 0.7 I can't generate the curves. I also tried running it with a larger query size, but that only worked for some of the samples (I have 5 samples in total). For the samples with low coverage I get this message in R: "Median of the curve is zero at 20% of the reads, check parameters and re-run (e.g., decrease value of -L in nonpareil)." I tried running one of the samples in your online version of Nonpareil, and here is the result: http://enve-omics.ce.gatech.edu/nonpareil/results?jid=546c987b1372f Any ideas? I could also send you my .npo files.
Thank you, Kristjan
@koopkaup your dataset seems to be too small for Nonpareil to accurately project the coverage. The coverage is still estimated (14.65% in the link above) and you can still visualize the subsampled curve (in the link above select "Plot curve only", or in the R interface set "plotModel=F"). However, diversity and projected sequencing effort are both unavailable without an accurate model.
Solved in v2.400
I have problems making the Nonpareil curves with R. I run MPI nonpareil as usual, except that I use the -L 25 option because I expect poor coverage. I also know from pyrotag data that I have closely related species in my samples. There were no errors when running nonpareil, and I get the "everything seems correct" result. However, when I try to get the curves I get the following warning messages:
Warning messages: ...Convergence failure: false convergence (8) ...Model didn't converge
My questions are: - Is my coverage result still valid? - Do I need to rerun nonpareil with different parameters, or does the modelling of the curves only affect the estimate of the sequencing effort needed?
Thanks, Erick