Error for PRICE - Githubissues

charlin90 commented 5 years ago

Hello, I am using the PRICE software. It is a really wonderful tool: I have used it to predict ORFs in several samples and it works well. Unfortunately, I have now run into a problem when analyzing one sample. I don't know what the error is, and I really need your help. The error messages are as follows:

2019-02-20 09:42:36.495 INFO Command: gedi -e Price -reads shNC_ribo_rep2.sort.bam -genomic /media/nbfs/nbCloud/public/nbcplatform/genome/index/PRICE/9606/GRCh38_Ensembl91/PRICE/homo_ens91.oml -prefix /media/nbfs/UntitledFolder/2018/20181225/159.226.118.232/upload/2018_projects/circshMY_riboseq/Riboseq/tmp/shNC -progress -D
2019-02-20 09:42:36.656 INFO Discovering classes in classpath
2019-02-20 09:42:36.827 INFO Preparing simple class references
2019-02-20 09:42:36.946 INFO Gedi 1.0.2 (JAR) startup
2019-02-20 09:42:37.117 INFO Reading oml /media/nbfs/nbCloud/public/nbcplatform/genome/index/PRICE/9606/GRCh38_Ensembl91/PRICE/homo_ens91.oml
2019-02-20 09:42:37.122 INFO Done reading oml /media/nbfs/nbCloud/public/nbcplatform/genome/index/PRICE/9606/GRCh38_Ensembl91/PRICE/homo_ens91.oml
2019-02-20 09:42:38.118 INFO Estimating maxpos...
2019-02-20 09:42:38.124 INFO Clustering reads
Processed 230281 elements in 5m 17s 469ms (Throughput: 725.4/sec)
Processed 230650 elements in 6m 7s 306ms (Throughput: 628.0/sec)
2019-02-20 09:48:45.580 INFO Using maxpos=10 (Merged)
2019-02-20 09:48:45.581 INFO Estimate parameters
Processed 230650 elements in 6m 11s 385ms (Throughput: 621.1/sec)
| Model inference: Repeat 1672, Iteration 141: LL=-430153; bestLL=-429788 5066150 (1141539.0/sec)
2019-02-20 09:48:50.059 INFO Found 67919 clusters without looking at annotation
Processed 67919 elements in 962ms (Throughput: 70601.9/sec)
2019-02-20 09:48:51.077 INFO Found 17751 clusters after looking at annotation
| Model inference: Repeat 2464, Iteration 3: LL=-442025; bestLL=-429780 6563520 (1163332.3/sec)
2019-02-20 09:48:51.238 INFO Found 6837 clusters after filtering with read/region count
Processed 6648856 elements in 6s 43ms (Throughput: 1100257.5/sec)
2019-02-20 09:48:51.725 INFO LL=-429757
2019-02-20 09:48:51.738 INFO Codon inference
Processed 6837 elements in 2m 28s 785ms (Throughput: 46.0/sec)
2019-02-20 09:51:20.625 INFO Train start prediction
2019-02-20 09:51:20.626 INFO Writing codon index
2019-02-20 09:51:20.626 INFO Calibrate noise model

| Start prediction training 17+:39200290-39204648 300 (329.7/sec)
Processed 408 elements in 1s 4ms (Throughput: 406.4/sec)
Processed 6812 elements in 1s 4ms (Throughput: 6784.9/sec)
| Viewer indices X+:11121643-11121646 88990 (88283.7/sec)
[GediProgram-1-thread-9] INFO smile.math.Math - L-BFGS: initial function value: 1696.8
[GediProgram-1-thread-9] INFO smile.math.Math - L-BFGS: the function value after 10 iterations: 1089.9
[GediProgram-1-thread-9] INFO smile.math.Math - L-BFGS: initial function value: 1696.8
[GediProgram-1-thread-9] INFO smile.math.Math - L-BFGS: the function value after 10 iterations: 1043.7
[GediProgram-1-thread-9] INFO smile.math.Math - L-BFGS: the function value after 13 iterations: 1043.6
2019-02-20 09:51:21.689 INFO Infer ORFs

Viewer indices GL000220.1-:156133-156136 91089 (69427.6/sec)
PRICE PRICE is an analysis method for Ribo-seq data.

Error: java.lang.RuntimeException: Could not run gedi.riboseq.javapipeline.PriceOrfInference@66307bb4
Could not run gedi.riboseq.javapipeline.PriceOrfInference@66307bb4
Exception in parallel iterator thread!
0

gedi -e PRICE

General:
 -reads    The mapped reads from the ribo-seq experiment.
 -prefix   The prefix used for all output files
 -genomic  The indexed GEDI genome.

Commandline:
 -progress  Show progress
 -D         Verbose output of errors
 -h         Show usage
 -hh        Show verbose usage
 -hhh       Show extra verbose usage
 -dry       Dry run
 -keep      Do not remove temp files

java.lang.RuntimeException: java.lang.RuntimeException: Could not run gedi.riboseq.javapipeline.PriceOrfInference@66307bb4
    at gedi.util.job.schedule.DefaultPetriNetScheduler.run(DefaultPetriNetScheduler.java:223)
    at gedi.util.program.GediProgram$1.execute(GediProgram.java:359)
    at gedi.util.program.GediProgram.run(GediProgram.java:204)
    at executables.Price.main(Price.java:69)
Caused by: java.lang.RuntimeException: Could not run gedi.riboseq.javapipeline.PriceOrfInference@66307bb4
    at gedi.util.program.GediProgramJob.execute(GediProgramJob.java:63)
    at gedi.util.program.GediProgramJob.execute(GediProgramJob.java:27)
    at gedi.util.job.FireTransition.call(FireTransition.java:54)
    at gedi.util.job.FireTransition.call(FireTransition.java:25)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Exception in parallel iterator thread!
    at gedi.util.functions.ParallelizedIterator.fatal(ParallelizedIterator.java:229)
    at gedi.util.functions.ParallelizedIterator.tryNext(ParallelizedIterator.java:197)
    at gedi.util.functions.ParallelizedIterator.hasNext(ParallelizedIterator.java:158)
    at java.util.Iterator.forEachRemaining(Iterator.java:115)
    at gedi.riboseq.utils.RiboUtils.lambda$processCodonsSink$4(RiboUtils.java:107)
    at gedi.riboseq.utils.RiboUtils.processCodons(RiboUtils.java:147)
    at gedi.riboseq.utils.RiboUtils.processCodonsSink(RiboUtils.java:107)
    at gedi.riboseq.javapipeline.PriceOrfInference.execute(PriceOrfInference.java:118)
    at gedi.util.program.GediProgramJob.execute(GediProgramJob.java:61)
    ... 7 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
    at jdistlib.math.spline.SmoothSpline.findDuplicateIndices(SmoothSpline.java:1508)
    at jdistlib.math.spline.SmoothSpline.fit(SmoothSpline.java:1392)
    at jdistlib.math.spline.SmoothSpline.fitDFMatch(SmoothSpline.java:1296)
    at gedi.riboseq.inference.orf.NoiseModel.computeMeanSpline(NoiseModel.java:188)
    at gedi.riboseq.inference.orf.NoiseModel.getProbability(NoiseModel.java:141)
    at gedi.riboseq.inference.orf.NoiseModel.getProbability0(NoiseModel.java:121)
    at gedi.riboseq.inference.orf.OrfInference.inFrameTestCandidate(OrfInference.java:1271)
    at gedi.riboseq.inference.orf.OrfInference.inFrameTest(OrfInference.java:1079)
    at gedi.riboseq.inference.orf.OrfInference.inferOrfs(OrfInference.java:433)
    at gedi.riboseq.javapipeline.PriceOrfInference.lambda$null$1(PriceOrfInference.java:121)
    at gedi.util.FunctorUtils$DemultiplexIterator.lookAhead(FunctorUtils.java:106)
    at gedi.util.FunctorUtils$DemultiplexIterator.hasNext(FunctorUtils.java:90)
    at gedi.util.FunctorUtils$PeekIterator.hasNext(FunctorUtils.java:638)
    at gedi.util.FunctorUtils$SideEffectIterator.hasNext(FunctorUtils.java:719)
    at gedi.util.functions.ExtendedIterator.toCollection(ExtendedIterator.java:400)
    at gedi.util.functions.ParallelizedIterator$Worker.run(ParallelizedIterator.java:341)

I look forward to your reply!

Best, Charlin

florianerhard commented 5 years ago

Dear Charlin, first, your interest is much appreciated.

In this dataset there are extremely few reads, am I right? Price does not work that well with extremely sparse data at the moment. Here the problem is that it tries to fit its noise model (i.e. to get an idea of the number of reads mapped to off-frame codons for ORF inference). It does not succeed because there is too little data available.
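
The innermost exception in the posted trace, `ArrayIndexOutOfBoundsException: 0` in `SmoothSpline.findDuplicateIndices`, means an array of length zero was indexed at position 0, which fits a spline fit being attempted on no data points. A minimal Python sketch of that failure mode follows. Both functions are hypothetical stand-ins written for illustration; they are not the jdistlib or GEDI code, and `IndexError` here plays the role of the Java exception.

```python
def find_duplicate_indices(sorted_xs):
    # Hypothetical helper, NOT the jdistlib implementation.
    # Maps every element of a sorted list to the index of its unique value.
    previous = sorted_xs[0]  # raises IndexError when there are no data points
    indices = [0]
    for x in sorted_xs[1:]:
        if x != previous:
            indices.append(indices[-1] + 1)
            previous = x
        else:
            indices.append(indices[-1])
    return indices


def fit_or_none(sorted_xs):
    # Guarded variant (also hypothetical): report "no fit possible"
    # instead of crashing when the input is empty.
    if not sorted_xs:
        return None
    return find_duplicate_indices(sorted_xs)
```

With a non-empty input the helper returns one index per element; with an empty input the unguarded version fails on its first line, while the guarded version returns `None`.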

Is this the only sample from this experiment? If not, run them all at the same time (howto), which is the recommended mode to run Price!

Best, Florian

charlin90 commented 5 years ago

Hi Florian, thanks for your reply. I am confused by the "extremely few reads" you mentioned. In fact, my BAM file is 1.36 GB. In our previous datasets, some BAM files were even smaller than 1 GB, and PRICE still worked well.

Moreover, I have two replicate samples, and I have run them at the same time before, but PRICE still gave the same errors.

Best, Charlin
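
BAM file size may not reflect how much data actually reaches the noise model, so the cluster counts in the log posted above are one rough way to see what survives each filtering step. The sketch below pulls those counts out of the three `Found ... clusters` lines copied from that log. The function name `cluster_counts` and the regular expression are illustrative assumptions, not part of PRICE or GEDI, and the log alone does not establish whether these counts are low for this pipeline.

```python
import re

# The three cluster-count lines, copied from the log in the first comment.
LOG_LINES = [
    "2019-02-20 09:48:50.059 INFO Found 67919 clusters without looking at annotation",
    "2019-02-20 09:48:51.077 INFO Found 17751 clusters after looking at annotation",
    "2019-02-20 09:48:51.238 INFO Found 6837 clusters after filtering with read/region count",
]

# Assumed pattern, derived only from the lines above.
CLUSTER_RE = re.compile(r"INFO Found (\d+) clusters (.+)$")


def cluster_counts(lines):
    """Return (stage description, cluster count) pairs in log order."""
    counts = []
    for line in lines:
        match = CLUSTER_RE.search(line)
        if match:
            counts.append((match.group(2), int(match.group(1))))
    return counts


for stage, count in cluster_counts(LOG_LINES):
    print(f"{count:>6}  {stage}")
```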

erhard-lab / price

Error for PRICE #10