broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

Enquiry for pilon running virtual memory #53

Closed zx0223winner closed 7 years ago

zx0223winner commented 7 years ago

Hi Pilon supporting staff,

I have a problem running Pilon; the output is listed below. The genome size is 250 Mb, and I am not sure whether the failure is because the Java virtual machine does not have enough memory.

[xzha25@iqaluk xzha25]$ java -Xmx350G -jar /work/xzha25/pilon/pilon_2.11-1.21-one-jar.jar --genome /work/xzha25/CBW_uwo241_test/UWO241_gDNA4_19cells_3Hiseq_hybrid_Masurca_scaffolds.fasta --frags /scratch/xzha25/illumina_uwo241_hiseq_pairedend_to_hybrid_Masurca_scaffolds.sorted.bam --output /work/xzha25/CBW_uwo241_test/hybrid_Masurca_scaffolds-pilon
Pilon version 1.21 Fri Dec 9 16:44:44 2016 -0500
Genome: /work/xzha25/CBW_uwo241_test/UWO241_gDNA4_19cells_3Hiseq_hybrid_Masurca_scaffolds.fasta
Fixing snps, indels, gaps, local
Input genome size: 212383984
Scanning BAMs
/scratch/xzha25/illumina_uwo241_hiseq_pairedend_to_hybrid_Masurca_scaffolds.sorted.bam: 166147958 reads, 0 filtered, 158730221 mapped, 0 proper, 0 stray, Unpaired 100% 99+/-11, max 132
Processing scf7180000011305:1-98240
frags /scratch/xzha25/illumina_uwo241_hiseq_pairedend_to_hybrid_Masurca_scaffolds.sorted.bam: coverage 58
Total Reads: 72141, Coverage: 58, minDepth: 6
Confirmed 95044 of 98240 bases (96.75%)
Corrected 6 snps; 0 ambiguous bases; corrected 5 small insertions totaling 9 bases, 1 small deletions totaling 4 bases
Attempting to fix local continuity breaks
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.simontuffs.onejar.Boot.run(Boot.java:340)
    at com.simontuffs.onejar.Boot.main(Boot.java:166)
Caused by: java.lang.IllegalStateException: Inappropriate call if not paired read
    at htsjdk.samtools.SAMRecord.requireReadPaired(SAMRecord.java:648)
    at htsjdk.samtools.SAMRecord.getFirstOfPairFlag(SAMRecord.java:706)
    at org.broadinstitute.pilon.BamFile$MateMap.addRead(BamFile.scala:224)
    at org.broadinstitute.pilon.BamFile$MateMap$$anonfun$addReads$1.apply(BamFile.scala:220)
    at org.broadinstitute.pilon.BamFile$MateMap$$anonfun$addReads$1.apply(BamFile.scala:220)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.broadinstitute.pilon.BamFile$MateMap.addReads(BamFile.scala:220)
    at org.broadinstitute.pilon.BamFile$MateMap.<init>(BamFile.scala:218)
    at org.broadinstitute.pilon.BamFile.recruitFlankReads(BamFile.scala:339)
    at org.broadinstitute.pilon.GapFiller$$anonfun$recruitReadsOfType$1.apply(GapFiller.scala:367)
    at org.broadinstitute.pilon.GapFiller$$anonfun$recruitReadsOfType$1.apply(GapFiller.scala:366)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.broadinstitute.pilon.GapFiller.recruitReadsOfType(GapFiller.scala:366)
    at org.broadinstitute.pilon.GapFiller.recruitFrags(GapFiller.scala:375)
    at org.broadinstitute.pilon.GapFiller.recruitLocalReads(GapFiller.scala:389)
    at org.broadinstitute.pilon.GapFiller.recruitReads(GapFiller.scala:391)
    at org.broadinstitute.pilon.GapFiller.assembleAcrossBreak(GapFiller.scala:51)
    at org.broadinstitute.pilon.GapFiller.fixBreak(GapFiller.scala:45)
    at org.broadinstitute.pilon.GenomeRegion$$anonfun$identifyAndFixIssues$4.apply(GenomeRegion.scala:383)
    at org.broadinstitute.pilon.GenomeRegion$$anonfun$identifyAndFixIssues$4.apply(GenomeRegion.scala:381)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.broadinstitute.pilon.GenomeRegion.identifyAndFixIssues(GenomeRegion.scala:381)
    at org.broadinstitute.pilon.GenomeFile$$anonfun$processRegions$4.apply(GenomeFile.scala:120)
    at org.broadinstitute.pilon.GenomeFile$$anonfun$processRegions$4.apply(GenomeFile.scala:109)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.parallel.ParIterableLike$Foreach.leaf(ParIterableLike.scala:972)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:49)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:51)
    at scala.collection.parallel.ParIterableLike$Foreach.tryLeaf(ParIterableLike.scala:969)
    at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:152)
    at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:443)
    at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

zx0223winner commented 7 years ago

Moving on: after checking the other tickets, I found that it works with the --unpaired option, but it gives a report like the one below. I am not sure whether the contigs were really polished.

[Screenshot of the Pilon report, 2017-07-11 at 10:57:44 AM]

w1bw commented 7 years ago

Pilon could certainly do a better job of crash reporting, but as you've guessed, the original issue was:

"Caused by: java.lang.IllegalStateException: Inappropriate call if not paired read"

There were unpaired reads in the input BAM. If these are short reads, Pilon is going to have a difficult time doing local reassemblies, because there aren't reads to reach across potential misassembly areas or to fill gaps.
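The scan line in the log above already hints at this ("0 proper", "Unpaired 100%"). For an independent check, one option is to count the alignments that carry the SAM "read paired" flag (0x1). The following is a minimal sketch, assuming samtools is installed and on the PATH; the function names and the file name in the usage comment are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: assumes samtools is installed. Function names are illustrative.

# Count alignments whose "read paired" flag (0x1) is set.
count_paired_reads() {
  samtools view -c -f 1 "$1"
}

# Count all alignments in the BAM, for comparison.
count_all_reads() {
  samtools view -c "$1"
}

# Usage (placeholder file name):
#   count_paired_reads reads.sorted.bam
#   count_all_reads    reads.sorted.bam
```

If the first count is zero while the second is not, the BAM holds only unpaired reads, which matches the exception in the stack trace.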

Looking at the output snippet above, it looks as if it's trying to do local reassemblies every Kbp or so, which is a lot. If you mostly care about fixing up bases and small indels, you might try "--fix bases"; it will skip the local reassemblies and run much faster (and with less memory).
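Putting the two suggestions from this thread together, a run might look like the sketch below. It passes the BAM with --unpaired instead of --frags and adds --fix bases. All paths and the heap size are hypothetical placeholders rather than values from the original report, and the script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, substitute your own.
PILON_JAR="${PILON_JAR:-pilon.jar}"
GENOME="${GENOME:-scaffolds.fasta}"
BAM="${BAM:-reads.sorted.bam}"
OUT="${OUT:-scaffolds-pilon}"
JAVA_HEAP="${JAVA_HEAP:-16G}"   # assumed heap size, tune to your machine

# Build a Pilon command that treats the BAM as unpaired reads and restricts
# fixes to bases. Prints the command by default; runs it when DRY_RUN=0.
run_pilon() {
  set -- java "-Xmx${JAVA_HEAP}" -jar "$PILON_JAR" \
    --genome "$GENOME" \
    --unpaired "$BAM" \
    --fix bases \
    --output "$OUT"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_pilon
```

Printing first makes it easy to check the paths before committing to a long run; set DRY_RUN=0 once the command looks right.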