hartleys / QoRTs

Quality of RNA-Seq Toolset

Error info: Exception in thread "main" java.util.NoSuchElementException #83

Open royfrancis opened 4 years ago

royfrancis commented 4 years ago

Hi, I get the following error with QoRTs v1.3.6.

Runtime std out

```
Starting QoRTs v1.3.6 (Compiled Tue Sep 25 11:21:46 EDT 2018)
Starting time: (Sat Sep 19 11:19:03 CEST 2020)
INPUT_COMMAND(QC)
INPUT_ARG(infile)=SRR3222409-19.bam
INPUT_ARG(gtffile)=../reference/Mus_musculus.GRCm38.99-19.gtf
INPUT_ARG(outdir)=./SRR3222409-19-qorts
INPUT_ARG(isRNASeq)=true
INPUT_ARG(generatePlots)=true
INPUT_ARG(generateMultiPlot)=true
INPUT_ARG(stranded)=true
INPUT_ARG(noGzipOutput)=true
INPUT_ARG(verbose)=true
INPUT_ARG(maxReadLength)=Some(101)
INPUT_ARG(outfilePrefix)=SRR3222409-19
INPUT_ARG(genomeFA)=Some(List(../reference/Mus_musculus.GRCm38.dna.chromosome.19.fa))
Created Log File: ./SRR3222409-19-qorts/SRR3222409-19QC.09TlhNf3W1sK.log
Warning: run-in-progress file "./SRR3222409-19-qorts/SRR3222409-19QC.QORTS_RUNNING" already exists. Is there another QoRTs job running?
Starting QC
[Time: 2020-09-19 11:19:03] [Mem usage: [85MB / 2058MB]] [Elapsed Time: 00:00:00.0000]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
Parameter --genomeFA found. Adding reference mismatch testing.
NOTE: Function "overlapMatch" requires function "mismatchEngine". Adding "mismatchEngine" to the active function list...
Running functions: CigarOpDistribution, GCDistribution, GeneCalcs, InsertSize, JunctionCalcs, NVC, QualityScoreDistribution, StrandCheck, chromCounts, cigarLocusCounts, mismatchEngine, overlapMatch, readLengthDistro, referenceMatch, writeBiotypeCounts, writeClippedNVC, writeDESeq, writeDEXSeq, writeGeneBody, writeGeneCounts, writeGenewiseGeneBody, writeJunctionSeqCounts, writeKnownSplices, writeNovelSplices, writeSpliceExon
Checking first 10000 reads. Checking SAM file for formatting errors...
Stats on the first 10000 reads:
Num Reads Primary Map: 7645
Num Reads Paired-ended: 10000
Num Reads mapped pair: 7557
Num Pair names found: 3837
Num Pairs matched: 3626
Read Seq length: 91 to 101
Unclipped Read length: 91 to 101
Final maxReadLength: 101
maxPhredScore: 40
minPhredScore: 2
NOTE: Read length is not consistent. In the first 10000 reads, read length varies from 91 to 101 (param maxReadLength=101)
Note that using data that is hard-clipped prior to alignment is NOT recommended, because this makes it difficult (or impossible) to determine the sequencer read-cycle of each nucleotide base. This may obfuscate cycle-specific artifacts, trends, or errors, the detection of which is one of the primary purposes of QoRTs!
In addition, hard clipping (whether before or after alignment) removes quality score data, and thus quality score metrics may be misleadingly optimistic.
A MUCH preferable method of removing undesired sequence is to replace such sequence with N's, which preserves the quality score and the sequencer cycle information.
Note: Data appears to be paired-ended.
Sorting Note: Reads are not sorted by name (This is OK).
Sorting Note: Reads are sorted by position (This is OK).
Done checking first 10000 reads. WARNINGS FOUND!
Starting getSRPairIterResorted...
SAMRecord Reader Generated. Read length: 101.
[Time: 2020-09-19 11:19:10] [Mem usage: [301MB / 2595MB]] [Elapsed Time: 00:00:06.0886]
> Init GeneCalcs Utility
> Init InsertSize Utility
Compiling flat feature annotation, internally in memory...
FlatteningGtf: starting...(2020-09-19 11:19:17)
FlatteningGtf: gtf file read complete.(2020-09-19 11:19:21)
FlatteningGtf: Splice Junction Map read.(2020-09-19 11:19:22)
FlatteningGtf: gene Sets generated.(2020-09-19 11:19:23)
FlatteningGtf: Aggregate Sets built.
FlatteningGtf: Compiling Aggregate Info . . . (2020-09-19 11:19:23)
FlatteningGtf: Finished Compiling Aggregate Info. (2020-09-19 11:19:23)
FlatteningGtf: Iterating through the step-vector...(2020-09-19 11:19:23)
FlatteningGtf: Adding the aggregate genes themselves...(2020-09-19 11:19:24)
FlatteningGtf: Iterating through the splice junctions...(2020-09-19 11:19:25)
FlatteningGtf: Sorting the aggregate genes...(2020-09-19 11:19:26)
FlatteningGtf: Folding the FlatGtfLine iterator...(2020-09-19 11:19:26)
FlatteningGtf: Features Built.(2020-09-19 11:19:26)
Internal flat feature annotation compiled!
> Init NVC utility
> Init CigarOpDistribution Utility
> Init QualityScoreDistribution Utility
> Init GC counts Utility
> Init JunctionCalcs utility
length of knownSpliceMap after instantiation: 9244
length of knownCountMap after instantiation: 9244
> Init StrandCheck Utility
> Init chromCount Utility
> Init qcCigarLocusCounts Utility
> Init OverlapMatch Utility
> Init MinorUtils Utility
QC Utilities Generated!
[Time: 2020-09-19 11:19:31] [Mem usage: [1368MB / 3670MB]] [Elapsed Time: 00:00:27.0629]
============================FATAL_ERROR============================
QoRTs encountered a FATAL ERROR.
For general help, use command: java -jar path/to/jar/QoRTs.jar --man
============================FATAL_ERROR============================
Error info:
Exception in thread "main" java.util.NoSuchElementException: SRR3222409.9665352
    at scala.collection.mutable.AnyRefMap$ExceptionDefault.apply(AnyRefMap.scala:425)
    at scala.collection.mutable.AnyRefMap$ExceptionDefault.apply(AnyRefMap.scala:424)
    at scala.collection.mutable.AnyRefMap.apply(AnyRefMap.scala:180)
    at internalUtils.commonSeqUtils$$anon$5.next(commonSeqUtils.scala:1106)
    at internalUtils.commonSeqUtils$$anon$5.next(commonSeqUtils.scala:1036)
    at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.next(stdUtils.scala:969)
    at scala.collection.Iterator.foreach(Iterator.scala:929)
    at scala.collection.Iterator.foreach$(Iterator.scala:929)
    at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.foreach(stdUtils.scala:963)
    at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1300)
    at qcUtils.runAllQC$.run(runAllQC.scala:970)
    at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:680)
    at runner.runner$.main(runner.scala:98)
    at runner.runner.main(runner.scala)
```

And this is the code that I run:

```
# Derive the output prefix from the BAM path given as the first argument:
# strip the directory part, then the ".bam" extension.
prefix="${1##*/}"
prefix="${prefix/.bam/}"

QoRTs QC \
--RNA \
--generatePlots \
--generateMultiPlot \
--stranded \
--noGzipOutput \
--verbose \
--maxReadLength 101 \
--outfilePrefix "${prefix}" \
--genomeFA ../reference/Mus_musculus.GRCm38.dna.chromosome.19.fa \
"$1" \
../reference/Mus_musculus.GRCm38.99-19.gtf \
"./${prefix}-qorts"
```

I am attaching the BAM file, the GTF, and the genome FASTA here for testing purposes.

SRR3222409-19.zip Mus_musculus.GRCm38.99-19.gtf.gz Mus_musculus.GRCm38.dna.chromosome.19.fa.gz

The BAM was aligned using HISAT2. I am not sure whether the problem lies in the BAM file itself. I have tested the same BAM file with Qualimap 2.2.1 and it seems to work.
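
One thing that could be checked, offered only as a hypothesis: the stack trace shows a map lookup failing for the read name SRR3222409.9665352, and the stats above report fewer matched pairs (3626) than pair names found (3837). If the BAM contains paired reads whose mate record is absent, a name-based mate lookup might fail in this way. The sketch below is not part of QoRTs; `find_unpaired_names` is a hypothetical helper that reads SAM-formatted text on stdin and lists paired read names that do not have exactly two primary records. It assumes the standard SAM flag bits (0x1 paired, 0x100 secondary, 0x800 supplementary) and does not confirm the actual cause of the error.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QoRTs): reads SAM records on stdin and
# prints, for every paired read name that does not have exactly two
# primary records, the name and the number of records found (tab-separated).
find_unpaired_names() {
  awk -F '\t' '
    /^@/ { next }                          # skip SAM header lines
    {
      flag = $2
      if (flag % 2 == 0) next              # 0x1 unset: read is not paired
      if (int(flag / 256) % 2 == 1) next   # 0x100 set: secondary alignment
      if (int(flag / 2048) % 2 == 1) next  # 0x800 set: supplementary alignment
      count[$1]++
    }
    END {
      for (name in count)
        if (count[name] != 2) print name "\t" count[name]
    }
  ' | sort
}
```

With samtools installed, it could be used as `samtools view SRR3222409-19.bam | find_unpaired_names | head`. If the read named in the exception appears in that list, the hypothesis of a missing mate would at least be consistent with the error.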