hartleys / QoRTs

Quality of RNA-Seq Toolset

FATAL ERROR Error info: Exception in thread "main" java.lang.NullPointerException #81

Open IlariaPalmi opened 4 years ago

IlariaPalmi commented 4 years ago

Dear Doctor, I am running the code on my BAM files and I get a FATAL ERROR message. Here is my code:

for i in $(ls $RDS/projects/seqdataset/ephemeral/RNASEQ/RNAseqROX/140725_0281_IP/140725_SN674_0281_AC47BHACXX/Unaligned/Project_IP)
do
    java -Xmx4G -jar $RDS/home/anaconda3/pkgs/qorts-1.3.6-0/share/qorts-1.3.6-0/QoRTs.jar QC \
    --generatePlots \
    --addFunctions mismatchEngine,annotatedSpliceExonCounts,FPKM,writeGeneBodyIv,fastqUtils,writeDocs,makeJunctionBed,makeWiggles,makeAllBrowserTracks,calcDetailedGeneCounts \
    --verbose \
    --stranded \
    --readGroup ls $RDS/projects/seqdataset/ephemeral/RNASEQ/RNAseqROX/140725_0281_IP/140725_SN674_0281_AC47BHACXX/Unaligned/Project_IP/$i/*_R1_*.fastq.gz | sort | awk '{gsub(/.*\//,""); gsub(/.fastq.gz/,""); gsub(/R1/,""); printf "ID:" $i " , "}' | head -c -3 \
    --outfilePrefix $i \
    --chromSizes $RDS/projects/sequ/live/Genecode/mouseindex/chrNameLength.txt \
    --rawfastq ls $RDS/projects/seqdataset/ephemeral/RNASEQ/RNAseqROX/140725_0281_IP/140725_SN674_0281_AC47BHACXX/Unaligned/Project_IP/$i/*.fastq.gz | sort | tr "\n" "," \
    --genomeFA $RDS/projects/sequ/live/Genecode/mouseannotation/GRCm38.primaryassembly.genome.fa \
    $RDS/projects/sequ/live/mapped/Genecodesecondpassfilter/ROX$i*.sortedByCoord.out.bam \
    $RDS/projects/sequ/live/Genecode/mouseannotation/gencode.vM25.primary_assembly.annotation.gtf \
    $RDS/projects/sequ/live/mapped/QoRT_QC/$i
done

and the error message is:

QoRTs encountered a FATAL ERROR. For general help, use command:
java -jar path/to/jar/QoRTs.jar --man
============================FATAL_ERROR============================
Error info:
Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:SNA2_TGACCA_L008__001; File /rds/general/user/ipalmisa/projects/sequ/live/mapped/Genecodesecondpassfilter/ROX_Sample_SNA2Aligned.sortedByCoord.out.bam; Line number 69
    at net.sf.samtools.SAMTextHeaderCodec.reportErrorParsingLine(SAMTextHeaderCodec.java:234)
    at net.sf.samtools.SAMTextHeaderCodec.access$200(SAMTextHeaderCodec.java:40)
    at net.sf.samtools.SAMTextHeaderCodec$ParsedHeaderLine.requireTag(SAMTextHeaderCodec.java:316)
    at net.sf.samtools.SAMTextHeaderCodec.parseRGLine(SAMTextHeaderCodec.java:164)
    at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:97)
    at net.sf.samtools.BAMFileReader.readHeader(BAMFileReader.java:492)
    at net.sf.samtools.BAMFileReader.<init>(BAMFileReader.java:162)
    at net.sf.samtools.BAMFileReader.<init>(BAMFileReader.java:121)
    at net.sf.samtools.SAMFileReader.init(SAMFileReader.java:647)
    at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:184)
    at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:139)
    at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1023)
    at qcUtils.runAllQC$.run(runAllQC.scala:970)
    at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:680)
    at runner.runner$.main(runner.scala:98)
    at runner.runner.main(runner.scala)

Can you please help me with this? Thanks. BW, Ilaria

hartleys commented 4 years ago

So a few things:

It looks like you have a big complicated expression to get the RG entry? Are you sure that thing is working properly? Have you tried running it separately? Have you tried running it with the RG stated explicitly? Also your expressions for RG and fastq don't appear to be inside a $(), so wouldn't they just be interpreted as text? I'm not sure how this is running at all? What shell are you using?
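To illustrate the point about `$()`, here is a minimal sketch of the difference between writing a command as plain words and wrapping it in command substitution. The helper `count_args` and the two file names are made up for the illustration; they are not part of QoRTs or of the original script.

```shell
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Without command substitution the words are passed through literally:
# "ls" and "/tmp" arrive as two plain strings and nothing is executed.
literal_count=$(count_args ls /tmp)

# With $(...), the inner pipeline runs first and its output replaces the expression.
# The file names are placeholders, not the real FASTQ files.
fastq_list=$(printf '%s\n' b_R1.fastq.gz a_R1.fastq.gz | sort | paste -sd, -)
subst_count=$(count_args "$fastq_list")

echo "literal: $literal_count args; substituted: $subst_count arg ($fastq_list)"
```

Running the substituted expression on its own and echoing the result, as above, is also a quick way to check what value would actually reach an option such as --readGroup or --rawfastq.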

Also, the error itself seems to be that the RG field doesn't have an SM tag, so it is trying to extract the sample name and failing, I guess. It looks like this is a restriction built into the SAM-parsing library (net.sf.samtools in the stack trace), so I can't fix it. The problem might be solved by adding an SM tag to each RG entry. See the SAM spec: https://samtools.github.io/hts-specs/SAMv1.pdf (which it looks like you're already pretty familiar with).
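One possible way to add the missing SM tag is to edit the header text and write it back to the BAM. The sketch below only covers the text-editing step: `add_sm_tag` is a hypothetical helper, and the sample name `SNA2` is an assumption taken from the BAM file name in the error message. The samtools commands in the trailing comments assume samtools is installed and use placeholder file names.

```shell
# add_sm_tag is a hypothetical helper: it reads SAM header text on stdin and
# appends an SM tag to every @RG line that does not already have one.
add_sm_tag() {
    local sample="$1"
    awk -v sm="$sample" 'BEGIN { FS = OFS = "\t" }
        /^@RG/ && $0 !~ /\tSM:/ { $0 = $0 OFS "SM:" sm }
        { print }'
}

# Small in-memory header resembling the one in the error message.
header=$(printf '@HD\tVN:1.4\tSO:coordinate\n@RG\tID:SNA2_TGACCA_L008__001\n')
fixed=$(printf '%s\n' "$header" | add_sm_tag SNA2)
printf '%s\n' "$fixed"

# Applying it to a real file might look like this (in.bam and out.bam are placeholders):
#   samtools view -H in.bam | add_sm_tag SNA2 > header.sam
#   samtools reheader header.sam in.bam > out.bam
```

Lines that already carry an SM tag, and header lines other than @RG, pass through unchanged, so the helper can be run on a header more than once without duplicating tags.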