Closed bbimber closed 8 years ago
This problem is caused by absence of VTranscript feature in IMGT data (namely there is no 5'UTR and Leader sequence). We've just added human readable message for this case (this will be released in 1.8).
Try to add the following option:
-OvParameters.geneFeatureToAlign=VRegion
to override VTranscript
feature used by rna-seq
set of parameters, and let me know if it fixed the problem.
Thanks for reporting this case! I will add some additional code to automatically fall back to VRegion in such cases.
Connected to #76
Thanks for the quick reply. Just one question: why did I only see this w/ the local IMGT DB? What is different about selecting -s hs, but not supplying 'local'?
The gene library imported from IMGT is activated only if you supply --library local
option.
-s hs
is the default value for this option, so actually it does'n change anything.
Our internal gene library was built from genomic data, directly from NCBI assembly of corresponding loci (see here), so along with VRegion part of the sequence it also contains all genomic sequences like UTRs and introns. This information helps to align these regions in particular types of data.
For example all 5'RACE protocols contain 5'UTR sequences that can be found in the second Illumina read, and by aligning this sequences with UTRs of V genes from internal DB MiXCR achieves greater specificity of V gene assignment.
Does it answers your question?
It does; however, I see a similar stack:
mixcr align -f -r Clone-27.log.txt -l TRA -p rna-seq --library local -OvParameters.geneFeatureToAlign=VRegion -s macaque ./Clone-27-FQ-R1.all.fastq ./Clone-27-FQ-R2.all.fastq Clone-27.macaque.TRA.vdjca Alignment: 100% Exception in thread "main" java.lang.NullPointerException at com.milaboratory.mixcr.vdjaligners.VDJCAlignerSJFirst.align(VDJCAlignerSJFirst.java:124) at com.milaboratory.mixcr.vdjaligners.VDJCAlignerSJFirst.process(VDJCAlignerSJFirst.java:60) at com.milaboratory.mixcr.vdjaligners.VDJCAlignerWithMerge.process(VDJCAlignerWithMerge.java:78) at com.milaboratory.mixcr.vdjaligners.VDJCAlignerWithMerge.process(VDJCAlignerWithMerge.java:42) at cc.redberry.pipe.CUtils$9.process(CUtils.java:350) at cc.redberry.pipe.CUtils$9.process(CUtils.java:345) at cc.redberry.pipe.blocks.ParallelProcessor$Worker.run(ParallelProcessor.java:304) at java.lang.Thread.run(Thread.java:745)
The IMGT macaque DB lacks data for TRA (this is running as part of a script looping through the species/loci). Does that account for this NPE?
Note: the IMGT download tool gives a zero-line FASTA file for rhesus macaque TRA, just to be clear.
Yes it does. Unfortunately there are no TRAV genes for rhesus macaque in IMGT: http://www.imgt.org/vquest/refseqh.html
Is there a way to make the tool behave more gracefully in this situation?
Yes, current development version already outputs human readable error message for this case.
Hello,
I am running mixcr 1.7.1 using the command below. I downloaded the human library locally using importFromIMGT.sh. I get this NPE for multiple species downloaded locally:
mixcr align -f -r Clone-46.log.txt -l TRA -p rna-seq --library local -s hs ./Clone-46-FQ-R1.all.fastq ./Clone-46-FQ-R2.all.fastq Clone-46.hs.TRA.vdjca Alignment: 100% Exception in thread "main" java.lang.NullPointerException at com.milaboratory.mixcr.vdjaligners.VDJCAlignerSJFirst.align(VDJCAlignerSJFirst.java:124) at com.milaboratory.mixcr.vdjaligners.VDJCAlignerSJFirst.process(VDJCAlignerSJFirst.java:60) at com.milaboratory.mixcr.vdjaligners.VDJCAlignerWithMerge.process(VDJCAlignerWithMerge.java:78) at com.milaboratory.mixcr.vdjaligners.VDJCAlignerWithMerge.process(VDJCAlignerWithMerge.java:42) at cc.redberry.pipe.CUtils$9.process(CUtils.java:350) at cc.redberry.pipe.CUtils$9.process(CUtils.java:345) at cc.redberry.pipe.blocks.ParallelProcessor$Worker.run(ParallelProcessor.java:304) at java.lang.Thread.run(Thread.java:745)
I looked at the code a little and didnt see an obvious cause. Let me know if there's any information that would help.