milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

mixcr assemblePartial issue #323

Closed yingzhang121 closed 6 years ago

yingzhang121 commented 6 years ago

Hi, developer,

Here is my command:

mixcr assemblePartial aln.vdjca aln_rescued_1.vdjca

Below is the error message:

Building index: 0%
Building index: 13% ETA: 00:00:13
Building index: 28.5% ETA: 00:00:09
Building index: 43.1% ETA: 00:00:07
Building index: 59.7% ETA: 00:00:04
Building index: 76.6% ETA: 00:00:02
Building index: 91.4% ETA: 00:00:01
Searching for overlaps: 0%
Exception in thread "main" java.lang.IllegalArgumentException
    at com.milaboratory.mixcr.basictypes.VDJCAlignments.mergeOriginalReads(VDJCAlignments.java:86)
    at com.milaboratory.mixcr.partialassembler.TargetMerger.merge(TargetMerger.java:114)
    at com.milaboratory.mixcr.partialassembler.PartialAlignmentsAssembler.searchOverlaps(PartialAlignmentsAssembler.java:367)
    at com.milaboratory.mixcr.partialassembler.PartialAlignmentsAssembler.searchOverlaps(PartialAlignmentsAssembler.java:127)
    at com.milaboratory.mixcr.cli.ActionAssemblePartialAlignments.go(ActionAssemblePartialAlignments.java:62)
    at com.milaboratory.cli.JCommanderBasedMain.main(JCommanderBasedMain.java:155)
    at com.milaboratory.mixcr.cli.Main.main(Main.java:115)

The message is too vague for me to debug by myself. Could you shed some light?

Best,

PoslavskySV commented 6 years ago

Which version of MiXCR are you using? Could you please confirm this bug with the most recent version? (There have been quite a few bug fixes recently.)

PoslavskySV commented 6 years ago

I see you are using a development version; I confirm this is a bug. Will fix it soon. Thanks!

yingzhang121 commented 6 years ago

Ok, thank you. I was about to post the version info. Anyway, here it is:

mixcr -v
MiXCR v2.2-SNAPSHOT (built Thu Jan 11 08:51:59 CST 2018; rev=c6be856; branch=develop)
RepSeq.IO v1.2.11-SNAPSHOT (rev=55b408e)
MiLib v1.8.2-SNAPSHOT (rev=94a3df8)
Built-in V/D/J/C library: repseqio.v1.4

Library search path:

dbolotin commented 6 years ago

Can you confirm that the aln.vdjca file is not the result of merging several other .vdjca files? Please post the full pipeline you used to produce this file.

Please also try to reproduce it (starting from raw data) with the latest stable version (2.1.8), and confirm whether the same error is observed.

Thanks!

yingzhang121 commented 6 years ago

Sorry that I couldn't get back to you earlier. I was busy fixing my NCBI genome submission (still not done).

Here are all my commands (so far):

mixcr align -p rna-seq -OallowPartialAlignments=true -f --library imgt -s 9544 work/Amazo_D-4_S8_R1_001.fastq work/Amazo_D-4_S8_R2_001.fastq aln.vdjca
mixcr assemblePartial aln.vdjca aln_rescued_1.vdjca

If necessary, I could put the aln.vdjca file in a place where you can download it.

yingzhang121 commented 6 years ago

So far, the 2.1.8 version has gotten past the point where the error occurred.

$ mixcr.old align -p rna-seq -OallowPartialAlignments=true -f --library imgt -s 9544 work/Amazo_D-4_S8_R1_001.fastq work/Amazo_D-4_S8_R2_001.fastq aln.vdjca
Reference library: imgt.201802-5.sv2.1.7-14-gc6be856:9544:af752e185436b171b876d05083d5f8b8
WARNING: forcing -OvParameters.geneFeatureToAlign=VRegionWithP since current gene feature (VTranscriptWithout5UTRWithP) is absent in 100% of V genes.
WARNING: 2 functional genes were excluded, re-run with -v option to see the list of excluded genes and exclusion reason.
Alignment: 0.1%
Alignment: 10.8% ETA: 00:01:47
Alignment: 21.7% ETA: 00:01:40
Alignment: 32.3% ETA: 00:01:29
Alignment: 43.3% ETA: 00:00:56
Alignment: 54.2% ETA: 00:00:42
Alignment: 65% ETA: 00:00:32
Alignment: 75.8% ETA: 00:00:22
Alignment: 86.3% ETA: 00:00:13
Alignment: 97.3% ETA: 00:00:02
============= Report ==============
Analysis time: 1.76m
Total sequencing reads: 1526614
Successfully aligned reads: 755493 (49.49%)
Paired-end alignment conflicts eliminated: 142718 (9.35%)
Alignment failed, no hits (not TCR/IG?): 721090 (47.23%)
Alignment failed because of absence of CDR3 parts: 10465 (0.69%)
Alignment failed because of low total score: 39566 (2.59%)
Overlapped: 958012 (62.75%)
Overlapped and aligned: 532950 (34.91%)
Alignment-aided overlaps: 31704 (5.95%)
Overlapped and not aligned: 425062 (27.84%)

$ mixcr.old assemblePartial aln.vdjca aln_rescued_1.vdjca
Building index: 0%
Building index: 17.4% ETA: 00:00:09
Building index: 28% ETA: 00:00:06
Building index: 38.9% ETA: 00:00:05
Building index: 58.3% ETA: 00:00:04
Building index: 68.9% ETA: 00:00:02
Building index: 79.9% ETA: 00:00:01
Building index: 100% ETA: 00:00:00
Searching for overlaps: 0%
Searching for overlaps: 0.7% ETA: 04:54:44
Searching for overlaps: 1.1% ETA: 07:20:12
Searching for overlaps: 1.7% ETA: 05:50:09

PoslavskySV commented 6 years ago

@dbolotin the bug was introduced in the develop branch (after the refactoring of VDJCAlignments).

dbolotin commented 6 years ago

Overall, this seems to be OK: this functionality is not finished yet.

Please use stable MiXCR releases. We release new features as soon as they are ready, so there is not much reason to use development versions for "production" data analysis.

Anyway, thanks for reporting! I will leave this open until it is fixed in develop.
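
For readers who want to guard a pipeline against accidentally running a development build, here is a minimal sketch. It assumes that development builds can be recognized by a "-SNAPSHOT" suffix on the "MiXCR v..." line printed by `mixcr -v`, as in the version output quoted earlier in this thread. The helper name `is_snapshot_build` is hypothetical and is not part of MiXCR.

```shell
#!/bin/sh
# Hypothetical helper (not part of MiXCR): succeeds (exit 0) when the given
# version text looks like a development build.
# Assumption: development builds carry "-SNAPSHOT" in the "MiXCR v..." line,
# as in the `mixcr -v` output quoted earlier in this thread.
is_snapshot_build() {
    printf '%s\n' "$1" | grep -q 'MiXCR v[^ ]*-SNAPSHOT'
}

# Usage sketch: only query mixcr when it is actually on PATH.
if command -v mixcr >/dev/null 2>&1; then
    if is_snapshot_build "$(mixcr -v 2>/dev/null)"; then
        echo "Development (SNAPSHOT) build detected; consider a stable release." >&2
    fi
fi
```

The check only looks at the MiXCR line itself, so SNAPSHOT versions of bundled libraries such as RepSeq.IO or MiLib do not trigger it.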

yingzhang121 commented 6 years ago

@dbolotin, thank you. I did realize that I had pulled the develop branch after I couldn't make mixcr work out of the box. But I "thought" I could resolve the issues, so I didn't switch to "master" in time. I will only work with stable releases in the future.