milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

allVHitsWithScore allDHitsWithScore allJHitsWithScore allCHitsWithScore #1689

Closed 2061574124 closed 2 months ago

2061574124 commented 2 months ago

I ran the following commands:

    mixcr align -Xmx15g -p rna-seq -s hsa -OallowPartialAlignments=true ./f1.rmrrna.fastp.fastq.1.gz ./f1.rmrrna.fastp.fastq.2.gz ./f1.vdjca;
    mixcr assemblePartial -Xmx15g ./f1.vdjca ./f1.alignments_rescued_1.vdjca
    mixcr assemblePartial -Xmx15g ./f1.alignments_rescued_1.vdjca ./f1.alignments_rescued_2.vdjca
    mixcr extend -Xmx15g ./f1.alignments_rescued_2.vdjca ./f1.alignments_rescued_2_extended.vdjca
    mixcr assemble -Xmx15g ./f1.alignments_rescued_2_extended.vdjca ./f1.clones.clns
    mixcr exportClones -Xmx15g ./f1.clones.clns ./f1.clones.txt; done
    mixcr exportClones -Xmx15g -c IGH ./f1.clones.clns ./f1.IGH.clones.txt

After that, the results in f1.IGH.clones.txt show that some clones are assigned more than one V, D, J, or C gene. Is this normal?

cloneId | cloneCount | cloneFraction | targetSequences | targetQualities | allVHitsWithScore | allDHitsWithScore | allJHitsWithScore | allCHitsWithScore | allVAlignments | allDAlignments | allJAlignments | allCAlignments | nSeqFR1 | minQualFR1 | nSeqCDR1 | minQualCDR1 | nSeqFR2 | minQualFR2 | nSeqCDR2 | minQualCDR2 | nSeqFR3 | minQualFR3 | nSeqCDR3 | minQualCDR3 | nSeqFR4 | minQualFR4 | aaSeqFR1 | aaSeqCDR1 | aaSeqFR2 | aaSeqCDR2 | aaSeqFR3 | aaSeqCDR3 | aaSeqFR4 | refPoints
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
3 | 28 | 0.019270475 | TGTGGATCAGATGCCAGAGCACCAGACGGCAATGGTTATCACTATGCTTTTGCTATGTGG | NNNNNNNNNNNNNNNNNNNN???????????????????????????????????????? | IGHV1-69-2*00(249.2) | IGHD3-22*00(39),IGHD3-3*00(35),IGHD5-18*00(35) | IGHJ3*00(156.5) | IGHG1*00(459),IGHG3*00(402.8) | 359\|371\|390\|0\|12\|SC363GSA365T\|28.0 | 49\|60\|93\|32\|43\|ST56C\|39.0;49\|56\|93\|32\|39\|\|35.0;32\|39\|60\|31\|38\|\|35.0 | 22\|39\|70\|43\|60\|SA31CSC35G\|53.0 | ; |   |   |   |   |   |   |   |   |   |   | TGTGGATCAGATGCCAGAGCACCAGACGGCAATGGTTATCACTATGCTTTTGCTATGTGG | 30 |   |   |   |   |   |   |   | CGSDARAPDGNGYHYAFAMW |   | :::::::::0:1:12:32:-18:-2:43:43:-2:60:::
14 | 12 | 0.008258775 | TGTGCGAGAGTCGGGTACCGTGGTGGTGCCACACCTTTTGACTACTGG | ???????????????????????????????????????????????? | IGHV4-39*00(337.6) | IGHD2-21*00(45),IGHD2-15*00(40) | IGHJ4*00(200) | IGHM*00(372.3) | 445\|454\|476\|0\|9\|\|45.0 | 36\|45\|84\|19\|28\|\|45.0;42\|50\|93\|19\|27\|\|40.0 | 25\|37\|68\|36\|48\|\|60.0 |   |   |   |   |   |   |   |   |   |   |   | TGTGCGAGAGTCGGGTACCGTGGTGGTGCCACACCTTTTGACTACTGG | 30 |   |   |   |   |   |   |   | CARVGYRGGATPFDYW |   | :::::::::0:-2:9:19:-8:-11:28:36:-5:48:::
33 | 8 | 0.00550585 | TGTGCGCGACATCGGGCAGTCCCGGAGTTCGGGGAGGGACTTGACTATTGG | ??????????????????????????????????????????????????? | IGHV1-46*00(73.5),IGHV3-53*00(72.3) | IGHD3-10*00(50) | IGHJ4*00(174) | IGHG2*00(832.8) | 442\|455\|473\|0\|13\|SA448CSG451C\|33.0;439\|452\|470\|0\|13\|SA445CSG448C\|33.0 | 42\|52\|93\|26\|36\|\|50.0 | 26\|37\|68\|40\|51\|SC33T\|39.0 |   |   |   |   |   |   |   |   |   |   |   | TGTGCGCGACATCGGGCAGTCCCGGAGTTCGGGGAGGGACTTGACTATTGG | 30 |   |   |   |   |   |   |   | CARHRAVPEFGEGLDYW |   | :::::::::0:2:13:26:-11:-10:36:40:-6:51:::
34 | 8 | 0.00550585 | TGTGCAAGAGATTCCGGAAACATTTATGGCGCGTACTACTTTGATTCCTGG | ??????????????????????????????????????????????????? | IGHV6-1*00(340.4) | IGHD2-21*00(35),IGHD2-15*00(30),IGHD4-23*00(30) | IGHJ4*00(200) | IGHG2*00(369.4) | 454\|466\|485\|0\|12\|\|60.0 | 53\|60\|84\|12\|19\|\|35.0;59\|65\|93\|12\|18\|\|30.0;35\|41\|57\|12\|18\|\|30.0 | 18\|37\|68\|32\|51\|SC30TSA32C\|63.0 |   |   |   |   |   |   |   |   |   |   |   | TGTGCAAGAGATTCCGGAAACATTTATGGCGCGTACTACTTTGATTCCTGG | 30 |   |   |   |   |   |   |   | CARDSGNIYGAYYFDSW |   | :::::::::0:1:12:12:-25:4:19:32:2:51:::
38 | 7 | 0.004817619 | TGTGCGAGAGAATCGACTGAGACATGGCTACAAGTGTCCAACTATTTTGAACACTGG | ????????????????????????????????????????????????????????? | IGHV1-2*00(287.3) | IGHD5-24*00(54) | IGHJ4*00(162.8),IGHJ5*00(145.8) | IGHA1*00(527.5) | 442\|453\|473\|0\|11\|\|55.0 | 22\|36\|60\|19\|33\|SG25C\|54.0 | 20\|37\|68\|40\|57\|SC24TSC30AST31C\|37.0;36\|40\|71\|53\|57\|\|20.0 |   |   |   |   |   |   |   |   |   |   |   | TGTGCGAGAGAATCGACTGAGACATGGCTACAAGTGTCCAACTATTTTGAACACTGG | 30 |   |   |   |   |   |   |   | CARESTETWLQVSNYFEHW |   | :::::::::0:0:11:19:-2:-4:33:40:0:57:::

mizraelson commented 2 months ago

Yes, this is normal. BCR gene segments often have very similar sequences, so if your reads do not cover the entire gene sequence, several candidate genes may match. The numbers in parentheses after each gene name are the alignment scores. The first gene in the list has the highest score and is the most probable one.
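
If you only need the top-scoring gene for each clone, the hit columns can be split after export. The following is a minimal Python sketch, assuming the field format seen in the table above (comma-separated `gene(score)` entries). The function names `parse_hits` and `best_hit` are hypothetical helpers written for this example and are not part of MiXCR.

```python
import re

# Matches one entry such as "IGHD3-22*00(39)" or "IGHV1-69-2*00(249.2)".
# Assumption: the score is a plain non-negative number in parentheses.
HIT_PATTERN = re.compile(r"^(?P<gene>.+)\((?P<score>[0-9.]+)\)$")


def parse_hits(field):
    """Split an all*HitsWithScore field into a list of (gene, score) tuples."""
    hits = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue  # an empty field means no hit was reported
        match = HIT_PATTERN.match(item)
        if match is None:
            raise ValueError(f"Unexpected hit format: {item!r}")
        hits.append((match.group("gene"), float(match.group("score"))))
    return hits


def best_hit(field):
    """Return the (gene, score) tuple with the highest score, or None if empty.

    On ties, the entry listed first is returned.
    """
    hits = parse_hits(field)
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Values copied from clone 33 and clone 3 in the table above.
    print(best_hit("IGHV1-46*00(73.5),IGHV3-53*00(72.3)"))
    print(parse_hits("IGHD3-22*00(39),IGHD3-3*00(35),IGHD5-18*00(35)"))
```

Because the export already lists hits by descending score, taking the first entry should give the same result; computing the maximum explicitly just avoids relying on that ordering.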