Teichlab / tracer

TraCeR - reconstruction of T cell receptor sequences from single-cell RNAseq data

Tracer assemble gives two chains with identical CDR3 sequence as different #118

Open diitaz93 opened 2 years ago

Hi! Thanks for developing TraCeR! I am using it on a dataset of ~4k cells, and the output of assemble looks strange for one specific cell. The file filtered_TCRs.txt reports two beta chains, but both have the same ID (see below). When you look closely, you can note the following:

I don't know if this is unexpected behaviour. Should I assume both chains are the same, or is there a significant difference between the two sequences? The content of unfiltered_TCRs.txt is exactly the same as that of filtered_TCRs.txt.

#TCR_B#
##TRINITY_DN0_c0_g1_i2##
V segment:      TRBV7-8*03
D segment:      TRBD1*01
J segment:      TRBJ2-2*01
ID:     TRBV7-8_AGACGCTCAGGGGTGGTCACCG_TRBJ2-2
TPM:    3.29375
Productive:     True
Stop codon:     False
In frame:       True
CDR3aa: ASRRSGVVTGELF
CDR3nt: GCCAGCAGACGCTCAGGGGTGGTCACCGGGGAGCTGTTT

Segment  query_id                       subject_id  % identity  alignment length  mismatches  gap opens  gaps  q start  q end  s start  s end  e value    bit score
V        reversed|TRINITY_DN0_c0_g1_i2  TRBV7-8*03  99.286      140               1           0          0     78       217    148      287    2.40e-58   216
V        reversed|TRINITY_DN0_c0_g1_i2  TRBV7-8*01  100.000     137               0           0          0     78       214    148      284    7.08e-58   215
V        reversed|TRINITY_DN0_c0_g1_i2  TRBV7-8*02  99.270      137               1           0          0     78       214    148      284    6.14e-57   212
V        reversed|TRINITY_DN0_c0_g1_i2  TRBV7-4*01  89.362      141               15          0          0     74       214    144      284    1.11e-45   174
V        reversed|TRINITY_DN0_c0_g1_i2  TRBV7-2*01  93.496      123               8           0          0     92       214    162      284    8.34e-44   168
D        reversed|TRINITY_DN0_c0_g1_i2  TRBD1*01    100.000     6                 0           0          0     220      225    5        10     4.0        12.4
J        reversed|TRINITY_DN0_c0_g1_i2  TRBJ2-2*01  100.000     47                0           0          0     230      276    5        51     1.93e-22   93.7

##TRINITY_DN0_c0_g1_i1##
V segment:      TRBV7-8*01
D segment:      TRBD1*01
J segment:      TRBJ2-2*01
ID:     TRBV7-8_AGCAGACGCTCAGGGGTGGTCACCG_TRBJ2-2
TPM:    1472.31
Productive:     True
Stop codon:     False
In frame:       True
CDR3aa: ASRRSGVVTGELF
CDR3nt: GCCAGCAGACGCTCAGGGGTGGTCACCGGGGAGCTGTTT

Segment  query_id                       subject_id  % identity  alignment length  mismatches  gap opens  gaps  q start  q end  s start  s end  e value    bit score
V        reversed|TRINITY_DN0_c0_g1_i1  TRBV7-8*01  100.000     284               0           0          0     16       299    1        284    8.99e-127  444
V        reversed|TRINITY_DN0_c0_g1_i1  TRBV7-8*03  99.303      287               2           0          0     16       302    1        287    2.65e-126  442
V        reversed|TRINITY_DN0_c0_g1_i1  TRBV7-8*02  99.648      284               1           0          0     16       299    1        284    7.80e-126  441
V        reversed|TRINITY_DN0_c0_g1_i1  TRBV7-6*01  90.845      284               26          0          0     16       299    1        284    2.21e-102  363
V        reversed|TRINITY_DN0_c0_g1_i1  TRBV7-4*01  90.493      284               27          0          0     16       299    1        284    1.91e-101  360
D        reversed|TRINITY_DN0_c0_g1_i1  TRBD1*01    100.000     6                 0           0          0     305      310    5        10     4.5        12.4
J        reversed|TRINITY_DN0_c0_g1_i1  TRBJ2-2*01  100.000     47                0           0          0     315      361    5        51     2.17e-22   93.7
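
For anyone wanting to check the two records side by side, here is a minimal Python sketch that compares them using only the values pasted above. It is not part of TraCeR: the `records` dict and the helpers `junction` and `translate` are hypothetical names made up for this illustration, and splitting the ID on `_` into V gene, junction string and J gene is an assumption based on how the two IDs look here. It checks whether the CDR3 nucleotide sequences match, whether one junction string is just a longer version of the other, and whether the CDR3nt translates to the reported CDR3aa.

```python
# Values copied by hand from the two TRB records in this issue.
records = {
    "TRINITY_DN0_c0_g1_i2": {
        "v_allele": "TRBV7-8*03",
        "id": "TRBV7-8_AGACGCTCAGGGGTGGTCACCG_TRBJ2-2",
        "cdr3nt": "GCCAGCAGACGCTCAGGGGTGGTCACCGGGGAGCTGTTT",
        "tpm": 3.29375,
    },
    "TRINITY_DN0_c0_g1_i1": {
        "v_allele": "TRBV7-8*01",
        "id": "TRBV7-8_AGCAGACGCTCAGGGGTGGTCACCG_TRBJ2-2",
        "cdr3nt": "GCCAGCAGACGCTCAGGGGTGGTCACCGGGGAGCTGTTT",
        "tpm": 1472.31,
    },
}


def junction(recombinant_id):
    """Return the middle field of an ID shaped like V_junction_J (assumed layout)."""
    _v_gene, junc, _j_gene = recombinant_id.split("_")
    return junc


def translate(nt):
    """Translate an in-frame nucleotide string with the standard genetic code."""
    bases = "TCAG"
    aas = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    table = {
        b1 + b2 + b3: aas[16 * i + 4 * j + k]
        for i, b1 in enumerate(bases)
        for j, b2 in enumerate(bases)
        for k, b3 in enumerate(bases)
    }
    return "".join(table[nt[p:p + 3]] for p in range(0, len(nt) - len(nt) % 3, 3))


i2 = records["TRINITY_DN0_c0_g1_i2"]
i1 = records["TRINITY_DN0_c0_g1_i1"]

# 1. Are the CDR3 nucleotide sequences identical?
same_cdr3 = i1["cdr3nt"] == i2["cdr3nt"]

# 2. Is the shorter junction string a suffix of the longer one, and what is extra?
short, long_ = sorted((junction(i1["id"]), junction(i2["id"])), key=len)
is_suffix = long_.endswith(short)
extra = long_[: len(long_) - len(short)] if is_suffix else None

# 3. Do both junction strings occur inside the shared CDR3nt?
both_in_cdr3 = all(junction(r["id"]) in r["cdr3nt"] for r in (i1, i2))

print("same CDR3nt:", same_cdr3)
print("junction is suffix of the other:", is_suffix, "| extra bases:", extra)
print("both junctions found in CDR3nt:", both_in_cdr3)
print("translated CDR3:", translate(i1["cdr3nt"]))
print("V alleles:", i1["v_allele"], "vs", i2["v_allele"])
print("TPM ratio i1/i2:", round(i1["tpm"] / i2["tpm"]))
```

On these values the CDR3nt strings are identical, and the two junction strings differ only by three leading bases (`AGC`) that are present in the i1 ID. That lines up with the top V hits in the tables: the best hit for i2 (TRBV7-8*03) aligns to subject position 287, while the best hit for i1 (TRBV7-8*01) stops at 284, three bases earlier. Whether that is actually why the IDs differ depends on how TraCeR builds the ID, which I have not confirmed, so treat it as a guess rather than an explanation.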