phbradley / tcr-dist

Software tools for the analysis of epitope-specific T cell receptor (TCR) repertoires (scroll down for the README)
MIT License

Problems determining best_gappos #13

Closed jeremycfd closed 6 years ago

jeremycfd commented 6 years ago

I've noticed that in some situations the align_cdr3s function in tcr_distances.py fails to find best_gappos, causing an error. However, when re-running the same dataset, I was able to get it to succeed about 20% of the time. I was unable to find any consistent pattern in the center_cdr3 and member_cdr3 pairings that caused this to fail. Examples include:

DVGYKL DPAGNTGKL
GEGSNNRI GYNTNTGKL
GDRYAQGL GDVDYAQGL

But in each of these cases, rerunning the code eventually got through them without issue. I assume the stochasticity is introduced by the random_seed. Any thoughts on how best to address this, @phbradley?
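One way to test the seed hypothesis is to fix the seed and see whether the failure becomes deterministic. The snippet below is only a sketch: it assumes the randomness comes from Python's standard `random` module, which this thread does not confirm, and `subsample` is a hypothetical stand-in for whatever random step the pipeline performs.

```python
import random


def subsample(items, k, seed=None):
    """Hypothetical stand-in for a pipeline step that picks k items at random."""
    # A private generator keeps the example independent of global state.
    # seed=None leaves the generator nondeterministic.
    rng = random.Random(seed)
    return rng.sample(items, k)


cdr3s = ["DVGYKL", "DPAGNTGKL", "GEGSNNRI", "GYNTNTGKL", "GDRYAQGL", "GDVDYAQGL"]

# With a fixed seed, repeated runs select the same members, so a failure that
# depends on the random selection would reproduce on every run.
first = subsample(cdr3s, 3, seed=42)
second = subsample(cdr3s, 3, seed=42)
assert first == second
```

If the error still comes and goes with the seed fixed, the randomness is probably coming from somewhere else.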

Traceback (most recent call last):
  File "/mnt/Data/TCR_Git/public_pipeline/tcr-dist/make_tall_trees.py", line 683, in <module>
    a,b = align_cdr3s( center_cdr3, member_cdr3, gap_character )
  File "/mnt/Data/TCR_Git/public_pipeline/tcr-dist/tcr_distances.py", line 98, in align_cdr3s
    s0 = s0[:best_gappos+1] + gap_character*lendiff + s0[best_gappos+1:]
UnboundLocalError: local variable 'best_gappos' referenced before assignment
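For context, an `UnboundLocalError` like this usually means a local variable is only assigned inside a loop or `if` branch that never ran. The sketch below illustrates that general pattern and a defensive default. It is not the actual body of align_cdr3s; the function names, the `scores` list, and the `threshold` parameter are all made up for illustration.

```python
def pick_best_unsafe(scores, threshold=-1000):
    """Illustrative only: best_pos is bound only if some score beats threshold."""
    best_score = threshold
    for pos, score in enumerate(scores):
        if score > best_score:
            best_score = score
            best_pos = pos
    # Raises UnboundLocalError when the loop body never assigned best_pos,
    # e.g. when scores is empty or no score exceeds the threshold.
    return best_pos


def pick_best_safe(scores, threshold=-1000):
    """Same search, but with a default so the name is always bound."""
    best_score = threshold
    best_pos = 0  # assumed fallback position; chosen for illustration only
    for pos, score in enumerate(scores):
        if score > best_score:
            best_score = score
            best_pos = pos
    return best_pos


try:
    pick_best_unsafe([])
except UnboundLocalError as err:
    print("unsafe version failed:", err)

print("safe version returned:", pick_best_safe([]))
```

Whether a default of 0 is the right fallback depends on what the caller does with the position. Raising an explicit error with both sequences in the message may be more useful while debugging.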

phbradley commented 6 years ago

I think this could be explained at least in part by CDR3s of length 6. I've fixed that bug I believe (about to check it in), but that doesn't seem to explain the examples you give there. Hmmm... when I run align_cdr3s with those sequences I don't get an error...
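To check the reported sequences in isolation, as described above, a small harness along these lines might help. It is a sketch under a few assumptions: the three-argument call shape is taken from the traceback, the pairing treats the listed sequences as consecutive (center, member) pairs, the "-" gap character is a guess, and `toy_align` is a placeholder so the snippet runs on its own. It is not the tcr-dist alignment.

```python
def check_pairs(align, pairs, gap_character="-"):
    """Call align on each (center, member) pair and collect the ones that raise."""
    failures = []
    for center, member in pairs:
        try:
            align(center, member, gap_character)
        except Exception as err:  # broad on purpose: report whatever goes wrong
            failures.append((center, member, repr(err)))
    return failures


def toy_align(a, b, gap_character):
    """Placeholder aligner (NOT the tcr-dist algorithm): pads the shorter string."""
    width = max(len(a), len(b))
    return a.ljust(width, gap_character), b.ljust(width, gap_character)


# Assumed pairing of the sequences reported in this issue.
pairs = [
    ("DVGYKL", "DPAGNTGKL"),
    ("GEGSNNRI", "GYNTNTGKL"),
    ("GDRYAQGL", "GDVDYAQGL"),
]

# An empty list means no pair raised an exception.
print(check_pairs(toy_align, pairs))
```

Passing the real align_cdr3s in place of `toy_align` would show whether these pairs fail on their own or only inside the full pipeline run.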

jeremycfd commented 6 years ago

Hrm, how strange...