kmayerb / tcrdist3

flexible CDR based distance metrics
MIT License
53 stars 17 forks source link

About distance threshold #62

Open showteeth opened 2 years ago

showteeth commented 2 years ago

I notice that tcrdist3 is derived from TCRdist. I want to know how the distance threshold in TCRdist is selected. The original paper just mentioned that the distance threshold was chosen to yield fairly homogeneous clusters of sufficient size (the same threshold was used for all repertoires) , but did not give a detailed explanation, can you introduce it in detail here, or give a script to realize this function?

kmayerb commented 2 years ago

I am not aware that a single distance threshold will be appropriate in all scenarios; however a TCRdist of 100 is a reasonably conservative distance for paired alpha beta data and would be a good starting point. One should evaluate cluster purity at a number of distance thresholds. See our recent paper in eLife for discussion of application of TCRdist to single chain data: https://elifesciences.org/articles/68605