svalkiers / clusTCR

CDR3 clustering module providing a new method for fast and accurate clustering of large data sets of CDR3 amino acid sequences, and offering functionalities for downstream analysis of clustering results.

Clustering method for TCRdist #45

Closed: jiangdada1221 closed this issue 1 year ago

jiangdada1221 commented 1 year ago

Hi, I'm wondering which clustering method you used for TCRdist, since TCRdist itself only gives pairwise similarity scores. I see many options in the tcrdist module in this repository. Which one did you use for evaluating TCRdist?

Best, Yuepeng

svalkiers commented 1 year ago

Hi there! Indeed, there is no fixed value for TCRdist. For our benchmark comparisons, we evaluated the trade-off between clustering purity, consistency, and retention, and chose the TCRdist value that best reflected the clustering retention of the other methods included in the benchmarking (GLIPH2 & iSMART).
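For readers who want a feel for the retention-matching idea, here is a minimal sketch in plain Python. It is not the procedure from the publication. It makes three assumptions: that the "TCRdist value" acts as a distance cutoff, that clusters are formed by linking every pair of sequences whose distance is at or below that cutoff (a simple single-linkage grouping chosen only for illustration), and that retention means the fraction of input sequences that end up in a cluster of at least two members. The function names, the toy distance matrix, and the target retention are all hypothetical. Purity and consistency are left out because they require epitope labels.

```python
from collections import Counter


def cluster_by_threshold(dist, threshold):
    """Group items by linking every pair with dist[i][j] <= threshold.

    Assumption for illustration only: single-linkage grouping via union-find.
    Returns one cluster label per item.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def retention(labels):
    """Fraction of items that fall in a cluster with at least two members."""
    sizes = Counter(labels)
    clustered = sum(1 for label in labels if sizes[label] > 1)
    return clustered / len(labels)


def pick_threshold(dist, candidates, target_retention):
    """Return the candidate cutoff whose retention is closest to the target."""
    return min(
        candidates,
        key=lambda t: abs(retention(cluster_by_threshold(dist, t)) - target_retention),
    )


# Hypothetical symmetric distance matrix for five sequences (not real TCRdist output).
toy_dist = [
    [0, 10, 40, 90, 95],
    [10, 0, 35, 92, 96],
    [40, 35, 0, 91, 97],
    [90, 92, 91, 0, 60],
    [95, 96, 97, 60, 0],
]

# Hypothetical target, standing in for the retention observed with another method.
best = pick_threshold(toy_dist, candidates=[5, 12, 36, 60], target_retention=0.6)
print(best)
```

In practice the target retention would come from running the reference methods on the same data, and the candidate cutoffs would be scanned over a wider range.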

You can read more about the exact approach in the supplementary material of our publication. Feel free to reach out if you have any more questions.