immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0

Clarification on how matches are determined within trackClonotypes? #385

Open jeremymsimon opened 11 months ago

jeremymsimon commented 11 months ago

📚 Documentation

Hi, I'm reading in 10x scTCR data processed via cellranger and using trackClonotypes to tabulate and plot proportions across samples. Is there documentation somewhere about how a given match across samples is determined? In other words, does aa+v+j require exact amino-acid matches of CDR3s, or are mismatches allowed? What about clonotypes with multiple alphas, multiple betas, no alpha, etc.? Under what conditions is a clonotype counted as a match, and under what conditions is it discarded?

Thanks!

vadimnazarov commented 11 months ago

Hi @jeremymsimon

Thank you for using Immunarch! No mismatches or similarity searches are used in the tracking.

Immunarch works on the data you "feed" to it. Whether you feed it a dataframe with both chains or with only one chain, it uses that data as given: it takes the values from the CDR3 columns and searches for exact matches.
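
To make the exact-match behaviour described in this reply concrete, here is a minimal Python sketch. It is not immunarch's code (immunarch is an R package), and the function name `track_exact`, the column names, and the toy sequences are all hypothetical. It only illustrates the idea that a clonotype is tracked across samples when its key columns (here CDR3 amino acids, V, and J) are string-identical, so a CDR3 that differs by one residue is counted as a separate clonotype.

```python
from collections import defaultdict


def track_exact(samples, key_cols=("cdr3_aa", "v", "j")):
    """Tabulate per-sample proportions of clonotypes matched by exact key equality.

    `samples` maps a sample name to a list of row dicts. Each row carries the
    key columns plus a `clones` count. All names here are illustrative only.
    """
    table = defaultdict(dict)
    for name, rows in samples.items():
        total = sum(row["clones"] for row in rows)
        for row in rows:
            # The key is compared as-is: no mismatches, no similarity search.
            key = tuple(row[col] for col in key_cols)
            table[key][name] = table[key].get(name, 0.0) + row["clones"] / total
    return dict(table)


# Toy data: sample B contains one exact match and one single-residue variant.
samples = {
    "A": [
        {"cdr3_aa": "CASSLGQAYEQYF", "v": "TRBV7-9", "j": "TRBJ2-7", "clones": 3},
        {"cdr3_aa": "CASSPGTGELFF", "v": "TRBV5-1", "j": "TRBJ2-2", "clones": 1},
    ],
    "B": [
        {"cdr3_aa": "CASSLGQAYEQYF", "v": "TRBV7-9", "j": "TRBJ2-7", "clones": 1},
        {"cdr3_aa": "CASSLGQAYEQFF", "v": "TRBV7-9", "j": "TRBJ2-7", "clones": 1},
    ],
}

tracked = track_exact(samples)
for key, proportions in tracked.items():
    print(key, proportions)
```

In this sketch the shared clonotype appears in both samples, while the variant with a single differing residue appears only in sample B, because the keys are compared verbatim.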

Could you share more details on why you need to search for multiple or no chain sequences?

jeremymsimon commented 11 months ago

Ah, my apologies. I think I missed a key difference between the output of, say, MiXCR and that of cellranger: cellranger supplies only one CDR3/V/J call for each chain rather than both primary and secondary calls. Given that, immunarch must look for exact matches as you describe and tabulate the proportions.

Is support for paired-chain analyses on MiXCR outputs in the works? I'd like to try that as well but it seems this is currently only supported for cellranger outputs.

Thanks!