ncborcherding / scRepertoire

A toolkit for single-cell immune profiling
https://www.borch.dev/uploads/screpertoire/
MIT License

Duplicate barcodes #388

Closed carlares closed 3 weeks ago

carlares commented 4 months ago

"Major note: if there are duplicate barcodes (if a cell has both Ig and TCR), the immune receptor information will not be added. It might be worth checking cluster identities and removing incongruent barcodes in the products of combineTCR() and combineBCR()."

Hey! Thank you so much for developing the package; it has turned out to be quite useful during my analysis. I was just wondering if there are any plans to make it possible to include both BCR and TCR info within "one cell"? More and more work is being done on distinguishing and identifying these doublets, as they might be useful in the context of cell-to-cell interactions. I understand that so far this is not possible with this package, so I was wondering if it is something that might come in an update?

Thank you so much!!

ncborcherding commented 4 months ago

Hey @carlares,

Thanks for reaching out, and great point. The use of BCR/TCR dual cells for calling doublets is a particularly intriguing idea. As of right now, incorporating that workflow within scRepertoire might be too complicated for the central pipeline, but I could see a pre-processing step that would allow for the functionality.

Let me think more on it and let me know if you have any specific ideas/code.

Thanks, Nick

carlares commented 4 months ago

Hey @ncborcherding,

Great to know about your interest! I am more of a user than a coder, truth be told. Correct me if I am completely off base, but as far as I understand, when a cell shows both, scRepertoire currently just outputs NA, is that right? I just thought it was a pity to throw away that kind of information. It could potentially turn out to be very interesting if these doublets are associated with specific clonotypes, expanded/non-expanded status... adding to what you have already built.

Looking forward to any development! And thanks again,

Carla

Qile0317 commented 1 month ago

@carlares At the time of writing this comment, #417 is a partial, in-development solution to this suggestion. Let me know any specific features you'd like to see. At the moment it just extracts doublets as a preprocessing step, but it'd be quite easy to add more processing to integrate this data, produce summaries by groups/clusters, etc.

I think the assumption that cells with both TCRs and BCRs are scRNA doublets/bad data in general is pretty reasonable (unless you are interested in the slim possibility of actual dual-expression complexes), and I can also add a shortcut function to filter Seurat objects by these doublets. Again, let me know if you (or anyone else) have any thoughts.
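
To make the preprocessing idea above concrete, here is a rough, language-agnostic sketch written in Python. It is not the code from #417 and does not use scRepertoire's API (which is in R). The function names, barcodes, and cluster labels are all hypothetical and only illustrate the logic: find barcodes that appear in both the TCR and BCR contig tables, set them aside as candidate doublets, and optionally summarize them per cluster.

```python
# Hypothetical sketch: names and data are illustrative, not scRepertoire's API.

def split_dual_receptor_barcodes(tcr_barcodes, bcr_barcodes):
    """Return (tcr_only, bcr_only, dual) as sets of barcodes.

    `dual` holds barcodes seen in both the TCR and BCR inputs,
    i.e. the candidate doublets discussed above.
    """
    tcr = set(tcr_barcodes)
    bcr = set(bcr_barcodes)
    dual = tcr & bcr
    return tcr - dual, bcr - dual, dual


def summarize_dual_by_cluster(dual, cluster_of):
    """Count dual-receptor barcodes per cluster.

    Barcodes with no cluster assignment are skipped.
    """
    counts = {}
    for barcode in sorted(dual):
        cluster = cluster_of.get(barcode)
        if cluster is None:
            continue
        counts[cluster] = counts.get(cluster, 0) + 1
    return counts


# Made-up example data (assumption: barcodes are already harmonized
# between the TCR, BCR, and expression data).
tcr_barcodes = ["AAAC-1", "AAAG-1", "CCTT-1"]
bcr_barcodes = ["AAAG-1", "GGTA-1", "CCTT-1"]
cluster_of = {
    "AAAC-1": "T cells",
    "AAAG-1": "T cells",
    "CCTT-1": "B cells",
    "GGTA-1": "B cells",
}

tcr_only, bcr_only, dual = split_dual_receptor_barcodes(tcr_barcodes, bcr_barcodes)
dual_per_cluster = summarize_dual_by_cluster(dual, cluster_of)
print(sorted(dual))
print(dual_per_cluster)
```

Keeping the dual barcodes in their own set, rather than dropping them silently, leaves the door open to the kind of follow-up analysis raised earlier in the thread (for example, checking whether they cluster with particular clonotypes).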

ncborcherding commented 3 weeks ago

We have added the functionality in a recent merge and I will close this issue for now.