donnafarberlab / MMoCHi

MultiModal Classifier Hierarchy (MMoCHi)
https://mmochi.readthedocs.io
GNU General Public License v3.0

Can you please provide us with mouse resources? Thank you! #4

Open nroak opened 2 months ago

nroak commented 2 months ago

This is a great tool! Thank you for developing it. Could you please generate or provide similar T cell subset resources for mouse datasets? They would be very useful, as many labs, including ours, generate CITE-seq data from mouse models.

Also, has it been accepted by a journal yet? Looking forward to the final publication.

Originally posted by @nroak in https://github.com/donnafarberlab/MMoCHi/issues/2#issuecomment-2316398535

Daniel-Caron commented 2 months ago

Hi Ninad,

So glad to hear you're finding MMoCHi useful. The manuscript is still in the process of review/revision, but fingers crossed it'll be published soon!

I primarily work with human data, so I'm not as familiar with what markers are available in the mouse CITE-seq panels, but it should be easy to create a MMoCHi hierarchy for mouse T subsets.

I'm assuming you've either sorted for T cells prior to sequencing or had success delineating T cells from other immune cells via other annotation methods. If you haven't, you could certainly design a MMoCHi hierarchy starting from the total immune component or all cells isolated from a tissue; you just need to know what cell types to expect in your data. (For this, I highly recommend using unsupervised clustering on the CITE-seq data or cross-referencing with flow cytometry on similar samples.)

I don't have much murine CITE-seq data to play with, but here are a few ideas I got from discussing this with other members of my lab! You should alter this hierarchy as needed for your specific dataset.

Running MMoCHi always begins with initializing a Hierarchy object. You can also use this line to set default settings for how classification is performed across all levels of the hierarchy.

import mmochi as mmc
h = mmc.Hierarchy()

Which subsets you include in the hierarchy depends entirely on what cell types you expect to see in your sample. If you expect a mixture of conventional and unconventional T cells, you could attempt to segregate αβ T cells, γδ T cells, and NKT cells. Depending on which genes are highly expressed in your data, you could also use TCR transcripts for specific alpha or gamma chains as markers.

h.add_classification('TCR','All', ['TCRαβ','TCRγδ','NK1.1']) 
h.add_subset('NKT cell','TCR',dict(pos = ['NK1.1'], neg = ['TCRγδ','TCRαβ']))
h.add_subset('γδ T cell','TCR',dict(pos = ['TCRγδ'], neg = ['TCRαβ','NK1.1']))
h.add_subset('αβ T cell','TCR',dict(pos = ['TCRαβ'], neg = ['TCRγδ','NK1.1']))
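
To make the pos/neg logic of these definitions concrete, here is a minimal plain-Python sketch of how such rules pick out cells. It is an illustration only and not MMoCHi's implementation: the `matches` and `label` helpers and the per-cell dictionaries of thresholded marker calls are hypothetical, and the `'?'` state stands in for a marker that falls between thresholds.

```python
# Illustrative only: a toy evaluator for pos/neg rules.
# `matches`, `label`, and the cell dictionaries are hypothetical, not part of MMoCHi.

def matches(cell, pos=(), neg=()):
    """Return True if a cell's marker calls satisfy the rule.

    `cell` maps marker name -> 'pos', 'neg', or '?' (ambiguous).
    """
    if any(cell.get(m) != 'pos' for m in pos):
        return False
    if any(cell.get(m) != 'neg' for m in neg):
        return False
    return True

# Same subset definitions as in the hierarchy above.
rules = {
    'NKT cell':  dict(pos=['NK1.1'], neg=['TCRγδ', 'TCRαβ']),
    'γδ T cell': dict(pos=['TCRγδ'], neg=['TCRαβ', 'NK1.1']),
    'αβ T cell': dict(pos=['TCRαβ'], neg=['TCRγδ', 'NK1.1']),
}

def label(cell):
    """Return the single matching subset, or None if zero or several match."""
    found = [name for name, rule in rules.items() if matches(cell, **rule)]
    return found[0] if len(found) == 1 else None

clear_ab = {'TCRαβ': 'pos', 'TCRγδ': 'neg', 'NK1.1': 'neg'}
ambiguous = {'TCRαβ': 'pos', 'TCRγδ': '?', 'NK1.1': 'neg'}
print(label(clear_ab))   # αβ T cell
print(label(ambiguous))  # None
```

In this sketch, a cell with any ambiguous marker call matches no rule and is left unlabeled, which is why the choice of markers in `pos` and `neg` determines how many cells each subset can claim with confidence.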

Once you're satisfied you have conventional T cells, you may want to separate CD4+ and CD8+ T cells (if you have high rates of double-negative or double-positive T cells, you should also add those here as separate subsets). Depending on how well the CD4 and CD8 proteins stain, you may also want to use gene expression of these markers for high-confidence thresholding.

h.add_classification('CD4_CD8','αβ T cell', ['CD4','CD8a','CD8b']) 
h.add_subset('CD4+ T cell','CD4_CD8',dict(pos = ['CD4'], neg = ['CD8a','CD8b']))
h.add_subset('CD8+ T cell','CD4_CD8',dict(any_of = ['CD8a','CD8b'], neg = ['CD4']))
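
If double-negative (DN) or double-positive (DP) cells are common, the rules for this level might look like the dictionary below. The DN and DP entries are assumptions added for illustration, and `exclusive` and `overlapping` are hypothetical helpers (not MMoCHi functions) that flag pairs of rules a single cell could satisfy at once. The sketch assumes `any_of` means "at least one of these markers is positive".

```python
from itertools import combinations

# Hypothetical subset rules for the CD4_CD8 level. The DN and DP entries are
# assumptions for illustration; they are not part of the original example.
level = {
    'CD4+ T cell': dict(pos=['CD4'], neg=['CD8a', 'CD8b']),
    'CD8+ T cell': dict(any_of=['CD8a', 'CD8b'], neg=['CD4']),
    'DN T cell':   dict(neg=['CD4', 'CD8a', 'CD8b']),
    'DP T cell':   dict(pos=['CD4'], any_of=['CD8a', 'CD8b']),
}

def exclusive(a, b):
    """True if the two rules provably cannot both match the same cell."""
    for x, y in ((a, b), (b, a)):
        neg_y = set(y.get('neg', []))
        # A marker that x requires positive but y requires negative.
        if set(x.get('pos', [])) & neg_y:
            return True
        # Every marker in x's any_of group is required negative by y.
        any_of = set(x.get('any_of', []))
        if any_of and any_of <= neg_y:
            return True
    return False

def overlapping(rules):
    """List the pairs of subset names whose rules are not mutually exclusive."""
    return [(p, q) for p, q in combinations(rules, 2)
            if not exclusive(rules[p], rules[q])]

print(overlapping(level))  # []

# Dropping neg=['CD4'] from the CD8+ rule lets DP cells match it as well.
loose = dict(level)
loose['CD8+ T cell'] = dict(any_of=['CD8a', 'CD8b'])
print(overlapping(loose))  # [('CD8+ T cell', 'DP T cell')]
```

Once the rules look right, each entry could be passed to `h.add_subset` in the same way as the two subsets above.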

Next, you likely want to segregate naive T cells from antigen-experienced (Ag Exp) T cells. The example below is for CD4s; you can repeat the same thing for CD8s (just replace CD4 with CD8).

h.add_classification('CD4_ag_exp','CD4+ T cell',['CD11a','CD44','CD62L','CCR7','CD127'])  
h.add_subset('Naive CD4+ T cell','CD4_ag_exp',dict(any_of=['CD62L','CCR7','CD127'],neg=['CD11a','CD44'],n=2))
h.add_subset('Ag Exp CD4+ T cell','CD4_ag_exp',dict(pos=['CD11a','CD44']))
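
One way to repeat this level for both lineages without copy-pasting is a small helper. This is a sketch under assumptions: `add_ag_exp_level` is a hypothetical function that only calls `add_classification` and `add_subset` with the same arguments as above, and `DryRun` is a stand-in that records calls so the snippet runs without MMoCHi installed.

```python
def add_ag_exp_level(h, lineage):
    """Add a naive vs. antigen-experienced level under '<lineage>+ T cell'.

    `h` is any object exposing add_classification/add_subset as used above
    (for example the Hierarchy object). `lineage` is 'CD4' or 'CD8'.
    """
    level = f'{lineage}_ag_exp'
    h.add_classification(level, f'{lineage}+ T cell',
                         ['CD11a', 'CD44', 'CD62L', 'CCR7', 'CD127'])
    h.add_subset(f'Naive {lineage}+ T cell', level,
                 dict(any_of=['CD62L', 'CCR7', 'CD127'],
                      neg=['CD11a', 'CD44'], n=2))
    h.add_subset(f'Ag Exp {lineage}+ T cell', level,
                 dict(pos=['CD11a', 'CD44']))


class DryRun:
    """Hypothetical stand-in that records calls instead of building a hierarchy."""
    def __init__(self):
        self.calls = []

    def add_classification(self, *args):
        self.calls.append(('classification',) + args)

    def add_subset(self, *args):
        self.calls.append(('subset',) + args)


h = DryRun()  # swap in the real Hierarchy object to build the actual levels
for lineage in ('CD4', 'CD8'):
    add_ag_exp_level(h, lineage)

# Show what would be added, and under which parent.
for kind, name, parent, _ in h.calls:
    print(kind, name, '<-', parent)
```

If the CD4 level has already been added by hand as shown above, call the helper for 'CD8' only, so the same level is not defined twice.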

After that, it really depends on what you're interested in. If you have stimulated cells (maybe from an infection or disease model), you may be interested in selecting the various T helper subsets: Th1, Th2, Th17, Tregs, and Tfh. For these, I'd suggest exploring transcript expression of the respective transcription factors (Tbx21, Gata3, Rorc, Foxp3, and Bcl6), along with some surface markers of Tfh; I know that CXCR5 and PD-1 are both useful in humans. You could also segregate memory subsets from the antigen-experienced T cells, such as central memory (CD62L+ or CCR7+) or effector memory (in humans, CCL5 is quite effective at the transcript level).
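
As a first step before adding T helper subsets, it can help to check which of these transcription factor transcripts were actually measured. The sketch below spells out the subset-to-transcript pairing from the paragraph above. The `detectable_subsets` helper and the example gene list are hypothetical; in practice the gene names would come from your own dataset.

```python
# Subset -> transcription factor transcript, as paired in the paragraph above.
TH_MARKERS = {
    'Th1': 'Tbx21',
    'Th2': 'Gata3',
    'Th17': 'Rorc',
    'Treg': 'Foxp3',
    'Tfh': 'Bcl6',
}

def detectable_subsets(gene_names):
    """Split subsets by whether their TF transcript is among the measured genes."""
    genes = set(gene_names)
    found = {s: g for s, g in TH_MARKERS.items() if g in genes}
    missing = {s: g for s, g in TH_MARKERS.items() if g not in genes}
    return found, missing

# Hypothetical gene list for illustration only.
example_genes = ['Cd4', 'Tbx21', 'Foxp3', 'Gata3', 'Sell']
found, missing = detectable_subsets(example_genes)
print('usable:', found)
print('not measured:', missing)
```

A transcript being present in the gene list says nothing about whether it separates the subsets well in your data, so this is only a pre-check before looking at the expression itself.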

I hope that this is somewhat helpful! If you're ever unsure about markers for specific subsets, I highly recommend referring to flow cytometry studies in the field. I hope MMoCHi ends up working out for you—let me know if you run into any issues or have further questions after trying some things out!

If you end up having success after refining this hierarchy, it'd be great if you could submit it for us to add to the documentation for the rest of the community to use as well!

nroak commented 2 months ago

Daniel, thank you so much for such a thoughtful and comprehensive response. In fact, as we speak, I'm getting ready to present your preprint in our lab journal club. We have a couple of experts in T cell and immune cell sorting from mouse models, so my hope is to create a comprehensive hierarchy based on their FACS experience and our understanding of the clustering. Comments from you and your lab mates will be important guidance as we follow this process. I will keep you posted as we test MMoCHi (also a great name choice!) over the next month or two. Once we are confident in our hierarchy, we will be more than happy to submit it as documentation for everyone else to use. Cheers and good luck with the paper!