carmonalab / ProjecTILs

Interpretation of cell states using reference single-cell maps
GNU General Public License v3.0

Mouse Tcells classification not accurate #89

Open NoemieL opened 3 months ago

NoemieL commented 3 months ago

Hello,

We are using ProjecTILs on human scRNA-seq data and it works very well. However, when we tried it on multiplexed mouse scRNA-seq data from in vitro T cells, the classification was not accurate. We ran Run.ProjecTILs several times with different parameters but could not classify our T cells correctly. As you can see in the attached images, the cells assigned to a given reference state show different marker gene expression from the reference cells of that state. In particular, cells that do not express the Treg marker Foxp3 are labeled as Treg. Could you please take a look at our results and tell us how we can improve our T cell classification?

Here is some information:

ref <- load.reference.map(ref="/path/to/refTILAtlasmouse_v1.rds")

hash.ID contains the sample ID based on the hashtag from the multiplexed library.

1-Default parameters:

test.obj = Run.ProjecTILs(query=test.obj, ref, reduction="umap", split.by="hash.ID")
plot.states.radar(ref, query=test.obj, genes4radar=c("Mki67", "Foxp3", ......, "Tox"))

[image: radar plot]

2-Filter.cell=FALSE:

test.obj = Run.ProjecTILs(query=test.obj, ref, reduction="umap", split.by="hash.ID", filter.cell=FALSE)
plot.states.radar(ref, query=test.obj, genes4radar=c("Mki67", "Foxp3", ......, "Tox"))

[image: radar plot]

3-Decreasing min confidence:

test.obj = Run.ProjecTILs(query=test.obj, ref, reduction="umap", split.by="hash.ID", min.confidence=0.1)
plot.states.radar(ref, query=test.obj, genes4radar=c("Mki67", "Foxp3", ......, "Tox"))

[image: radar plot]

4-Increasing min confidence:

test.obj = Run.ProjecTILs(query=test.obj, ref, reduction="umap", split.by="hash.ID", min.confidence=0.3)
plot.states.radar(ref, query=test.obj, genes4radar=c("Mki67", "Foxp3", ......, "Tox"))

[image: radar plot]

5-Increasing the number of neighbors:

test.obj = Run.ProjecTILs(query=test.obj, ref, reduction="umap", split.by="hash.ID", k=10)
plot.states.radar(ref, query=test.obj, genes4radar=c("Mki67", "Foxp3", ......, "Tox"))

[image: radar plot]

6-Using PCA reduction (error when performing the radar plots):

test.obj = Run.ProjecTILs(query=test.obj, ref, reduction="umap", split.by="hash.ID", k=10)
plot.states.radar(ref, query=test.obj, genes4radar=c("Mki67", "Foxp3", ......, "Tox"))

[image]

Similar results were obtained when the data were not split by sample. Thank you very much in advance for your help.

Noemie & Kat

mass-a commented 3 months ago

Hello Noemie, thanks for the question and for using our tools :)

My guess is that your dataset contains cell states that are not present in the reference. You mention that these are in vitro-cultured T cells? It has been shown (e.g. Corria-Osorio et al. (2023)) that T cells adopt non-canonical states when cultured in vitro. From the radar plots it seems that most cells are Gzmb+, indicating effector cytotoxic function. You also seem to have cytotoxic CD4+ T cells, a state that is not present in the reference (but that we should include!). That is a possible reason why they end up wrongly classified as Treg: the algorithm does not know where to place them.
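To put a number on the mismatch described above, one option is to compute, for each predicted label, the fraction of cells with detectable expression of a marker such as Foxp3. The snippet below is only a minimal sketch in base R on made-up values: the helper `marker_positive_fraction` and the toy data frame are hypothetical, and it assumes the predicted labels are stored in a metadata column named `functional.cluster`. With a Seurat object, a comparable data frame could presumably be pulled with `FetchData(test.obj, vars = c("Foxp3", "functional.cluster"))`.

```r
# Fraction of cells with detectable expression (> 0) of a marker, per predicted label.
# Hypothetical helper for illustration; not part of ProjecTILs.
marker_positive_fraction <- function(expr, labels) {
  tapply(expr > 0, labels, mean)
}

# Toy data (made up): normalized Foxp3 expression and predicted labels for 8 cells.
cells <- data.frame(
  Foxp3 = c(0, 0, 2.1, 0, 0, 1.3, 0, 0),
  functional.cluster = c("Treg", "Treg", "Treg", "Treg",
                         "CD8_Tex", "Treg", "CD8_Tex", "CD8_Tex")
)

frac <- marker_positive_fraction(cells$Foxp3, cells$functional.cluster)
print(round(frac, 2))
```

A low Foxp3-positive fraction among cells labeled Treg would support the impression from the radar plots that these cells are being forced into the nearest available state.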

Have you tried using the human references to analyse these data? In this case ProjecTILs relies on orthologs, but at least the maps should be more complete (including CD4+ cytotoxic states).
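As a rough illustration of what relying on orthologs means in practice, the sketch below renames the rows of a mouse gene-by-cell matrix to human symbols using a lookup table and drops genes without a match. This is an assumption-laden toy example, not the ProjecTILs implementation: the table is a small hand-written subset, `to_human_symbols` is a hypothetical helper, and "Gm12345" is a placeholder for a gene with no ortholog in the table.

```r
# Toy mouse-to-human ortholog table (illustrative subset, not the table ProjecTILs uses).
orthologs <- c(Foxp3 = "FOXP3", Gzmb = "GZMB", Mki67 = "MKI67",
               Tox = "TOX", Pdcd1 = "PDCD1")

# Rename rows of a gene-by-cell matrix to human symbols;
# genes without an entry in the table are dropped.
to_human_symbols <- function(mat, table) {
  keep <- rownames(mat) %in% names(table)
  out <- mat[keep, , drop = FALSE]
  rownames(out) <- unname(table[rownames(out)])
  out
}

# Hypothetical 3-gene x 2-cell count matrix (values filled column by column).
counts <- matrix(c(0, 5, 2, 3, 0, 1), nrow = 3,
                 dimnames = list(c("Foxp3", "Gzmb", "Gm12345"),
                                 c("cell1", "cell2")))

human_counts <- to_human_symbols(counts, orthologs)
print(rownames(human_counts))
```

Genes that have no ortholog are lost in this step, which is one reason a cross-species projection can be less precise than a same-species one.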

NoemieL commented 3 months ago

Hello Massimo,

Thank you for your feedback. Yes, these are in vitro cultured mouse CAR-T cells, collected before injection into mice. As you suggested, we used the human reference. The result is better, but I still have some concerns about the classification and would like your feedback on it (radar plots attached). As you mentioned, most of the cells are cytotoxic rather than exhausted (TOX- and PD1-), but some of them are classified as TPEX, TEX and Treg (FOXP3-).

[image: radar plots]

mass-a commented 3 months ago

I agree that there is poor agreement between the expression profiles and those of the reference. But as we discussed, your data is likely composed of in vitro cell states that are not present in the reference maps. This is an intrinsic limitation of reference-based approaches: they can only accurately predict cell states that are represented in the reference. In this case, I think I would favor an unsupervised analysis (e.g. clustering, differential expression etc.) to fully characterize the diversity of your data. Do you agree?
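For reference, the idea behind an unsupervised analysis can be sketched in a few lines of base R: reduce dimensionality, cluster, then compare expression between clusters. Everything in this example is simulated and hypothetical (the two "states", the gene names, the choice of hierarchical clustering with k = 2). On real data one would more likely use a Seurat-style workflow (e.g. `FindNeighbors`, `FindClusters`, `FindAllMarkers`); this sketch only shows the logic.

```r
set.seed(1)

# Simulate log-expression for 40 cells x 6 genes from two hypothetical states:
# state A is high for gene1-3, state B is high for gene4-6.
n <- 20
state_a <- matrix(rnorm(n * 6, mean = rep(c(3, 3, 3, 0, 0, 0), each = n), sd = 0.3),
                  nrow = n)
state_b <- matrix(rnorm(n * 6, mean = rep(c(0, 0, 0, 3, 3, 3), each = n), sd = 0.3),
                  nrow = n)
expr <- rbind(state_a, state_b)
colnames(expr) <- paste0("gene", 1:6)

# Dimensionality reduction with PCA, then hierarchical clustering on the first two PCs.
pca <- prcomp(expr, center = TRUE, scale. = FALSE)
hc <- hclust(dist(pca$x[, 1:2]), method = "ward.D2")
clusters <- cutree(hc, k = 2)

# Mean expression per cluster, as a crude stand-in for differential expression.
cluster_means <- aggregate(as.data.frame(expr),
                           by = list(cluster = clusters), FUN = mean)

print(table(clusters))
print(cluster_means)
```

The clusters found this way are defined by the data itself, so states absent from any reference map can still show up and be characterized by their marker genes.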