Teichlab / celltypist

A tool for semi-automatic cell type classification
https://www.celltypist.org/
MIT License

Using a model trained on scRNA-seq datasets to predict a spatial transcriptomic dataset #94

SNOL2 closed this issue 11 months ago

SNOL2 commented 11 months ago

Hi, thanks for developing and maintaining this wonderful tool! I was wondering if you could give some advice on predicting ST (e.g., MERFISH) datasets using a model trained on scRNA-seq datasets. Additionally, following the Squidpy tutorial (https://squidpy.readthedocs.io/en/stable/notebooks/tutorials/tutorial_vizgen.html), the MERFISH data was not normalized to 10000 counts per cell. Is it therefore necessary to normalize ST data to 10000 counts per cell for CellTypist prediction?

ChuanXu1 commented 11 months ago

@SNOL2, it's not advisable to predict ST data using CellTypist models, as each spot/voxel will be assigned only one cell type identity rather than a mixture of cell types.

However, if you really want to try this approach, you still need to normalize the expression matrix to 10000 counts per cell and apply log1p, as with scRNA-seq data.
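As a rough illustration of that preprocessing, here is a minimal sketch. `normalize_log1p` is a hypothetical toy helper that only spells out the arithmetic on a plain list of lists. `annotate_spatial` is likewise a hypothetical wrapper, and it assumes `adata` is an AnnData object holding raw counts, that `scanpy` and `celltypist` are installed, and that the named model (a placeholder here) is already downloaded and actually matches your tissue.

```python
import math


def normalize_log1p(counts, target_sum=10_000):
    """Toy version of the preprocessing: scale each cell (row) to
    `target_sum` total counts, then apply log(1 + x)."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Cells with zero counts stay all zeros to avoid dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in row])
    return normalized


def annotate_spatial(adata, model="Immune_All_Low.pkl"):
    """Apply the same preprocessing to an AnnData object with raw counts,
    then run CellTypist. The model name is a placeholder (assumption)."""
    # Imported lazily so the toy helper above works without these packages.
    import scanpy as sc
    import celltypist

    adata = adata.copy()  # leave the caller's object untouched
    sc.pp.normalize_total(adata, target_sum=1e4)  # 10000 counts per cell
    sc.pp.log1p(adata)
    return celltypist.annotate(adata, model=model, majority_voting=False)


if __name__ == "__main__":
    toy = [[1, 3, 0], [10, 0, 10]]
    for row in normalize_log1p(toy):
        # Undoing log1p and summing should recover the per-cell target.
        print(round(sum(math.expm1(v) for v in row)))
```

Undoing the log1p and summing per cell, as in the last lines, is a quick way to check that a matrix is in the format described above before predicting.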

Another thing to mention is that for ST data with only a limited panel of genes, it's better to retrain a CellTypist model using only these genes and use the derived model to predict the ST data (rather than training a model on all genes in the scRNA-seq data and then predicting the ST data).
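The panel-restricted retraining could look roughly like the sketch below. The function names, the `label_key` column, and the choice to subset raw counts to the shared genes before normalizing are all assumptions made for illustration, not something stated in this thread. Both `adata_ref` (scRNA-seq reference) and `adata_st` (ST data) are assumed to be AnnData objects holding raw counts.

```python
def shared_genes(reference_genes, panel_genes):
    """Genes present in both the scRNA-seq reference and the ST panel,
    kept in the order they appear in the reference."""
    panel = set(panel_genes)
    return [gene for gene in reference_genes if gene in panel]


def train_on_panel(adata_ref, adata_st, label_key="cell_type"):
    """Retrain CellTypist on the panel genes only, then predict the ST data.
    `label_key` is a hypothetical column in `adata_ref.obs`."""
    # Imported lazily so `shared_genes` works without these packages.
    import scanpy as sc
    import celltypist

    genes = shared_genes(adata_ref.var_names, adata_st.var_names)
    ref = adata_ref[:, genes].copy()
    st = adata_st[:, genes].copy()

    # Assumption: normalize after subsetting, so both matrices sum to
    # 10000 counts per cell over the same gene set.
    for adata in (ref, st):
        sc.pp.normalize_total(adata, target_sum=1e4)
        sc.pp.log1p(adata)

    model = celltypist.train(ref, labels=label_key, feature_selection=False)
    return celltypist.annotate(st, model=model, majority_voting=False)


if __name__ == "__main__":
    print(shared_genes(["CD3E", "CD19", "MS4A1", "LYZ"], ["LYZ", "CD19", "EPCAM"]))
```

If only a handful of genes overlap between the reference and the panel, it is worth inspecting the `shared_genes` output before training, since the derived model can only use those genes.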

SNOL2 commented 11 months ago

Thanks! I will give it a try.