Teichlab / celltypist

A tool for semi-automatic cell type classification
https://www.celltypist.org/
MIT License

Only 9 features were used for prediction #55

Closed B-WingBreaker closed 1 year ago

B-WingBreaker commented 1 year ago

Hello. Thank you for developing this wonderful tool.

I am trying to apply CellTypist to my dataset composed of mouse heart cells. Besides the immune cells, I have preliminarily annotated the endothelial cells and fibroblast cells in this dataset too.

I am not familiar with Scanpy, so I converted the Seurat object to an AnnData file following the SeuratDisk instructions: https://mojaveazure.github.io/seurat-disk/articles/convert-anndata.html

The raw count matrix was scaled to 10,000 and log-normalized with Seurat's NormalizeData function:

cre <- NormalizeData(cre, normalization.method = "LogNormalize", scale.factor = 10000)

However, when I ran the prediction on this converted AnnData file, almost all of the predicted_labels were "Double-positive thymocytes", which is not plausible for this dataset.

Then I checked for the reason and I found that only 9 features were used for prediction.

Here is what CellTypist output.

# Predict the identity of each input cell.
cre_predictions = celltypist.annotate(cre, model = 'Immune_All_Low.pkl', majority_voting = True)

🔬 Input data has 9267 cells and 20011 genes
🔗 Matching reference genes in the model
🧬 9 features used for prediction
⚖️ Scaling input data
🖋️ Predicting labels
✅ Prediction done!
👀 Can not detect a neighborhood graph, will construct one before the over-clustering

No error was reported.

Was this caused by the conversion from Seurat to AnnData? Would it be better to use Scanpy from the beginning?

ChuanXu1 commented 1 year ago

@B-WingBreaker, the immune model is from human, so very few of its genes overlap with your mouse data.

You can convert the human model to a mouse one, and then apply CellTypist using the new model.

model = celltypist.Model.load('Immune_All_Low.pkl')
model.convert()
model.write('transformed_mouse_immune_model.pkl')
celltypist.annotate(your_adata, model = 'transformed_mouse_immune_model.pkl', majority_voting = True)

The result should be interpreted with caution due to inter-species differences.
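
To see why so few features can match, here is a minimal sketch in plain Python. It assumes genes are matched by exact symbol and uses toy, hypothetical gene lists (not taken from any real model or dataset); the helper name matched_features is also made up for illustration. Human symbols are conventionally upper case (CD3E) while mouse symbols are capitalized (Cd3e), so an exact match finds almost nothing.

```python
def matched_features(model_genes, input_genes):
    """Return the model genes that also appear, by exact symbol, in the input."""
    input_set = set(input_genes)
    return [gene for gene in model_genes if gene in input_set]


# Toy, hypothetical gene lists used only for illustration.
human_model_genes = ["CD3E", "CD4", "CD8A", "PTPRC", "MS4A1"]
mouse_input_genes = ["Cd3e", "Cd4", "Cd8a", "Ptprc", "Ms4a1", "Pecam1"]

# Exact matching: human and mouse symbols differ in letter case.
exact = matched_features(human_model_genes, mouse_input_genes)
print(f"exact match: {len(exact)} of {len(human_model_genes)} model genes")

# Ignoring case shows that the mismatch here comes from naming conventions.
upper_input = {gene.upper() for gene in mouse_input_genes}
case_insensitive = [gene for gene in human_model_genes if gene in upper_input]
print(f"case-insensitive: {len(case_insensitive)} of {len(human_model_genes)} model genes")
```

Upper-casing is only a diagnostic here, not a real ortholog mapping, so the model conversion described in this comment is still the way to go for actual annotation.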

B-WingBreaker commented 1 year ago

@ChuanXu1 This solved my problem well. Most of the predictions seemed to match my preliminary annotation. Thank you very much, ChuanXu1!

Ahmedalaraby20 commented 1 year ago

I ran model.convert() and got: AttributeError: type object 'Model' has no attribute 'convert'. Did something change?

ChuanXu1 commented 1 year ago

@Ahmedalaraby20, can you confirm your version with celltypist.__version__?

Ahmedalaraby20 commented 1 year ago

It's '0.1.9'.

import celltypist
from celltypist import models
model = celltypist.Model.load('Immune_All_Low.pkl')
model.convert()
model.write('transformed_mouse_immune_model.pkl')
celltypist.annotate(your_adata, model = 'transformed_mouse_immune_model.pkl', majority_voting = True)

Am I doing it wrong?

ChuanXu1 commented 1 year ago

@Ahmedalaraby20, this version is a bit old. Please try upgrading by uninstalling it and then installing a newer version.
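
A quick way to tell whether an installed version is recent enough is to check for the method directly. This is only a sketch: the helper class_has_method is hypothetical (not part of CellTypist), and it assumes the Model class lives in the celltypist.models module. It returns False if the package cannot be imported at all.

```python
import importlib


def class_has_method(module_name, class_name, method_name):
    """Return True if module.class exposes a callable attribute with that name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # package is not installed or could not be imported
    cls = getattr(module, class_name, None)
    return callable(getattr(cls, method_name, None))


# Assumption: Model is defined in celltypist.models.
if class_has_method("celltypist.models", "Model", "convert"):
    print("Model.convert is available")
else:
    print("Model.convert is missing; consider upgrading celltypist")
```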

Jammm11 commented 10 months ago

Could you tell me the corresponding Linux command? I have not used Python. Thank you very much!

ChuanXu1 commented 10 months ago

Cross-species model conversion is not available from the Linux command line; unfortunately, you have to use the Python code.