Teichlab / celltypist

A tool for semi-automatic cell type classification
https://www.celltypist.org/
MIT License

SCT Object Compatibility with celltypist #132

Open katkatrach opened 2 months ago

katkatrach commented 2 months ago

Hello,

I am currently following a celltypist tutorial that runs it on an .h5ad object. My original object is a Seurat object, so I converted it first to h5Seurat and then to h5ad. I loaded the h5ad file with scanpy.

Here are some attributes that print when I look at the object:

object.shape
(139380, 7630)

object.X
[[-0.18042335, 2.62392346, -0.24814961, ... etc

object.raw.X
<Compressed Sparse Row sparse matrix of dtype 'float64' with 177847606 stored elements and shape (139380, 31227)>

object.var_names
Index(['LINC01409', 'SAMD11', 'HES4', 'ISG15', etc

I am trying to annotate with the following:

predictions = celltypist.annotate(object, model = 'Immune_All_Low.pkl', majority_voting = True, mode = 'best match')

I have also tried transposing the input. I get the same error each time. First, it reports that it will use raw.X:

👀 Invalid expression matrix in '.X', expect log1p normalized expression to 10000 counts per cell; will use '.raw.X' instead
⚠️ Warning: invalid expression matrix, expect ALL genes and log1p normalized expression to 10000 counts per cell. The prediction result may not be accurate

I use Seurat SCTransform, which does both normalization and scaling.

Then it raises the following:

ValueError: 🛑 No features overlap with the model. Please provide gene symbols

Is there anything I should do to make sure the object is in the correct format? I am stuck on what to do with my object, and would prefer not to re-scale my matrix.

Thank you!

katkatrach commented 2 months ago

I fixed the issue with scanpy normalization and scaling on my object!
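
For readers who hit the same warning, here is a minimal sketch of the normalization the messages above ask for: scale each cell to 10000 total counts, then apply log1p. It uses plain NumPy on a made-up toy count matrix; the function name normalize_log1p and the toy values are illustrative assumptions, not part of CellTypist or of the original poster's fix. With an AnnData object holding raw counts in .X, the usual scanpy calls for the same two steps are sc.pp.normalize_total(adata, target_sum=1e4) followed by sc.pp.log1p(adata).

```python
import numpy as np


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply log1p.

    Hypothetical helper for illustration only; it is not a CellTypist function.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero cells as zeros instead of dividing by 0
    return np.log1p(counts / totals * target_sum)


# Toy raw counts: 3 cells x 4 genes (made-up values, not from the issue).
raw_counts = np.array([
    [1, 0, 3, 6],
    [0, 2, 2, 0],
    [5, 5, 0, 10],
])

normalized = normalize_log1p(raw_counts)

# Undoing the log1p should give back rows that each sum to 10000.
row_totals = np.expm1(normalized).sum(axis=1)
```

Note that this starts from raw counts. Values that have already been centered and scaled (or SCT residuals) cannot be turned into this format by applying log1p on top.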

ChuanXu1 commented 2 months ago

@katkatrach, glad you resolved it :) CellTypist needs log1p-normalized expression (normalized to 10000 counts per cell) as input, which is incompatible with SCT because SCT relies on Pearson residuals for normalization.
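
As a rough illustration of why residual-style values are rejected, the sketch below tests whether a matrix looks like log1p-normalized data at 10000 counts per cell: no negative entries, and each row of expm1(X) summing to about 10000. This is an assumption about what such a check could look like, not CellTypist's actual internal code; the function name looks_log1p_10k and the tolerance are hypothetical.

```python
import numpy as np


def looks_log1p_10k(X, target_sum=1e4, tol=1.0):
    """Return True if X resembles log1p-normalized expression at `target_sum` per cell.

    Hypothetical check for illustration; not the check CellTypist itself performs.
    """
    X = np.asarray(X, dtype=float)
    if X.min() < 0:
        # Residuals and z-scored values go negative; log1p of counts never does.
        return False
    row_sums = np.expm1(X).sum(axis=1)
    return bool(np.all(np.abs(row_sums - target_sum) <= tol))


# Toy data (made-up values): correctly normalized counts vs. residual-like values.
counts = np.array([[1, 0, 3, 6], [0, 2, 2, 0]], dtype=float)
log_normalized = np.log1p(counts / counts.sum(axis=1, keepdims=True) * 1e4)
residual_like = np.array([[-0.18, 2.62, -0.25, 0.40]])

ok_normalized = looks_log1p_10k(log_normalized)
ok_residuals = looks_log1p_10k(residual_like)
```

Running a check like this on both .X and .raw.X before calling celltypist.annotate can show which matrix, if either, is in the expected format.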