Closed hyjforesight closed 1 year ago
@hyjforesight, CellTypist needs the all-cell-by-all-gene matrix in a log-normalised format in either .X or .raw.X. Since your .X is scaled data (with negative values), CellTypist falls back to .raw.X for prediction. By the way, it seems your .var_names are not gene symbols, as only 6 features overlap with the model.
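The two problems described above can be checked by hand before running a prediction. The following is a minimal plain-Python sketch: the helper names, the tolerance, and the toy values are assumptions made for illustration, and this is not CellTypist's internal check.

```python
import math

def looks_log1p_normalised(cell_values, target_sum=1e4, tol=1.0):
    """Heuristic: no negative values, and expm1 of the values sums to ~target_sum."""
    if min(cell_values) < 0:
        return False  # negative values suggest scaled or regressed data
    total = sum(math.expm1(v) for v in cell_values)
    return abs(total - target_sum) <= tol

def gene_overlap(var_names, model_genes):
    """Return the feature names shared between the data and the model."""
    return sorted(set(var_names) & set(model_genes))

# Toy cell: raw counts normalised to 10,000 per cell, then log1p-transformed
counts = [3, 0, 7, 10]
log_norm = [math.log1p(c / sum(counts) * 1e4) for c in counts]

# Toy scaled cell: centred values, so some of them are negative
mean = sum(log_norm) / len(log_norm)
scaled = [v - mean for v in log_norm]

print(looks_log1p_normalised(log_norm))
print(looks_log1p_normalised(scaled))
# Hypothetical Ensembl-style IDs do not match a gene list made of symbols
print(gene_overlap(["ENSG00000010610", "ENSG00000153563"], ["CD4", "CD8A"]))
```

On a real AnnData object the same idea would apply to a row of .X and to .var_names; a very small overlap usually means the feature names need to be converted to gene symbols first.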
It seems that using sc.pp.regress_out may cause an error:
import scanpy as sc
import celltypist

adataConcat = sc.read_h5ad("***")
sc.pp.normalize_total(adataConcat, target_sum=1e4)
sc.pp.log1p(adataConcat)
sc.pp.regress_out(adataConcat, ['total_counts', 'pct_counts_mt'])
predictions = celltypist.annotate(adataConcat, model='Healthy_COVID19_PBMC.pkl', majority_voting=True)
predictions.predicted_labels
adata = predictions.to_adata()
Error:
Invalid expression matrix in .X, expect log1p normalized expression to 10000 counts per cell; will try the .raw attribute
Traceback (most recent call last):
File "/lustre/home/acct-medzy/medzy-cai/.conda/envs/scRNA-cmf/lib/python3.9/site-packages/celltypist/classifier.py", line 307, in __init__
self.indata = self.adata.raw.X
AttributeError: 'NoneType' object has no attribute 'X'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "[...] .raw attribute in the input object. 'NoneType' object has no attribute 'X'
@cmf1997, you can skip the command sc.pp.regress_out(adataConcat, ['total_counts', 'pct_counts_mt']) for CellTypist prediction purposes, as it yields negative values.
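To see why regressed data contain negative values: regressing out a covariate amounts to fitting a linear model per gene and keeping the residuals, and least-squares residuals from a model with an intercept are centred on zero. Below is a minimal sketch in plain Python with made-up numbers; the function name is hypothetical and this is not Scanpy's actual implementation.

```python
def regress_out_one_gene(expr, covariate):
    """Fit expr ~ intercept + slope * covariate by least squares; return residuals."""
    n = len(expr)
    mean_x = sum(covariate) / n
    mean_y = sum(expr) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, expr))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, expr)]

# Hypothetical log-normalised expression of one gene in five cells (all >= 0)
expr = [0.0, 1.2, 2.5, 0.8, 3.1]
# Hypothetical per-cell covariate, e.g. total counts
total_counts = [1000, 2500, 4000, 1800, 5200]

residuals = regress_out_one_gene(expr, total_counts)
print(min(residuals) < 0)          # do the residuals include negative values?
print(abs(sum(residuals)) < 1e-9)  # do they sum to zero, up to rounding?
```

Because the residuals are centred on zero, the matrix no longer looks like log1p-normalised expression, which is what triggers the fallback to .raw.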
@ChuanXu1 Exactly. Do you recommend running regress_out again after cell type prediction? For now I skip the regression and use bbknn to integrate the multiple datasets.
@cmf1997, CellTypist relies on log-normalised expression (to 10,000 counts per cell). If you regress out some covariates, the data will no longer be in that format. After CellTypist prediction using the log-normalised data, you will get additional prediction-related columns in the .obs of the AnnData. Then you can do whatever you want for the downstream analyses, such as regressing out batches, highly variable gene selection, etc.
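The order of operations described here can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the input is assumed to hold raw counts in .X with 'total_counts' and 'pct_counts_mt' already present in .obs, and the model name is the one used earlier in this thread.

```python
def annotate_then_regress(adata, model="Healthy_COVID19_PBMC.pkl"):
    """Run CellTypist on log-normalised data first, then regress out covariates."""
    # Imported inside the function so it can be defined without the packages installed
    import scanpy as sc
    import celltypist

    # Bring raw counts to the format CellTypist expects
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)

    # Predict while .X is still log-normalised; results are added to .obs
    predictions = celltypist.annotate(adata, model=model, majority_voting=True)
    adata = predictions.to_adata()

    # Steps that change .X (here regress_out) come only after prediction
    sc.pp.regress_out(adata, ["total_counts", "pct_counts_mt"])
    return adata
```

The prediction columns stay in .obs regardless of what is done to .X afterwards, so later scaling or integration does not affect the labels.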
Hello Chuan, Thanks for the response. So the best input data for CellTypist is the log-transformed raw matrix before scaling? So I only need to load the 10X matrix, do some simple quality control, remove the unwanted cells, then run only sc.pp.normalize_total(adata, target_sum=1e4) from Scanpy, and then run CellTypist. Is that right?
Thank you! Best, Yuanjian
Yes. sc.pp.normalize_total(adata, target_sum=1e4) -> sc.pp.log1p(adata) -> CellTypist run
Thank you, @ChuanXu1. I'll close this issue.
Same issue. I've fixed it by converting adata.X from np.float32 to np.float64. It seems CellTypist doesn't accept float32?
@woloorn, I don't think float32 will cause any problem for CellTypist. Does this also happen with the newest version of CellTypist?
Hi Celltypist, Thanks for developing this amazing package!
First, I processed the data with Scanpy and Harmony, including log-normalization.
Then I loaded the saved h5ad file and ran CellTypist, but I got the error: Invalid expression matrix in .X, expect log1p normalized expression to 10000 counts per cell; will try the .raw attribute. The data has been log-normalized by Scanpy, right? How does this error happen?
Thanks in advance for all the kind help. Best, Yuanjian