Teichlab / celltypist

A tool for semi-automatic cell type classification
https://www.celltypist.org/
MIT License

Potentially misleading error when counts are not in expected range #89

Closed (padix-key closed 8 months ago)

padix-key commented 9 months ago

If the count matrix in adata.X is not log1p normalized to 10000 counts per cell, CellTypist falls back to adata.raw.X instead of adata.X.

In my case adata.raw is not populated at all, so instead of pointing to the unexpected counts, CellTypist raises

Exception: 🛑 Fail to use the `.raw` attribute in the input object. 'NoneType' object has no attribute 'X'

I would suggest adding an optional parameter, e.g. enforce_x or allow_raw_counts, that converts the log message

"👀 Invalid expression matrix in `.X`, expect log1p normalized expression to 10000 counts per cell; will try the `.raw` attribute"

into an exception.
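
A minimal sketch of how such a flag could behave, written against plain Python lists rather than CellTypist's internals. The function names, the `enforce_x` parameter, and the normalization check (each cell summing to about 10000 after `expm1`) are assumptions for illustration, not the library's API.

```python
import logging
import math

logger = logging.getLogger("celltypist_sketch")

TARGET_SUM = 10_000  # counts per cell expected after normalization


def looks_log1p_normalized(matrix, tol=1.0):
    """Return True if every row maps back to ~TARGET_SUM counts.

    `matrix` is a list of rows (cells) holding log1p-transformed values.
    This test is an illustrative assumption, not CellTypist's own check.
    """
    for row in matrix:
        if any(value < 0 for value in row):
            return False
        total = sum(math.expm1(value) for value in row)
        if abs(total - TARGET_SUM) > tol:
            return False
    return True


def select_matrix(x, raw_x=None, enforce_x=False):
    """Pick the matrix to classify; `enforce_x` is the hypothetical flag."""
    if looks_log1p_normalized(x):
        return x
    message = (
        "Invalid expression matrix in `.X`, expect log1p normalized "
        "expression to 10000 counts per cell"
    )
    if enforce_x:
        # Strict mode: the warning becomes an exception, with no fallback.
        raise ValueError(message)
    logger.warning("%s; will try the `.raw` attribute", message)
    return raw_x  # validation of the fallback is omitted in this sketch
```

With `enforce_x=True` the caller learns immediately that `.X` is the problem; with the default, the behavior described above (warn, then fall back) is unchanged.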

ChuanXu1 commented 9 months ago

@padix-key, CellTypist first tries adata.X and, if it is in the expected format, predicts cell types from it. If adata.X is not suitable, CellTypist falls back to adata.raw.X and checks whether that works. If neither works, an error is raised. Please let me know whether this is clear to you.

padix-key commented 8 months ago

Sorry for the late response. After inspecting the source code I found this behavior too, and I think it is generally fine. However, in my opinion the error message does not convey the actual problem: if I have an AnnData with X but without raw, the problem is that X is not properly normalized, not that AnnData.raw is None.
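
To illustrate the point, here is a rough sketch of a fallback that names the real cause when `.raw` is missing. It uses `SimpleNamespace` as a stand-in for an AnnData object; `pick_matrix`, `is_valid`, and the wording of the messages are hypothetical and not taken from CellTypist.

```python
import math
from types import SimpleNamespace


def is_valid(matrix, target=10_000, tol=1.0):
    # Illustrative check only: each cell must sum to ~target after expm1.
    return all(
        abs(sum(math.expm1(v) for v in row) - target) <= tol for row in matrix
    )


def pick_matrix(adata):
    """Return `.X` if valid, else `.raw.X`; report the real cause on failure."""
    if is_valid(adata.X):
        return adata.X
    if adata.raw is None:
        # No fallback exists, so the actionable problem is `.X` itself.
        raise ValueError(
            "Invalid expression matrix in `.X`, expect log1p normalized "
            "expression to 10000 counts per cell; `.raw` is not set, so "
            "there is no fallback to try"
        )
    if is_valid(adata.raw.X):
        return adata.raw.X
    raise ValueError(
        "Neither `.X` nor `.raw.X` holds log1p normalized expression "
        "to 10000 counts per cell"
    )


# Stand-in for an AnnData with unnormalized values in X and no raw.
adata = SimpleNamespace(X=[[3.0, 7.0]], raw=None)
try:
    pick_matrix(adata)
except ValueError as err:
    error_text = str(err)
```

Here the message mentions `.X` first and only notes the missing `.raw` as the reason no fallback was possible, instead of surfacing an attribute error on `None`.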

ChuanXu1 commented 8 months ago

@padix-key, error messages have been redesigned in commit 42c2c1c8ba63a48d432087549891765ebac7c818. These new error messages will be available in the next version. I will close the issue; please reopen it if you have further questions or suggestions.