Teichlab / celltypist

A tool for semi-automatic cell type classification
https://www.celltypist.org/
MIT License

TypeError: np.matrix is not supported. Please convert to a numpy array with np.asarray? #50

Closed bbimber closed 1 year ago

bbimber commented 1 year ago

Hello,

We've been running CellTypist regularly without issues, but recently saw the error below. I don't know yet whether this is a quirk in the input data, but I thought I'd report it. We are running a basic celltypist command with a built-in model. The input is an AnnData file created by writing an R SeuratObject to disk.

The stack trace makes me wonder whether some dependency, such as sklearn, was updated and changed its input validation, but I haven't debugged it yet.

Have you seen anything like this before? Thanks in advance.

09 Dec 2022 08:15:11,830 DEBUG:     Traceback (most recent call last):
09 Dec 2022 08:15:11,834 DEBUG:       File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
09 Dec 2022 08:15:11,838 DEBUG:         return _run_code(code, main_globals, None,
09 Dec 2022 08:15:11,843 DEBUG:       File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
09 Dec 2022 08:15:11,848 DEBUG:         exec(code, run_globals)
09 Dec 2022 08:15:11,852 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/celltypist/command_line.py", line 129, in <module>
09 Dec 2022 08:15:11,857 DEBUG:         main()
09 Dec 2022 08:15:11,862 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1130, in __call__
09 Dec 2022 08:15:11,866 DEBUG:         return self.main(*args, **kwargs)
09 Dec 2022 08:15:11,872 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1055, in main
09 Dec 2022 08:15:11,877 DEBUG:         rv = self.invoke(ctx)
09 Dec 2022 08:15:11,881 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
09 Dec 2022 08:15:11,885 DEBUG:         return ctx.invoke(self.callback, **ctx.params)
09 Dec 2022 08:15:11,889 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
09 Dec 2022 08:15:11,895 DEBUG:         return __callback(*args, **kwargs)
09 Dec 2022 08:15:11,900 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/celltypist/command_line.py", line 109, in main
09 Dec 2022 08:15:11,904 DEBUG:         result = annotate(
09 Dec 2022 08:15:11,910 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/celltypist/annotate.py", line 81, in annotate
09 Dec 2022 08:15:11,915 DEBUG:         predictions = clf.celltype(mode = mode, p_thres = p_thres)
09 Dec 2022 08:15:11,919 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/celltypist/classifier.py", line 376, in celltype
09 Dec 2022 08:15:11,924 DEBUG:         decision_mat, prob_mat, lab = self.model.predict_labels_and_prob(self.indata, mode = mode, p_thres = p_thres)
09 Dec 2022 08:15:11,937 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/celltypist/models.py", line 145, in predict_labels_and_prob
09 Dec 2022 08:15:11,945 DEBUG:         scores = self.classifier.decision_function(indata)
09 Dec 2022 08:15:11,951 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/sklearn/linear_model/_base.py", line 401, in decision_function
09 Dec 2022 08:15:11,957 DEBUG:         X = self._validate_data(X, accept_sparse="csr", reset=False)
09 Dec 2022 08:15:11,965 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/sklearn/base.py", line 535, in _validate_data
09 Dec 2022 08:15:11,971 DEBUG:         X = check_array(X, input_name="X", **check_params)
09 Dec 2022 08:15:11,977 DEBUG:       File "/usr/local/lib/python3.8/dist-packages/sklearn/utils/validation.py", line 737, in check_array
09 Dec 2022 08:15:11,983 DEBUG:         raise TypeError(
09 Dec 2022 08:15:11,993 DEBUG:     TypeError: np.matrix is not supported. Please convert to a numpy array with np.asarray. For more information see: https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
09 Dec 2022 08:15:12,094 DEBUG:     Quitting from lines 182-193 (16-2-GEX.df.appendHashing.frc.cite.norm.pca.dr.RunCelltypist.rmd)

bbimber commented 1 year ago

Update: we have just started hitting this on many datasets. I would guess that a package update caused it, but I'm not sure yet.

bbimber commented 1 year ago

I bet it's related to this: https://github.com/scikit-learn/scikit-learn/pull/20165

and Scikit-learn 1.2.0 changing from a warning to error: https://github.com/scikit-learn/scikit-learn/releases/tag/1.2.0
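
As a rough way to check whether an environment is affected, the sketch below compares a scikit-learn version string against 1.2. It assumes, as suggested above, that 1.2.0 is the release where the `np.matrix` warning became an error. The helper name `rejects_np_matrix` is hypothetical and is not part of CellTypist or scikit-learn.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def rejects_np_matrix(sklearn_version: str) -> bool:
    """Return True if this scikit-learn version is 1.2 or newer.

    Assumption (taken from this thread): from 1.2.0 on, passing an
    np.matrix raises TypeError instead of emitting a warning.
    """
    match = re.match(r"(\d+)\.(\d+)", sklearn_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {sklearn_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 2)


# Look up the installed version, if scikit-learn is present at all.
try:
    installed = version("scikit-learn")
except PackageNotFoundError:
    installed = None

if installed is not None:
    print(installed, "rejects np.matrix:", rejects_np_matrix(installed))
```

Only the major and minor components are compared, so pre-release strings that start with `1.2` are treated as affected.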

ChuanXu1 commented 1 year ago

@bbimber, thank you for this. Yes, it seems np.matrix is no longer supported in the new sklearn. I will fix this ASAP by converting all internal matrices to numpy arrays during data processing in CellTypist.
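
For readers who need a workaround in the meantime, the conversion described above can be sketched roughly as follows. This is only an illustration of the `np.asarray` call that the error message recommends, not the actual CellTypist fix; the helper name `to_ndarray` is hypothetical.

```python
import numpy as np


def to_ndarray(x):
    """Return a plain ndarray if x is an np.matrix; otherwise return x unchanged."""
    if isinstance(x, np.matrix):
        # np.asarray returns the base ndarray class, dropping the matrix subclass.
        return np.asarray(x)
    return x


# A small np.matrix standing in for the real expression data.
m = np.asmatrix([[1.0, 2.0], [3.0, 4.0]])
a = to_ndarray(m)

print(type(m).__name__, "->", type(a).__name__)
```

Inputs that are not `np.matrix` (for example, a sparse matrix) pass through untouched, so the helper only changes the case that triggers the error.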

bbimber commented 1 year ago

ok, thanks

bbimber commented 1 year ago

@ChuanXu1: sorry to bug you on this, but I hope the fix is a minor change. Do you think you'll be able to push another release? If not, we will update our Docker environments to force earlier versions of scikit-learn. Thanks.

ChuanXu1 commented 1 year ago

@bbimber, a new release and a new version of the Python package are now out. c39f7533006af692913b3dbd92898fe90283e0dc 778103ca9260c1cb25079c78f9c9ae61e8b8a92a

bbimber commented 1 year ago

thanks!