compomics / DeepLC

DeepLC: Retention time prediction for (modified) peptides using Deep Learning.
https://iomics.ugent.be/deeplc
Apache License 2.0

Cholesky AttributeError from DeepLC fix by down versioning to scikit-learn==1.3.2, scipy==1.11.4 #74

Closed KarlClauser closed 1 month ago

KarlClauser commented 1 month ago

Hi Robbin,

I think you should take a look at tightening the version requirements on some of DeepLC's Python dependencies, specifically pygam, scipy, and scikit-learn. Here's what I experienced:

On July 1, 2024, my Python script, which calls the DeepLC function dlc.calibrate_preds, raised the error below. This happened on a Windows 10 computer with a new Python installation: I first installed a minimal Python (Miniconda3 py311_23.10.0-1) and then installed DeepLC with 'pip install DeepLC'.

This installed: deeplc==2.2.36, deeplcretrainer==0.2.12, pygam==0.9.0

After reading the comments in the dependency file that triggers the error, Lib\site-packages\pygam\utils.py, I could see that the error is only triggered when there has been trouble installing dependencies, a situation the pygam developer had foreseen:

import scipy as sp
from scipy import sparse
import numpy as np
from numpy.linalg import LinAlgError
try:
  from sksparse.cholmod import cholesky as spcholesky
  from sksparse.test_cholmod import CholmodNotPositiveDefiniteError
  SKSPIMPORT = True
except ImportError:
  SKSPIMPORT = False
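
To see which branch of an import guard like the one above a given environment takes, one can attempt the import directly. This is a minimal sketch, not part of pygam or DeepLC; the helper name `optional_import_available` is hypothetical, and the module name is taken from the snippet above.

```python
import importlib


def optional_import_available(module_name):
    """Return True if module_name imports cleanly, False if the import fails."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a package that fails while loading.
        return False
    return True


# False here would correspond to the SKSPIMPORT = False branch of the guard.
print("sksparse.cholmod importable:", optional_import_available("sksparse.cholmod"))
```

If this prints False, the environment is on the fallback path where the traceback below originates.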

Because I had another working system set up in January, I could compare package versions, and I was able to eliminate the error by downgrading two packages: from scikit-learn==1.5.0 and scipy==1.14.0 to scikit-learn==1.3.2 and scipy==1.11.4.

There may be some higher versions of these packages that also work. I simply went back to the last known good versions that I had previously set up on another computer.
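
A comparison like this between a working and a failing environment can be scripted with the standard library. The following is a rough sketch under stated assumptions: the package list and the helper names `installed_versions` and `diff_versions` are illustrative choices, not anything provided by DeepLC.

```python
from importlib import metadata

# Packages named in this thread; adjust the list as needed.
PACKAGES = ("deeplc", "pygam", "scipy", "scikit-learn")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(good, bad):
    """Return {package: (good_version, bad_version)} for entries that differ."""
    names = sorted(set(good) | set(bad))
    return {n: (good.get(n), bad.get(n)) for n in names if good.get(n) != bad.get(n)}


# Print the current environment in a pip-style format.
for name, version in installed_versions().items():
    print(f"{name}=={version}" if version else f"{name} is not installed")
```

Running this on both machines and passing the two resulting dictionaries to `diff_versions` narrows the list down to the packages whose versions changed.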

The Error

  File "D:\SpectrumMill\millpy\DeepLCwrapper.py", line 421, in fRunDeepLC
    dlc.calibrate_preds(seq_df=dfCalIn)
  File "C:\ProgramData\Miniconda3\Lib\site-packages\deeplc\deeplc.py", line 1173, in calibrate_preds
    calibrate_output = self.calibrate_preds_func_pygam(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\Miniconda3\Lib\site-packages\deeplc\deeplc.py", line 845, in calibrate_preds_func_pygam
    gam_model_cv = LinearGAM(s(0), verbose=True).fit(predicted_tr, measured_tr)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\Miniconda3\Lib\site-packages\pygam\pygam.py", line 920, in fit
    self._pirls(X, y, weights)
  File "C:\ProgramData\Miniconda3\Lib\site-packages\pygam\pygam.py", line 705, in _pirls
    E = self._cholesky(S + P, sparse=False, verbose=self.verbose)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\Miniconda3\Lib\site-packages\pygam\pygam.py", line 485, in _cholesky
    L = cholesky(A, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\Miniconda3\Lib\site-packages\pygam\utils.py", line 81, in cholesky
    A = A.A
        ^^^
AttributeError: 'csr_matrix' object has no attribute 'A'

RobbinBouwmeester commented 1 month ago

Hi Karl,

Thank you for bringing this to my attention. Last week I made a fix for someone else that should also have fixed this. I am not sure that scikit-learn is the issue here.

There seems to be an incompatibility between scipy and pygam (which is also largely unmaintained these days); any version of scipy before 1.14.0 seems to work. In the future we are going to look at alternatives to pygam, as it can cause these kinds of issues.
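
For illustration, the traceback in this issue ends at `A = A.A` in pygam's utils.py, which is consistent with the `.A` shorthand no longer being available on sparse matrices in scipy 1.14.0. The sketch below shows a dense conversion that does not rely on that attribute. It is only an illustration: the helper `to_dense` is hypothetical and is not necessarily how pygam or the DeepLC fix handles this.

```python
import numpy as np
from scipy import sparse


def to_dense(matrix):
    """Return a dense ndarray for either sparse or dense input."""
    if sparse.issparse(matrix):
        # toarray() is the explicit method; it avoids the .A attribute
        # that the AttributeError in this issue complains about.
        return matrix.toarray()
    return np.asarray(matrix)


identity = sparse.csr_matrix(np.eye(2))
print(to_dense(identity))
```

Because `toarray()` exists on sparse matrices in both the working and the failing scipy versions mentioned in this thread, code written this way does not depend on which of them is installed.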

So this should be fixed as of commit ef1c24fa887d4877eaf7efbef927092fc017986b and v2.2.37.

Thanks again! Kind regards,

Robbin