tschuelia / PyPythia

Lightweight python library for predicting the difficulty of alignments in phylogenetics
GNU General Public License v3.0

Incompatibility of the lightgbm pickled object #12

Closed Gullumluvl closed 1 year ago

Gullumluvl commented 1 year ago

Hi,

I tried different fresh installs of PyPythia with Python 3.8, 3.9 and 3.10. On all of them I get the following error when attempting to evaluate an MSA:

[00:00:00] Starting prediction.
[00:00:00] Loading predictor /home/gull/.local/lib/python3.10/site-packages/pypythia/predictors/latest.pckl
[00:00:00] Checking MSA
[00:00:00] Starting to compute MSA features for MSA /home/gull/postdoc/data/05_cleaned-al/manual/PF00102_n1.einsi.fa
[00:00:00] Number of threads not specified, using RAxML-NG autoconfig.
[00:00:00] Retrieving num_patterns, percentage_gaps, percentage_invariant
[00:00:00] Retrieving num_taxa, num_sites
[00:00:00] Inferring 100 parsimony trees
[00:00:00] Computing the RF-Distance for the parsimony trees
[00:00:00] Predicting the difficulty
Traceback (most recent call last):
  File "/home/gull/.local/lib/python3.10/site-packages/pypythia/predictor.py", line 102, in predict
    prediction = self.predictor.predict(df, num_threads=1)
  File "/home/gull/.local/lib/python3.10/site-packages/lightgbm/sklearn.py", line 899, in predict
    predict_params = self._process_params(stage="predict")
  File "/home/gull/.local/lib/python3.10/site-packages/lightgbm/sklearn.py", line 674, in _process_params
    if self._n_classes > 2:
TypeError: '>' not supported between instances of 'NoneType' and 'int'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/gull/.local/bin/pythia", line 8, in <module>
    sys.exit(main())
  File "/home/gull/.local/lib/python3.10/site-packages/pypythia/prediction.py", line 297, in main
    difficulty = predictor.predict(msa_features)
  File "/home/gull/.local/lib/python3.10/site-packages/pypythia/predictor.py", line 108, in predict
    raise PyPythiaException(
pypythia.custom_errors.PyPythiaException: An error occurred predicting the difficulty for the provided set of MSA features.

It seems to me that the pickled lightgbm object was created with an older version that initialized _n_classes to None instead of -1 (as in the code I have seen online).

So I made it run by setting self.predictor._n_classes = -1 just after unpickling, and this works, but I have no idea if this should be the value to use. In this case PyPythia gives 0.16 on the example alignment.
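A minimal sketch of that workaround, for illustration only. The helper name `patch_n_classes` is hypothetical and not part of PyPythia; `_n_classes` is a private lightgbm attribute taken from the traceback above, and -1 is the unverified value described in this comment, so the patched predictor's output should be compared against a known-good setup.

```python
import pickle


def patch_n_classes(predictor, default=-1):
    """Replace a missing or None `_n_classes` with `default`.

    ASSUMPTION: -1 is the value newer lightgbm versions expect here.
    That is the guess described in this comment, not a confirmed fix.
    """
    if getattr(predictor, "_n_classes", None) is None:
        predictor._n_classes = default
    return predictor


def load_patched_predictor(path):
    """Unpickle a predictor file and apply the patch right after loading."""
    with open(path, "rb") as handle:
        predictor = pickle.load(handle)
    return patch_n_classes(predictor)
```

Calling `load_patched_predictor` on the `latest.pckl` path from the log above would mirror the manual change; an attribute that is already set is left untouched.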

I also tried using a lightgbm install without openmp (pip install lightgbm --config-settings=cmake.define.USE_OPENMP=OFF), same error.

tschuelia commented 1 year ago

Hi, thanks for raising this issue. I suspect that this is due to a new major release of LightGBM and me forgetting to pin the major version in the requirements :-)

Could you try downgrading LightGBM to a version below 4.0.0 (but at least 3.3) and run your MSA again? If you installed it with pip, this command should do it: pip install "lightgbm>=3.3,<4.0". If you installed it with conda, this command should work: conda install "lightgbm>=3.3,<4.0"
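To check whether an environment already falls inside that range before running a prediction, something like the following sketch could be used. The function names are hypothetical and not part of PyPythia, and the version parsing is deliberately simple: it only looks at the leading major and minor numbers.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def is_supported_lightgbm(version_string):
    """Return True if the version satisfies >=3.3,<4.0 (major.minor only)."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Cannot parse version: {version_string!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return (3, 3) <= major_minor < (4, 0)


def check_installed_lightgbm():
    """Return True/False for the installed lightgbm, or None if it is absent."""
    try:
        installed = version("lightgbm")
    except PackageNotFoundError:
        return None
    return is_supported_lightgbm(installed)
```

With lightgbm 4.0.0 installed, `check_installed_lightgbm()` returns False, which signals that the downgrade command above is needed.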

I will update the requirements asap

Gullumluvl commented 1 year ago

Right, I forgot to provide my lightgbm version... 4.0.0 indeed.

The downgrade works, and it gives 0.16 as above for the example alignment. Thanks for the fix!

An additional note on v4.0.0: it shows these warnings:

[LightGBM] [Warning] lambda_l2 is set=1.593402265932751e-08, reg_lambda=0.0 will be ignored. Current value: lambda_l2=1.593402265932751e-08
[LightGBM] [Warning] lambda_l1 is set=0.0002391390668080122, reg_alpha=0.0 will be ignored. Current value: lambda_l1=0.0002391390668080122
[LightGBM] [Warning] bagging_fraction is set=0.9660277131424135, subsample=1.0 will be ignored. Current value: bagging_fraction=0.9660277131424135
[LightGBM] [Warning] bagging_freq is set=7, subsample_freq=0 will be ignored. Current value: bagging_freq=7

tschuelia commented 1 year ago

Perfect :-) I pinned the version in the new 1.1.3 release for now. I will see if I can support LightGBM 4.0.0 in the future and figure out what causes the warnings...