JuliaAI / CatBoost.jl

Julia wrapper of the python library CatBoost for boosted decision trees
MIT License

Restrict `numpy<2` and bump actions #38

Closed. tylerjthomas9 closed this 3 months ago

tylerjthomas9 commented 3 months ago

Ref https://github.com/JuliaAI/CatBoost.jl/issues/37

I also went ahead and bumped the actions to prevent issues down the road (since this library is less actively developed).

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 75.55%. Comparing base (28930d4) to head (64c95d7). Report is 1 commit behind head on main.

:exclamation: Current head 64c95d7 differs from pull request most recent head b6458f9

Please upload reports for the commit b6458f9 to get more accurate results.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main      #38   +/-   ##
=======================================
  Coverage   75.55%   75.55%
=======================================
  Files           6        6
  Lines         180      180
=======================================
  Hits          136      136
  Misses         44       44
```


tylerjthomas9 commented 3 months ago

@ablaom Restricting numpy seemed to fix the issue.

ablaom commented 3 months ago

Great work and quick diagnosis! Many thanks.

Is this an issue that should be fixed in PythonCall.jl (it is already tagged here), or is this "expected behaviour", in which case we just merge your fix (here and in all the other affected packages)?

tylerjthomas9 commented 3 months ago

I did not realize that it broke PythonCall.jl. Ideally, it would be fixed there, and we could pin a minimum PythonCall.jl version.

tylerjthomas9 commented 3 months ago

I am struggling to reproduce the numpy issues with just PythonCall.jl. However, I can't import catboost from the created Python environment. Our error is happening when importing catboost in Python. I have tried pip and conda installing catboost, and neither one seems to work.

Python 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import catboost
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/jl_mlbQrt/.CondaPkg/env/lib/python3.12/site-packages/catboost/__init__.py", line 1, in <module>
    from .core import (
  File "/tmp/jl_mlbQrt/.CondaPkg/env/lib/python3.12/site-packages/catboost/core.py", line 45, in <module>
    from .plot_helpers import save_plot_file, try_plot_offline, OfflineMetricVisualizer
  File "/tmp/jl_mlbQrt/.CondaPkg/env/lib/python3.12/site-packages/catboost/plot_helpers.py", line 5, in <module>
    from . import _catboost
  File "_catboost.pyx", line 1, in init _catboost
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
>>>

tylerjthomas9 commented 3 months ago

https://github.com/catboost/catboost/issues/2671

It looks like catboost might restrict numpy to 1.x until the packages are rebuilt for numpy 2.0. I feel like we should just restrict numpy until it is resolved.
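For anyone who wants to check an existing environment before importing catboost, here is a minimal sketch of a version guard. It assumes, based on the traceback above and the linked catboost issue, that the catboost binaries available at the time of this thread were built against numpy 1.x. The helper names (`numpy_major`, `is_numpy_1x`, `check_installed_numpy`) are hypothetical and are not part of catboost, numpy, or CatBoost.jl.

```python
from importlib import metadata


def numpy_major(version: str) -> int:
    """Return the major component of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def is_numpy_1x(version: str) -> bool:
    """True if the version string belongs to the numpy 1.x series or older."""
    return numpy_major(version) < 2


def check_installed_numpy() -> str:
    """Describe whether the installed numpy satisfies the `numpy<2` restriction.

    Reads package metadata only, so numpy itself is never imported.
    """
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return "numpy is not installed in this environment"
    if is_numpy_1x(installed):
        return f"numpy {installed} satisfies numpy<2"
    # Assumption: a catboost build compiled against numpy 1.x may fail to
    # import here with the 'numpy.dtype size changed' ValueError shown above.
    return f"numpy {installed} does not satisfy numpy<2"


if __name__ == "__main__":
    print(check_installed_numpy())
```

Running this inside the environment created by CondaPkg should report whether the restriction took effect. It only inspects the installed version; it does not confirm that a particular catboost build is binary compatible.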