elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

Use compatible versions of numpy and shap #539

Closed davidkyle closed 1 year ago

davidkyle commented 1 year ago

Tests are failing with the following error:

E           AttributeError: module 'numpy' has no attribute 'bool'.
E           `np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'bool_'?

The error comes from the Shap explainer code, which still uses the deprecated np.bool alias:

.nox/test-3-10-pandas_version-1-5-0/lib/python3.10/site-packages/shap/explainers/_tree.py:384: in shap_values
    X, y, X_missing, flat_output, tree_limit, check_additivity = self._validate_inputs(X, y,
.nox/test-3-10-pandas_version-1-5-0/lib/python3.10/site-packages/shap/explainers/_tree.py:250: in _validate_inputs
    X_missing = np.isnan(X, dtype=np.bool)

NumPy 1.24 removed the np.bool alias, so it is incompatible with Shap 0.41. There is a PR to fix the problem in Shap, but it has not been merged yet.

https://github.com/slundberg/shap/pull/1890
https://github.com/slundberg/shap/pull/1890/files#diff-6be3a33007583066a03d3a9faaed8b7a60713a03145f924fe7eea64bacaa22af
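For reference, the failing call from the traceback can be made to work on NumPy 1.24 by swapping the removed alias for the builtin bool, as the error message suggests. The snippet below is only a minimal sketch of that kind of change on a made-up array; it is not the actual contents of the Shap PR.

```python
import numpy as np

# Made-up input with one missing value, standing in for the X passed to shap_values.
X = np.array([[1.0, np.nan], [3.0, 4.0]])

# The original line was: X_missing = np.isnan(X, dtype=np.bool)
# np.bool no longer exists in NumPy 1.24; the builtin bool is the drop-in replacement.
X_missing = np.isnan(X, dtype=bool)
```

The resulting mask has the NumPy boolean dtype either way, so the behaviour should be unchanged on older NumPy versions.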

Pandas

Pandas 2.0 was released in April 2023, so it seems prudent to restrict the Pandas version to <2 to avoid any incompatibilities.

NumPy is probably being installed as a dependency of Pandas, so a different fix may be required.
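One way to make the intended constraints explicit, wherever NumPy ends up being pulled in from, is to check versions against the upper bounds discussed here. The following is a rough sketch only: the UPPER_BOUNDS table and the parse_release and is_allowed helpers are hypothetical names for illustration, not part of eland, and the bounds are the ones suggested in this issue rather than eland's actual pins.

```python
import re

# Hypothetical exclusive upper bounds taken from this issue (numpy<1.24, pandas<2).
UPPER_BOUNDS = {"numpy": (1, 24), "pandas": (2, 0)}


def parse_release(version: str) -> tuple:
    """Return the leading numeric part of a version string, padded to three components."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad so that "2" and "2.0" compare the same as "2.0.0".
    return parts + (0,) * (3 - len(parts))


def is_allowed(package: str, version: str) -> bool:
    """True if the version is strictly below the upper bound for the package."""
    return parse_release(version) < UPPER_BOUNDS[package]
```

For example, is_allowed("numpy", "1.23.5") is true while is_allowed("numpy", "1.24.0") and is_allowed("pandas", "2.0.0") are false. Pre-release suffixes are ignored, so a version such as 2.0.0rc1 is conservatively treated as 2.0.0.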