neurodata / treeple

Scikit-learn compatible decision trees beyond those offered in scikit-learn
https://treeple.ai

Getting `TypeError check_array() got an unexpected keyword argument 'ensure_all_finite'` causing many tests to fail #314

Closed ryanhausen closed 1 month ago

ryanhausen commented 2 months ago

Description

It looks like something has changed in the check_array() API in the underlying sklearn fork, and now the majority of tests fail for me. I might be installing it incorrectly, but I am not sure where the mistake is.

Python traceback:

```
treeple/tree/_classes.py:231: in fit
    sim_mat = self.compute_similarity_matrix(X)
treeple/tree/_neighbors.py:66: in compute_similarity_matrix
    return compute_forest_similarity_matrix(self, X)
treeple/tree/_neighbors.py:30: in compute_forest_similarity_matrix
    X_leaves = forest.apply(X)[:, np.newaxis]
treeple/_lib/sklearn/tree/_classes.py:868: in apply
    X = self._validate_X_predict(X, check_input)
treeple/_lib/sklearn/tree/_classes.py:645: in _validate_X_predict
    X = self._validate_data(
TypeError: check_array() got an unexpected keyword argument 'ensure_all_finite'
```

Environment

OS: Linux

Python version: 3.9.19

Output of pip freeze:

```
Bottleneck==1.4.0
build==1.2.1
click==8.1.7
cloudpickle==3.0.0
colorama @ file:///home/conda/feedstock_root/build_artifacts/colorama_1666700638685/work
coverage==7.6.1
Cython==3.0.11
doit==0.36.0
exceptiongroup @ file:///home/conda/feedstock_root/build_artifacts/exceptiongroup_1720869315914/work
flaky==3.8.1
importlib_metadata==8.2.0
iniconfig @ file:///home/conda/feedstock_root/build_artifacts/iniconfig_1673103042956/work
joblib @ file:///home/conda/feedstock_root/build_artifacts/joblib_1714665484399/work
markdown-it-py==3.0.0
mdurl==0.1.2
memory-profiler==0.61.0
meson==1.5.1
meson-python==0.16.0
ninja==1.11.1.1
numpy==2.0.1
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1718189413536/work
pandas==2.2.2
pluggy @ file:///home/conda/feedstock_root/build_artifacts/pluggy_1713667077545/work
psutil==6.0.0
pydevtool==0.3.0
Pygments==2.18.0
pyproject-metadata==0.8.0
pyproject_hooks==1.1.0
pytest @ file:///home/conda/feedstock_root/build_artifacts/pytest_1721923606331/work
pytest-cov==5.0.0
python-dateutil==2.9.0.post0
pytz==2024.1
rich==13.7.1
rich-click==1.8.3
scikit-learn==1.5.1
scipy==1.13.1
six==1.16.0
spin==0.11
threadpoolctl @ file:///home/conda/feedstock_root/build_artifacts/threadpoolctl_1714400101435/work
tomli @ file:///home/conda/feedstock_root/build_artifacts/tomli_1644342247877/work
tqdm==4.66.5
typing_extensions==4.12.2
tzdata==2024.1
zipp==3.20.0
```

Steps to reproduce

Example source:

```
git clone git@github.com:neurodata/treeple.git
cd treeple
conda create -n treeple python=3.9
conda activate treeple
python -m pip install -r build_requirements.txt
conda install -c conda-forge joblib threadpoolctl pytest compilers llvm-openmp
./spin build
python -m pip install -r test_requirements.txt
./spin test
```

adam2392 commented 2 months ago

This is because treeple's main branch is not compatible with the pip-released version of scikit-learn. Once scikit-learn releases another version, we can do a release as well.
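For anyone hitting the same traceback, a quick way to confirm the mismatch is to check whether the installed `check_array` accepts the `ensure_all_finite` keyword at all. The snippet below is only a sketch: `accepts_keyword` is a hypothetical helper (not part of treeple or scikit-learn), and it assumes `check_array` can be imported from `sklearn.utils` in your environment.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`.

    Hypothetical helper for illustration; not part of treeple or scikit-learn.
    """
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    # The name is declared explicitly and may be passed by keyword.
    if name in params and params[name].kind in keyword_kinds:
        return True
    # Otherwise the call only works if the function swallows **kwargs.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


if __name__ == "__main__":
    try:
        from sklearn.utils import check_array
    except ImportError:
        print("scikit-learn is not installed in this environment")
    else:
        supported = accepts_keyword(check_array, "ensure_all_finite")
        print(f"check_array accepts ensure_all_finite: {supported}")
```

If this reports `False`, the installed scikit-learn predates the keyword that treeple's main branch passes, which matches the `TypeError` in the traceback.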

ryanhausen commented 2 months ago

That makes sense. Should the developer/contributor docs be changed to install scikit-learn from main, then?

I am happy to submit a PR if you're interested. I think, at a high level, we would drop scikit-learn from build_requirements.txt and either add another requirements file or document a manual pip install like the following:

pip install --pre --extra-index https://pypi.anaconda.org/scientific-python-nightly-wheels/simple scikit-learn

Does that seem right?
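After installing the nightly wheel, it may be worth confirming that the environment actually picked it up rather than the 1.5.1 release listed in the pip freeze output. Here is a rough sketch of such a check. The helper names `release_tuple` and `is_at_least` are made up for illustration, and the `(1, 6)` threshold is an assumption based on my understanding that `ensure_all_finite` first appears in the scikit-learn 1.6 series.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Return the leading numeric components of a version string as ints.

    For example, "1.6.dev0" gives (1, 6). Hypothetical helper, not a full
    PEP 440 parser: pre-release and dev suffixes are simply ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_at_least(version, minimum):
    """Return True if the numeric part of `version` is >= the `minimum` tuple."""
    return release_tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        print("scikit-learn is not installed in this environment")
    else:
        # (1, 6) is an assumed threshold; adjust it if the docs say otherwise.
        print(installed, "new enough:", is_at_least(installed, (1, 6)))
```

Dev builds are deliberately treated as belonging to their target series, since a nightly of a given series is expected to already contain the API change.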

adam2392 commented 2 months ago

I would only change the developer documentation.

A PR would be great.