awsm-research / PyExplainer

PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)
MIT License

Unable to start PyExplainer #24

Open maxstolba opened 1 month ago

maxstolba commented 1 month ago

Hello,

I have read the paper "PyExplainer: Explaining the Predictions of Just-In-Time Defect Models" (ASE 2021) in detail and tried to reproduce the artefact it presents. However, I could not get PyExplainer to run as suggested in the paper and the repo.

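I attempted two things: running the tutorial on Binder, and installing the package locally with pip. Both fail with the errors shown below. Are these errors resolvable?

Running the tutorial on the virtual instance on Binder, I get this error after running only the imports: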

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[/tmp/ipykernel_335/2305279675.py](https://hub.ovh2.mybinder.org/tmp/ipykernel_335/2305279675.py) in <cell line: 1>()
----> 1 from pyexplainer import pyexplainer_pyexplainer
      2 from sklearn.ensemble import RandomForestClassifier

[~/pyexplainer/pyexplainer_pyexplainer.py](https://hub.ovh2.mybinder.org/user/awsm-research-pyexplainer-6cdb2pqo/lab/tree/pyexplainer/pyexplainer_pyexplainer.py) in <module>
     16 from sklearn.utils import check_random_state, all_estimators
     17 from .rulefit import RuleFit
---> 18 from statsmodels.stats.outliers_influence import variance_inflation_factor
     19 from statsmodels.tools.tools import add_constant
     20 import pickle

[/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/stats/outliers_influence.py](https://hub.ovh2.mybinder.org/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/stats/outliers_influence.py) in <module>
     14 from statsmodels.compat.pandas import Appender
     15 from statsmodels.graphics._regressionplots_doc import _plot_influence_doc
---> 16 from statsmodels.regression.linear_model import OLS
     17 from statsmodels.stats.multitest import multipletests
     18 from statsmodels.tools.decorators import cache_readonly

[/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/regression/__init__.py](https://hub.ovh2.mybinder.org/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/regression/__init__.py) in <module>
----> 1 from .linear_model import yule_walker
      2 
      3 from statsmodels.tools._testing import PytestTester
      4 
      5 __all__ = ['yule_walker', 'test']

[/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/regression/linear_model.py](https://hub.ovh2.mybinder.org/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/regression/linear_model.py) in <module>
     44 from statsmodels.tools.decorators import (cache_readonly,
     45                                           cache_writable)
---> 46 import statsmodels.base.model as base
     47 import statsmodels.base.wrapper as wrap
     48 from statsmodels.emplike.elregress import _ELRegOpts

[/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/base/model.py](https://hub.ovh2.mybinder.org/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/base/model.py) in <module>
     14                                           cached_value, cached_data)
     15 import statsmodels.base.wrapper as wrap
---> 16 from statsmodels.tools.numdiff import approx_fprime
     17 from statsmodels.tools.sm_exceptions import ValueWarning, \
     18     HessianInversionWarning

[/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/tools/numdiff.py](https://hub.ovh2.mybinder.org/srv/conda/envs/notebook/lib/python3.8/site-packages/statsmodels/tools/numdiff.py) in <module>
     49 
     50 # NOTE: we only do double precision internally so far
---> 51 EPS = np.MachAr().eps
     52 
     53 _hessian_docs = """

[/srv/conda/envs/notebook/lib/python3.8/site-packages/numpy/__init__.py](https://hub.ovh2.mybinder.org/srv/conda/envs/notebook/lib/python3.8/site-packages/numpy/__init__.py) in __getattr__(attr)
    318             return Tester
    319 
--> 320         raise AttributeError("module {!r} has no attribute "
    321                              "{!r}".format(__name__, attr))
    322 

AttributeError: module 'numpy' has no attribute 'MachAr'

For the local install, I created a new venv in a new folder on the desktop and entered the following commands in my terminal:

python3 -m venv venv
source venv/bin/activate
pip install pyexplainer

This produces the following error:

$> pip3 install pyexplainer

Collecting pyexplainer
  Using cached pyexplainer-1.2.0-py3-none-any.whl.metadata (883 bytes)
Collecting ipython<8.0.0,>=7.16.0 (from pyexplainer)
  Using cached ipython-7.34.0-py3-none-any.whl.metadata (4.3 kB)
Collecting ipywidgets<8.0.0,>=7.6.3 (from pyexplainer)
  Using cached ipywidgets-7.8.1-py2.py3-none-any.whl.metadata (1.9 kB)
Collecting numpy<2.0.0,>=1.19.0 (from pyexplainer)
  Using cached numpy-1.26.4-cp312-cp312-macosx_11_0_arm64.whl.metadata (61 kB)
Collecting pandas<2.0.0,>=1.1.0 (from pyexplainer)
  Using cached pandas-1.5.3.tar.gz (5.2 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting scikit-learn<0.25.0,>=0.24.2 (from pyexplainer)
  Using cached scikit-learn-0.24.2.tar.gz (7.5 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]
      <string>:17: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
      Partial import of sklearn during the build process.
      Traceback (most recent call last):
        File "/Users/maximilianstolba/testingPyExplainer/venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/Users/maximilianstolba/testingPyExplainer/venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/maximilianstolba/testingPyExplainer/venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 149, in prepare_metadata_for_build_wheel
          return hook(metadata_directory, config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/24/z6nylbnn5b36bfq3_tvw77c80000gn/T/pip-build-env-eg7ox2f9/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 368, in prepare_metadata_for_build_wheel
          self.run_setup()
        File "/private/var/folders/24/z6nylbnn5b36bfq3_tvw77c80000gn/T/pip-build-env-eg7ox2f9/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 497, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/private/var/folders/24/z6nylbnn5b36bfq3_tvw77c80000gn/T/pip-build-env-eg7ox2f9/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 313, in run_setup
          exec(code, locals())
        File "<string>", line 301, in <module>
        File "<string>", line 293, in setup_package
      ModuleNotFoundError: No module named 'numpy.distutils'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Is there an easy fix? Am I missing an important step?

My setup: operating system macOS 14.5. (I tried the same steps on another machine and got the same errors.)
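The pip log above shows scikit-learn 0.24.2 being built from source under Python 3.12 and failing on numpy.distutils, which NumPy does not provide for Python 3.12 and newer. As a rough sketch of a pre-install check (the function name is made up for illustration and is not part of PyExplainer), something like this could flag the problem before pip is run:

```python
import sys


def too_new_for_numpy_distutils(version_info=sys.version_info):
    """Return True for Python 3.12 or newer.

    Hypothetical helper: numpy.distutils is not available on these
    interpreters, so source builds that rely on it (such as the
    scikit-learn 0.24.2 build in the log above) are expected to fail.
    """
    return tuple(version_info[:2]) >= (3, 12)


if too_new_for_numpy_distutils():
    print("This interpreter is Python 3.12+; a source build of "
          "scikit-learn 0.24.2 will likely fail on numpy.distutils.")
else:
    print("This interpreter is older than Python 3.12.")
```

If the check reports 3.12 or newer, creating the venv with an older interpreter may avoid the failure. The Binder traceback ran on Python 3.8, but I have not verified which versions install cleanly.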

MichaelFu1998-create commented 4 weeks ago

Hello @maxstolba, the issue is caused by NumPy 1.24, which removed np.MachAr. A quick workaround is to run !pip install numpy==1.23.5 in your notebook and then restart the notebook. After that you should be good to go.
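To confirm whether a given notebook is affected, a small check along these lines may help. It is only a sketch: machar_removed is a hypothetical helper, and the 1.24 cut-off is taken from the workaround above. np.finfo(float).eps is NumPy's supported way to read the value that the failing statsmodels line obtained through np.MachAr().eps.

```python
def machar_removed(numpy_version: str) -> bool:
    """Return True if this NumPy version is 1.24 or newer.

    Hypothetical helper: np.MachAr is gone from 1.24 onwards, so the
    statsmodels import in the traceback fails on those versions.
    """
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) >= (1, 24)


try:
    import numpy as np
except ImportError:  # NumPy is not installed in this environment
    np = None

if np is not None:
    print("NumPy version:", np.__version__)
    print("np.MachAr removed:", machar_removed(np.__version__))
    # Supported replacement for np.MachAr().eps
    print("Machine epsilon:", np.finfo(float).eps)
```

After pinning numpy==1.23.5 and restarting the kernel, the second line should report False.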