interpretml / interpret

Fit interpretable models. Explain blackbox machine learning.
https://interpret.ml/docs
MIT License
6.28k stars · 729 forks

Added APLR to the benchmark file #567

Closed mathias-von-ottenbreit closed 2 months ago

mathias-von-ottenbreit commented 2 months ago

Added APLR to the benchmark file. If you run this, I would appreciate receiving a copy of the results. On my end, I am only able to download the PMLB datasets using Powerlift (otherwise I get error messages).

Also made a small update to the APLR documentation and bumped the APLR version requirement from 10.5.1 to 10.6.1.

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 76.55%. Comparing base (c637cf4) to head (39aaf93).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           develop     #567   +/-   ##
========================================
  Coverage    76.55%   76.55%
========================================
  Files           72       72
  Lines         8988     8988
========================================
  Hits          6881     6881
  Misses        2107     2107
```

| Flag | Coverage Δ | |
|---|---|---|
| bdist_linux_310_python | `76.22% <ø> (+0.02%)` | :arrow_up: |
| bdist_linux_311_python | `76.01% <ø> (-0.17%)` | :arrow_down: |
| bdist_linux_312_python | `76.14% <ø> (-0.12%)` | :arrow_down: |
| bdist_linux_39_python | `76.20% <ø> (ø)` | |
| bdist_mac_310_python | `76.31% <ø> (-0.11%)` | :arrow_down: |
| bdist_mac_311_python | `76.31% <ø> (-0.08%)` | :arrow_down: |
| bdist_mac_312_python | `?` | |
| bdist_mac_39_python | `76.36% <ø> (-0.04%)` | :arrow_down: |
| bdist_win_310_python | `76.34% <ø> (-0.11%)` | :arrow_down: |
| bdist_win_311_python | `76.45% <ø> (+0.11%)` | :arrow_up: |
| bdist_win_312_python | `76.34% <ø> (-0.11%)` | :arrow_down: |
| bdist_win_39_python | `76.33% <ø> (-0.08%)` | :arrow_down: |
| sdist_linux_310_python | `76.05% <ø> (-0.08%)` | :arrow_down: |
| sdist_linux_311_python | `75.94% <ø> (-0.19%)` | :arrow_down: |
| sdist_linux_312_python | `76.05% <ø> (-0.03%)` | :arrow_down: |
| sdist_linux_39_python | `75.94% <ø> (-0.19%)` | :arrow_down: |
| sdist_mac_310_python | `76.34% <ø> (ø)` | |
| sdist_mac_311_python | `76.12% <ø> (-0.19%)` | :arrow_down: |
| sdist_mac_312_python | `76.20% <ø> (-0.15%)` | :arrow_down: |
| sdist_mac_39_python | `76.18% <ø> (-0.14%)` | :arrow_down: |
| sdist_win_310_python | `76.31% <ø> (-0.06%)` | :arrow_down: |
| sdist_win_311_python | `76.42% <ø> (-0.04%)` | :arrow_down: |
| sdist_win_312_python | `76.45% <ø> (+0.08%)` | :arrow_up: |
| sdist_win_39_python | `76.38% <ø> (-0.06%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

paulbkoch commented 2 months ago

Looks really good @mathias-von-ottenbreit. I'm really enjoying how clean your PRs are.

I'll let you know next time I run it. The notebook was taking 12-15 hours to run before APLR, so I only run it occasionally overnight.

What was the error message you were getting?

mathias-von-ottenbreit commented 2 months ago

Thanks @paulbkoch.

Regarding the error messages: I looked at this again, and it seems to work with the exact dependency versions listed in the benchmark notebook.

However, `pip install powerlift[datasets,postgres]` gives the error below on my side, and I had to install it without the postgres extra.

```
(test2) (base) mathiaso@mathias-computer:~/Documents/mathias/git_projects/interpret_fork$ pip install -U --quiet numpy==1.26.4 pandas==2.2.2 scikit-learn==1.5.1 xgboost==2.1.0 lightgbm==4.5.0 catboost==1.2.5 aplr==10.6.1 interpret-core powerlift[datasets,postgres]
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [23 lines of output]
    running egg_info
    creating /tmp/pip-pip-egg-info-h6m2k6y5/psycopg2.egg-info
    writing /tmp/pip-pip-egg-info-h6m2k6y5/psycopg2.egg-info/PKG-INFO
    writing dependency_links to /tmp/pip-pip-egg-info-h6m2k6y5/psycopg2.egg-info/dependency_links.txt
    writing top-level names to /tmp/pip-pip-egg-info-h6m2k6y5/psycopg2.egg-info/top_level.txt
    writing manifest file '/tmp/pip-pip-egg-info-h6m2k6y5/psycopg2.egg-info/SOURCES.txt'

    Error: pg_config executable not found.

    pg_config is required to build psycopg2 from source.  Please add the directory
    containing pg_config to the $PATH or specify the full executable path with the
    option:

        python setup.py build_ext --pg-config /path/to/pg_config build ...

    or with the pg_config option in 'setup.cfg'.

    If you prefer to avoid building psycopg2 from source, please install the PyPI
    'psycopg2-binary' package instead.

    For further information please check the 'doc/src/install.rst' file (also at
    <https://www.psycopg.org/docs/install.html>).

    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
```

paulbkoch commented 2 months ago

psycopg2 is picky and requires additional C compiler dependencies. Try pip installing `psycopg2-binary` instead, and drop the postgres flag: `pip install powerlift[datasets]`.
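A minimal workaround sketch, assuming an environment without the PostgreSQL development headers: install the prebuilt wheel first, then powerlift without the postgres extra.

```shell
# Workaround sketch: psycopg2 needs pg_config (from the PostgreSQL dev
# headers) to build from source; the prebuilt wheel skips that build.
pip install psycopg2-binary
pip install "powerlift[datasets]"   # postgres extra dropped
```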

paulbkoch commented 2 months ago

I ran the notebook overnight. Four of the datasets timed out in APLR after 6 hours. One dataset failed in what looks like scikit-learn one-hot encoding, since it would have required 40 GiB of memory.

The table of results is below. APLR seems to perform really well! Note, though, that the multiclass and regression results would change if the remaining 5 datasets were included. This notebook doesn't filter results to globally completed datasets, so the averages below are computed across different sets of datasets.

```
      method     RANK      auc  multi_auc    nrmse  log_loss_RANK  cross_entropy_RANK  nrmse_RANK   fit_time  predict_time
    ebm-base 1.780952 0.890061   0.947850 1.171035       1.600000            1.837838    1.909091 629.838994      0.151726
   aplr-base 2.050000 0.880044   0.947816 1.117040       1.971429            2.181818    2.000000 702.055842      0.980468
xgboost-base 2.123810 0.881180   0.955046 1.183892       2.428571            1.891892    2.060606  41.168387      0.242283
```

```
      method  RANK  auc  multi_auc  nrmse  log_loss  cross_entropy  fit_time  predict_time
    ebm-base   105   35         37     33        35             37       105           105
   aplr-base   100   35         33     32        35             33       100           100
xgboost-base   105   35         37     33        35             37       105           105
```
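The caveat about averaging over different dataset subsets can be illustrated with a small pandas sketch (hypothetical data, not the actual benchmark results): filtering to datasets every method completed makes the per-method averages directly comparable.

```python
import pandas as pd

# Hypothetical results: one row per (method, dataset) that completed.
# Note "aplr" is missing dataset "c", mimicking a timeout.
df = pd.DataFrame({
    "method":  ["ebm", "ebm", "ebm", "aplr", "aplr", "xgb", "xgb", "xgb"],
    "dataset": ["a",   "b",   "c",   "a",    "b",    "a",   "b",   "c"],
    "nrmse":   [0.9,   1.1,   2.0,   0.8,    1.0,    1.0,   1.2,   1.9],
})

# Keep only datasets that every method finished, so each method's
# average is taken over the same set of datasets.
n_methods = df["method"].nunique()
complete = df.groupby("dataset")["method"].transform("nunique") == n_methods
print(df[complete].groupby("method")["nrmse"].mean())
```

Without the filter, "ebm" and "xgb" would be penalized for completing the hard dataset "c" that "aplr" skipped.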

The exact error on the dataset that failed due to memory was:

```
ERROR: Traceback (most recent call last):
  File "//startup.py", line 54, in run_trials
    _, duration, timed_out = timed_run(
                             ^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/powerlift/executors/base.py", line 62, in timed_run
    res = f()
          ^^^
  File "//startup.py", line 55, in <lambda>
    lambda: trial_run_fn(trial), timeout_seconds=timeout
            ^^^^^^^^^^^^^^^^^^^
  File "<string>", line 145, in wired_function
  File "/usr/local/lib/python3.12/site-packages/sklearn/base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/pipeline.py", line 469, in fit
    Xt = self._fit(X, y, routed_params)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/pipeline.py", line 406, in _fit
    X, fitted_transformer = fit_transform_one_cached(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/joblib/memory.py", line 312, in __call__
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/pipeline.py", line 1310, in _fit_transform_one
    res = transformer.fit_transform(X, y, **params.get("fit_transform", {}))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/pipeline.py", line 533, in fit_transform
    Xt = self._fit(X, y, routed_params)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/pipeline.py", line 406, in _fit
    X, fitted_transformer = fit_transform_one_cached(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/joblib/memory.py", line 312, in __call__
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/pipeline.py", line 1310, in _fit_transform_one
    res = transformer.fit_transform(X, y, **params.get("fit_transform", {}))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/utils/_set_output.py", line 313, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/compose/_column_transformer.py", line 1006, in fit_transform
    return self._hstack(list(Xs), n_samples=n_samples)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/sklearn/compose/_column_transformer.py", line 1122, in _hstack
    Xs = [f.toarray() if sparse.issparse(f) else f for f in Xs]
          ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/scipy/sparse/_compressed.py", line 1181, in toarray
    out = self._process_toarray_args(order, out)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/scipy/sparse/_base.py", line 1301, in _process_toarray_args
    return np.zeros(self.shape, dtype=self.dtype, order=order)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 40.0 GiB for an array with shape (7000000, 767) and data type float64
```
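As a sanity check on the allocation in that error, the array shape and dtype from the traceback account for the 40 GiB on their own (assuming the float64 dtype shown there):

```python
# The failed allocation: a dense (7000000, 767) float64 array.
n_rows, n_cols, bytes_per_float64 = 7_000_000, 767, 8
gib = n_rows * n_cols * bytes_per_float64 / 2**30
print(f"{gib:.1f} GiB")  # prints "40.0 GiB", matching the error message
```

Densifying a sparse one-hot matrix of that size is what triggers the `toarray()` call at the bottom of the traceback.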

mathias-von-ottenbreit commented 2 months ago

Thanks for running the benchmarks and sharing the results @paulbkoch. I agree that the error message looks like it was related to scikit-learn one-hot encoding. In multiclass classification tasks, APLR can take some time to fit because it fits one logit model for each class (whereas a two-class problem requires only a single logit model).
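The one-model-per-class cost is the same scaling as in a generic one-vs-rest setup; a small scikit-learn sketch (illustrative only, not APLR's internals) shows fit work growing with the number of classes:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Toy multiclass data: 200 samples, 5 features, 4 classes.
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = rng.randint(0, 4, size=200)

# One-vs-rest fits one binary model per class, so fit time grows
# roughly linearly in the number of classes.
clf = OneVsRestClassifier(LogisticRegression()).fit(X, y)
print(len(clf.estimators_))  # one fitted binary model per class
```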