dsgibbons / shap

A game theoretic approach to explain the output of any machine learning model.
https://shap-community.readthedocs.io/en/latest/
MIT License

Improve test suite execution speed #42

Closed connortann closed 1 year ago

connortann commented 1 year ago

I think there are a few areas of the GitHub test suite that we could address to improve execution speed. Currently the unit tests take almost 20 minutes to run on CI. Reducing that would shorten the time it takes to validate PRs and make us more effective as reviewers.

TODO

Slowest tests

[Updated] Here is the current set of slowest tests:

============================= slowest 20 durations =============================
55.60s call     tests/explainers/test_partition.py::test_translation
48.50s call     tests/explainers/test_partition.py::test_translation_auto
48.44s call     tests/explainers/test_partition.py::test_translation_algorithm_arg
47.71s call     tests/explainers/test_partition.py::test_serialization
46.61s call     tests/explainers/test_partition.py::test_serialization_custom_model_save
43.77s call     tests/explainers/test_partition.py::test_serialization_no_model_or_masker
40.41s call     tests/explainers/test_gradient.py::test_pytorch_mnist_cnn
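
A report like this is what pytest prints when run with `--durations=20`. To see which test modules dominate, the lines can be grouped by file. The following is a minimal sketch that parses the report text shown above; the helper name `total_by_file` is hypothetical and is not part of pytest or shap.

```python
import re
from collections import defaultdict
from typing import Dict

# Durations copied from the report above.
REPORT = """\
55.60s call     tests/explainers/test_partition.py::test_translation
48.50s call     tests/explainers/test_partition.py::test_translation_auto
48.44s call     tests/explainers/test_partition.py::test_translation_algorithm_arg
47.71s call     tests/explainers/test_partition.py::test_serialization
46.61s call     tests/explainers/test_partition.py::test_serialization_custom_model_save
43.77s call     tests/explainers/test_partition.py::test_serialization_no_model_or_masker
40.41s call     tests/explainers/test_gradient.py::test_pytorch_mnist_cnn
"""

# Matches lines such as "55.60s call     path/to/test_file.py::test_name".
LINE_RE = re.compile(r"^(\d+(?:\.\d+)?)s\s+(call|setup|teardown)\s+(\S+?)::")


def total_by_file(report: str) -> Dict[str, float]:
    """Sum the reported durations (in seconds) per test file."""
    totals = defaultdict(float)
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            seconds, _phase, path = match.groups()
            totals[path] += float(seconds)
    return dict(totals)


totals = total_by_file(REPORT)
# Print the files with the largest total duration first.
for path, seconds in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{seconds:7.2f}s  {path}")
```

In this sample, the six `test_partition.py` tests account for roughly 290 of the 331 seconds listed, so that module looks like the first place to investigate.
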
connortann commented 1 year ago

Caching dependencies: experiment notes

Comparison of various options I've tried for caching dependencies.

N.B. we can view and manage caches via the UI: https://github.com/dsgibbons/shap/actions/caches

Repository caches are limited to 10GB.

Timings

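| Env     | Baseline | 1: Cache pip | 2: Cache whole env | 3: Cache some libs |
|---------|----------|--------------|--------------------|--------------------|
| py3.7   | 4m 14s   |              | 1m 34s             | 3m 15s             |
| py3.8   | 5m 6s    |              | 1m 50s             | 3m 4s              |
| py3.9   | 4m 25s   | 4m 34s       | 2m 25s             | 2m 56s             |
| py3.10  | 4m 30s   | 4m 41s       | 1m 44s             | 2m 51s             |
| py3.11  | 4m 42s   | 5m 17s       | 2m 42s             | 2m 51s             |
| Average | 4m 35s   | 4m 50s       | 2m 3s              | 3m 35s             |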
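
To make the comparison easier to read, the averages can be expressed as a percentage change against the baseline. This is a small sketch using only the Average row from the table above; the helper `to_seconds` is a hypothetical name introduced for illustration.

```python
import re

# Average timings copied from the table above.
AVERAGES = {
    "Baseline": "4m 35s",
    "1: Cache pip": "4m 50s",
    "2: Cache whole env": "2m 3s",
    "3: Cache some libs": "3m 35s",
}


def to_seconds(text: str) -> int:
    """Convert a duration such as '4m 35s' into whole seconds."""
    match = re.fullmatch(r"\s*(\d+)m\s+(\d+)s\s*", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


baseline = to_seconds(AVERAGES["Baseline"])
changes = {}
for name, text in AVERAGES.items():
    seconds = to_seconds(text)
    # Negative values mean the option is faster than the baseline.
    changes[name] = 100 * (seconds - baseline) / baseline
    print(f"{name:<20} {seconds:>4}s  {changes[name]:+6.1f}%")
```

On these averages, caching the whole environment cuts the time by a little over half, while caching pip alone is marginally slower than the baseline.
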

Approaches

0. Baseline

Existing approach, just pip-install with no caching.

1. Enable caching in the setup-python action

This caches the downloaded wheels, but not the installed environment, as per the action docs.

2. Cache the whole python env

As per this blog

3. Cache specific libraries in site-packages

Cache only the libraries that need to be built, such as pyspark. Leave the other libs to be pip-installed as before.

To decide which packages to cache: we want to save the most time whilst keeping under ~2GB total cache size per env (the 10GB repository limit shared across the five Python versions). Some calculations from experimentation, sorted by those that save the most time for the least space:

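| Package                   | Size (MB) | Build time (s) | s / MB |
|---------------------------|-----------|----------------|--------|
| site-packages/pyspark*    | 310       | 12             | 0.039  |
| site-packages/nvidia*     | 1521      | 40             | 0.026  |
| site-packages/torch*      | 619       | 13             | 0.021  |
| site-packages/tensorflow* | 586       | 12             | 0.020  |
| site-packages/xgboost*    | 200       | 4              | 0.020  |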

So I decided to cache just the first 3 libraries. In future, if we drop support for any Python versions, we can cache more libraries.
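
The ranking can be reproduced from the measured sizes and build times. The sketch below sorts the packages by seconds saved per MB and keeps a running total of cache size; the function name `rank_by_time_saved_per_mb` is hypothetical, and the figures are copied from the table above.

```python
# (package glob, size in MB, build time in seconds), copied from the table above.
PACKAGES = [
    ("site-packages/xgboost*", 200, 4),
    ("site-packages/tensorflow*", 586, 12),
    ("site-packages/torch*", 619, 13),
    ("site-packages/nvidia*", 1521, 40),
    ("site-packages/pyspark*", 310, 12),
]


def rank_by_time_saved_per_mb(packages):
    """Sort packages so those saving the most build time per MB come first."""
    return sorted(packages, key=lambda p: p[2] / p[1], reverse=True)


rows = []
cumulative_mb = 0
for name, size_mb, build_s in rank_by_time_saved_per_mb(PACKAGES):
    # Running total of cache space used if we cache everything up to this row.
    cumulative_mb += size_mb
    rows.append((name, round(build_s / size_mb, 3), cumulative_mb))
    print(f"{name:<28} {build_s / size_mb:.3f} s/MB  cumulative {cumulative_mb} MB")
```

The running total shows what each additional library would cost in cache space: the first three come to 2450 MB per env, and adding tensorflow would take that to 3036 MB.
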

Implementing these options in PR #84.

connortann commented 1 year ago

Ported to https://github.com/slundberg/shap/issues/3045