NorskRegnesentral / shapr

Explaining the output of machine learning models with more accurately estimated Shapley values
https://norskregnesentral.github.io/shapr/

Parallelization #38

Open martinju opened 5 years ago

martinju commented 5 years ago

We should add 4 arguments for parallelization:

  1. One for parallelization of the prediction method (which is passed to prediction_vector).
  2. One for parallelization of the sampling method when either the Gaussian or copula method is used.
  3. One for parallelization over test samples in compute_kshap.
  4. One for parallelization of the distance computation in prepare_kshap.

We should also add a check that either argument 3, or both arguments 1 and 2, are set to 1 core, to avoid nested parallelization (parallel workers that each spawn their own parallel workers).
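
To make the rule concrete, here is a minimal sketch of such a check. It is written in Python purely for illustration and is not shapr's R API; the function name `check_core_settings` and all argument names are hypothetical. The distance-computation argument (item 4) is validated as a core count but is not part of the nesting rule, since the rule above only mentions items 1 to 3.

```python
def check_core_settings(n_cores_predict, n_cores_sampling,
                        n_cores_test_samples, n_cores_distance=1):
    """Reject core settings that would nest parallel regions.

    All names here are hypothetical and only mirror the four proposed
    arguments: prediction (1), sampling (2), test samples (3), distance (4).
    """
    settings = {
        "n_cores_predict": n_cores_predict,
        "n_cores_sampling": n_cores_sampling,
        "n_cores_test_samples": n_cores_test_samples,
        "n_cores_distance": n_cores_distance,
    }
    # Every setting must be a positive whole number of cores.
    for name, value in settings.items():
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")

    # Outer loop (over test samples) runs serially.
    outer_serial = n_cores_test_samples == 1
    # Inner steps (prediction and sampling) both run serially.
    inner_serial = n_cores_predict == 1 and n_cores_sampling == 1

    # Valid if the outer loop is serial, or both inner steps are serial.
    if not (outer_serial or inner_serial):
        raise ValueError(
            "Nested parallelization: set n_cores_test_samples to 1, or set "
            "both n_cores_predict and n_cores_sampling to 1."
        )
```

With this sketch, `check_core_settings(1, 1, 4)` and `check_core_settings(4, 4, 1)` pass, while `check_core_settings(2, 1, 4)` raises a `ValueError`.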

martinju commented 4 years ago

We have implemented parallelization within the prepare_data function of class ctree on the ctree branch, see f8734aa.

martinju commented 3 years ago

Currently, the approach in #244 seems to be the best way to handle this.