NorskRegnesentral / shapr

Explaining the output of machine learning models with more accurately estimated Shapley values
https://norskregnesentral.github.io/shapr/

Comparing different dependency-aware approaches when the size of x_explain is large #387

Closed aliamini-uq closed 2 months ago

aliamini-uq commented 2 months ago

Dear @martinju,

Thanks again for introducing the shapr package. Inspired by the paper "Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features", I want to compare the performance of different dependency-modeling approaches on the Abalone dataset. In that paper, the sizes of x_train and x_explain are 4077 and 100, respectively. However, I want to increase the size of x_explain to 1200. Based on issue #370, there seems to be a limit of 200 on the size of x_explain. Could you please advise what I should do? My ultimate goal is to compare the performance of different approaches (e.g., independence, ctree, vaeac, and empirical) using plot_MSEv_eval_crit(). Thank you very much in advance for your time.

Kind regards, A

aliamini-uq commented 2 months ago

Dear @martinju,

I would be grateful if you could take a look at my problem.

Kind regards, A

martinju commented 2 months ago

There is no limit on the size of x_explain. Just increase n_batches accordingly and it should be fine (in theory). If you want to be on the safe side (and not risk losing anything), you can pass just a few observations from x_explain at a time.
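
The "few at a time" suggestion can be sketched roughly as follows. This is a minimal, library-agnostic illustration in Python, not shapr's API: `explain_fn` is a hypothetical stand-in for whatever call produces explanations for a block of observations (for example a wrapper around shapr's explain()), and `chunk_rows` and `explain_in_chunks` are names made up for this example. It assumes each call returns one result per observation, in order.

```python
from typing import Callable, List, Sequence


def chunk_rows(rows: Sequence, chunk_size: int) -> List[Sequence]:
    """Split `rows` into consecutive chunks of at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def explain_in_chunks(
    rows: Sequence,
    explain_fn: Callable[[Sequence], List],  # hypothetical: one result per row
    chunk_size: int,
) -> List:
    """Call `explain_fn` on one chunk at a time and concatenate the results.

    If a later chunk fails, the results gathered so far are still held in
    `results` inside this function; a real script might save each chunk's
    output to disk before moving on to the next one.
    """
    results: List = []
    for chunk in chunk_rows(rows, chunk_size):
        results.extend(explain_fn(chunk))
    return results
```

With 1200 observations and a chunk size of 100, this makes 12 separate calls. Whether per-chunk outputs can be combined directly for plot_MSEv_eval_crit() is not addressed in this thread, so that step would need to be checked against the shapr documentation.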