edden-gerber / radical-shapley-values

Python code to directly compute "radical" Shapley values for model features, by re-training the model on a subset of features on each iteration.
https://towardsdatascience.com/a-new-perspective-on-shapley-values-the-radical-shapley-method-6c2f4af7f922

Calculation of Shapley values for samples (not model features) #1

Open Tato14 opened 3 years ago

Tato14 commented 3 years ago

I am struggling with a problem that (I think) requires calculating Shapley values for samples rather than features. While looking into this, I ended up at your blog post about the Naïve Shapley method calculation (great resource!).

However, I am not quite sure how to implement this:

I have a pool of samples, s1 to s100, that I want to classify into two categories, A and B. However, in this problem I cannot make predictions for each sample individually, only for groups of 10, and every prediction returns the predicted label and the confidence for each label. Something like:

Sample_group;Prediction;Confidence
[s1,s21,s3,s15,s5,s62,s90,s13,s9,s100];A;0.9
[s1,s5,s12,s20,s53,s89,s27,s42,s76,s55];A;0.4
...

Is there any way to calculate Shapley values from these combinations?

I also asked a similar question on Stack Overflow in case you want to answer there. Thanks!

sb-edden-gerber commented 3 years ago

Hey, it's an interesting problem. I gave a detailed suggestion on Stack Overflow.
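
For readers who land on this thread, here is a minimal sketch of one way to get a per-sample score out of group-level predictions. It is not necessarily the approach from the Stack Overflow answer, and it is not an exact Shapley computation: exact Shapley values need the value of coalitions of every size, while the setup in the question only ever evaluates groups of 10. The sketch falls back on a simpler proxy, the mean score of the groups that contain a sample minus the mean score of the groups that do not. The function name `sample_contributions` and the choice of the reported confidence as the group score are assumptions made for illustration.

```python
from collections import defaultdict


def sample_contributions(observations):
    """Estimate a per-sample contribution from group-level scores.

    observations: iterable of (group, score) pairs, where `group` is a
    collection of sample ids and `score` is the value observed for that
    group (assumed here to be the confidence for label A).

    Returns a dict mapping sample id -> mean score of groups containing
    the sample minus mean score of groups not containing it. Samples
    that appear in every group get None, since there is no contrast.
    This is a heuristic proxy, not an exact Shapley value.
    """
    total = 0.0
    n_groups = 0
    sum_in = defaultdict(float)   # summed score of groups containing each sample
    count_in = defaultdict(int)   # number of groups containing each sample

    for group, score in observations:
        total += score
        n_groups += 1
        for sample in set(group):  # ignore accidental duplicates in a group
            sum_in[sample] += score
            count_in[sample] += 1

    contributions = {}
    for sample, k in count_in.items():
        if k == n_groups:
            contributions[sample] = None
            continue
        mean_in = sum_in[sample] / k
        mean_out = (total - sum_in[sample]) / (n_groups - k)
        contributions[sample] = mean_in - mean_out
    return contributions


# The two rows quoted in the question, with the confidence used as the score.
rows = [
    (["s1", "s21", "s3", "s15", "s5", "s62", "s90", "s13", "s9", "s100"], 0.9),
    (["s1", "s5", "s12", "s20", "s53", "s89", "s27", "s42", "s76", "s55"], 0.4),
]
scores = sample_contributions(rows)
```

With only two rows, samples that occur in both groups (s1 and s5) have no contrast and come back as None. The estimates only become meaningful once each sample has been seen both inside and outside many randomly drawn groups.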