mayer79 / flashlight

Machine learning explanations
https://mayer79.github.io/flashlight/
GNU General Public License v2.0

SHAP values for individual vs aggregate profiles #32

Closed asheetal closed 4 years ago

asheetal commented 4 years ago

Dear @mayer79,

I am trying to create "human-readable" models in the social sciences. Researchers here understand y = ax + bz + c, where y, x, and z have historically been averages of multiple observations. This is a domain idiosyncrasy. From what I see in DALEX, the SHAP scores are for individual profiles. Is it possible to generate additive SHAP scores for aggregate profiles in flashlight to make ML palatable in the social sciences? Or is that a mathematical absurdity?

mayer79 commented 4 years ago

Not sure if I can follow.

SHAP/Breakdown decompose one single prediction into contributions of covariables. It is up to you what you call the "observation" behind this prediction: it can be an individual or an aggregate. Technically, the only requirement is that your model is able to make a prediction for this data row.
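To make the point about aggregate rows concrete, here is a minimal sketch in plain Python (not flashlight or DALEX code). It computes exact Shapley contributions for one row that is itself the mean of several observations. The toy `model`, the zero `baseline`, and the "replace absent features by baseline values" value function are all assumptions chosen for illustration; real SHAP implementations typically average over a background dataset instead.

```python
from itertools import combinations
from math import factorial


def model(row):
    # Hypothetical toy model with an interaction term; it stands in for
    # any fitted model that can predict on a single data row.
    x, z, w = row
    return 2.0 * x + 0.5 * z + x * w


def shapley_values(predict, row, baseline):
    """Exact Shapley contributions of each feature for one prediction.

    Features outside a coalition are set to their baseline value
    (an assumption made for this sketch).
    """
    n = len(row)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [row[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = row[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# An "aggregate profile": the column-wise mean of several observations.
observations = [(1.0, 4.0, 2.0), (3.0, 0.0, 4.0)]
aggregate_row = [sum(col) / len(col) for col in zip(*observations)]

baseline = [0.0, 0.0, 0.0]  # hypothetical reference row
phi = shapley_values(model, aggregate_row, baseline)

# Additivity: contributions sum to prediction minus baseline prediction.
total = sum(phi)
gap = model(aggregate_row) - model(baseline)
```

The decomposition does not care whether `aggregate_row` describes an individual or an average; it only needs the model to return a prediction for that row. Note that, for a nonlinear model, the prediction at the averaged row is generally not the average of the individual predictions.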

I think SHAP is an extremely powerful tool in almost all empirical sciences. However, since SHAP describes a model (and not the underlying data), the model has to be good. By "good", I do not mean that it looks good enough for publication because the right p values are low. Rather, I mean "good" in the statistical sense: no problematic implicit or explicit overfitting, an appropriate model structure, a crystal clear validation strategy, etc. Otherwise, you are just describing a bad model with very good tools.

asheetal commented 4 years ago

Thanks for responding. A typical example I can give is trying to predict company stock growth using 100-200 features. The keras model learns the data fairly well and does a good job on unseen test data. But the model as such is unusable for continuing research unless I provide something that can be represented by a simple additive equation such as stock_growth = 0.5 * CEO_salary + 0.01 * female_in_board ... (and the other 200 variables). Such an approximate model is more palatable to other researchers who wish to use it as a back-of-the-napkin calculation for stock_growth, but not for actually putting money on stocks.

The researchers could then continue this research via experiments or evaluate cluster variances across different industry sectors.

mayer79 commented 4 years ago

It could be that SHAP is not the ideal way to attack this problem. Maybe you could have a look at global surrogate models? I will close the issue as it is not about a code-related problem.
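As a rough illustration of the global surrogate idea, the sketch below fits an additive linear equation to the predictions of a complex model (not to the original response) and reports how faithfully the equation reproduces them. It uses only the Python standard library; `black_box`, the simulated feature rows, and the helper names are hypothetical stand-ins, not part of flashlight or keras.

```python
import random


def black_box(x, z):
    # Hypothetical stand-in for a fitted complex model (e.g. a neural net).
    return 0.5 * x + 0.01 * z + 0.1 * x * z


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_surrogate(rows, preds):
    """Least-squares fit of preds ~ intercept + sum(coef_j * feature_j)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * p for d, p in zip(design, preds)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(rows, preds, coefs):
    """Share of the black-box prediction variance explained by the surrogate."""
    fitted = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]
    mean = sum(preds) / len(preds)
    ss_res = sum((p - f) ** 2 for p, f in zip(preds, fitted))
    ss_tot = sum((p - mean) ** 2 for p in preds)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)
rows = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(500)]
preds = [black_box(x, z) for x, z in rows]  # the target is the model output

coefs = fit_linear_surrogate(rows, preds)   # [intercept, coef_x, coef_z]
fidelity = r_squared(rows, preds, coefs)
```

The `fidelity` value indicates how much of the black-box behaviour the simple equation captures. If it is low, the additive equation is a poor summary of the model, and its coefficients should not be read as a description of it.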