thomasp85 / lime

Local Interpretable Model-Agnostic Explanations (R port of original Python package)
https://lime.data-imaginist.com/

Family in glmnet is always gaussian #190

Open mirka-henninger opened 2 years ago

mirka-henninger commented 2 years ago

When applying LIME to simulated data with a binary outcome, the LIME results do not always match the data-generating process. This happens because the family argument in the call to glmnet is left at its default, gaussian, and so does not reflect the model type (classification versus regression). See, e.g., https://github.com/thomasp85/lime/blob/0281c56e6da697c686e2d7761dfc7a658decb3ca/R/lime.R#L48 and https://github.com/thomasp85/lime/blob/0281c56e6da697c686e2d7761dfc7a658decb3ca/R/lime.R#L56

Is this intentional, or documented somewhere? One possible fix would be to add a family argument to the model_permutations function and pass it on to the glm.fit and glmnet calls. If you'd be willing to accept a corresponding PR, I could prepare one.
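
To illustrate the idea, here is a minimal base-R sketch of what passing the family through could look like. It is not lime's actual code: `fit_surrogate` is a hypothetical helper, the simulated data and the slopes 2 and -1 are made up for the example, and it assumes a 0/1 outcome with unit weights. It uses only `glm.fit` so that it runs without extra packages; `glmnet` also takes a `family` argument (e.g. "gaussian" or "binomial") that could be forwarded the same way.

```r
# Hypothetical helper (not part of lime): fits a weighted local surrogate
# with a caller-supplied family instead of a hard-coded gaussian().
fit_surrogate <- function(x, y, weights, family = gaussian()) {
  # glm.fit() does not add an intercept, so add the column explicitly.
  x_fit <- cbind("(Intercept)" = 1, x)
  fit <- glm.fit(x = x_fit, y = y, weights = weights, family = family)
  fit$coefficients
}

# Simulated binary outcome from a logistic model (assumed slopes: 2 and -1).
set.seed(1)
n <- 500
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
p <- plogis(2 * x[, "x1"] - 1 * x[, "x2"])
y <- rbinom(n, size = 1, prob = p)
w <- rep(1, n)

# Same data, two families: linear-probability scale vs. log-odds scale.
coef_gaussian <- fit_surrogate(x, y, w, family = gaussian())
coef_binomial <- fit_surrogate(x, y, w, family = binomial())

print(rbind(gaussian = coef_gaussian, binomial = coef_binomial))
```

With the binomial family the coefficients are on the log-odds scale of the data-generating process, whereas the gaussian fit returns slopes on the probability scale, so the two sets of weights are not directly comparable.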