interpretml / DiCE

Generate Diverse Counterfactual Explanations for any machine learning model.
https://interpretml.github.io/DiCE/
MIT License
1.35k stars, 186 forks

Is it possible to limit DiCE to only vary one feature at a time for each counterfactual? #186

Open orbidder opened 3 years ago

orbidder commented 3 years ago

Really enjoying this library, so thanks to everyone who has worked on it. I know that we can use the 'features_to_vary' arg to limit DiCE to producing counterfactuals from a subset of features, but is it possible to limit the number of features varied in any given counterfactual? For example, in the credit case study given in the docs, say I want to generate counterfactuals from the features age and hours worked, but for each counterfactual DiCE is constrained to changing the value of only one of these features (so both are options, but only one can be used). Would that be possible currently?

I want to avoid cases where I have a multi-level categorical feature coded by one-hot encoding and DiCE suggests changing more than one level from 0 -> 1. Say I have 'red', 'green' and 'blue', with red omitted as the reference class. In some cases DiCE suggests both green 0 -> 1 and blue 0 -> 1 for a single counterfactual, but the entity cannot belong to both classes at the same time, so this counterfactual is not valid. Currently I filter these out manually after running DiCE, but that seems computationally wasteful.

gaugup commented 3 years ago

@orbidder, thanks for your question. The dice-ml library handles string categoricals, so you don't need to one-hot encode the data before using dice-ml. You only need to identify the numeric/continuous features in your data; dice-ml treats all other features as categoricals and handles them accordingly. Could you give this a try?

I don't see a strong reason to restrict DiCE to varying features one at a time. Depending on your model, varying just one feature at a time may not produce any counterfactual examples.

Regards,

Saladino93 commented 3 years ago

@gaugup I also tried to vary only one feature at a time on the adult data example, with no success: the call just kept me waiting. I do not know whether this is a bug or a consequence of the optimization problem.

orbidder commented 3 years ago

@gaugup, understood that dice-ml can handle string categoricals, but not all models can. For those models, being able to limit DiCE to a single change would help. For the time being, my idea for a workaround is to manually code a loop in which generate_counterfactuals() is given only one of the one-hot-encoded levels to vary at a time. Perhaps that's a short-term solution.
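A rough sketch of that loop, for illustration only. The helper name `counterfactuals_one_feature_at_a_time` is made up here, and `explainer` is assumed to behave like a dice-ml explainer whose generate_counterfactuals() accepts `total_CFs`, `desired_class` and `features_to_vary`. Whether a failed search raises an exception or returns an empty result may depend on the method and version, so the broad `except` is an assumption too.

```python
from typing import Any, Dict, Iterable, Optional


def counterfactuals_one_feature_at_a_time(
    explainer: Any,
    query_instance: Any,
    candidate_features: Iterable[str],
    total_cfs: int = 3,
) -> Dict[str, Optional[Any]]:
    """Run one counterfactual search per candidate feature.

    Each search may only vary a single feature, so no returned
    counterfactual can change two candidate features at once.
    """
    results: Dict[str, Optional[Any]] = {}
    for feature in candidate_features:
        try:
            results[feature] = explainer.generate_counterfactuals(
                query_instance,
                total_CFs=total_cfs,
                desired_class="opposite",
                features_to_vary=[feature],
            )
        except Exception:
            # Assumption: a search that finds nothing may raise.
            # Record the feature as having no counterfactuals.
            results[feature] = None
    return results
```

The cost is one search per feature (or per one-hot level), which is the price of never generating an invalid combination in the first place.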

To your second point about not producing counterfactual examples when varying one feature at a time, that would be fine. I don't think dice-ml should generate counterfactuals when it can't find valid ones, for example by switching two one-hot-encoded features to 1 when in reality the entity can't exist in this state (e.g. it can be red or blue, but it cannot be both). In this case it would be more helpful if dice-ml concluded there were no valid counterfactuals, rather than suggesting impossible ones. Don't you think?

Perhaps an option to define which features are in these one-hot-encoded feature groups would be useful. The user could pass a dict in which each value is a list of strings that identify the columns that belong to the same feature. dice-ml could then only vary one of the features in this group at a time?
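To make the idea concrete, here is a minimal sketch of such a group dict together with the validity check I currently apply by hand. Nothing here is part of dice-ml: `ONE_HOT_GROUPS`, `is_valid_one_hot` and `drop_invalid_counterfactuals` are hypothetical names, and each counterfactual is assumed to be a plain mapping from column name to 0/1.

```python
from typing import Dict, List, Mapping, Sequence

# Hypothetical group definition: feature name -> its one-hot columns.
# 'red' is the omitted reference level, so it has no column of its own.
ONE_HOT_GROUPS: Dict[str, List[str]] = {"colour": ["green", "blue"]}


def is_valid_one_hot(row: Mapping[str, int], groups: Mapping[str, Sequence[str]]) -> bool:
    """Return True if at most one column per group is set to 1."""
    return all(sum(int(row[col]) for col in cols) <= 1 for cols in groups.values())


def drop_invalid_counterfactuals(
    rows: Sequence[Mapping[str, int]], groups: Mapping[str, Sequence[str]]
) -> List[Mapping[str, int]]:
    """Keep only the counterfactuals that respect every one-hot group."""
    return [row for row in rows if is_valid_one_hot(row, groups)]
```

With the reference level dropped, "all zeros" means red, so the check is "at most one" rather than "exactly one". If every level had its own column, the condition would need to be `== 1` instead.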

Saladino93 commented 3 years ago

@gaugup @orbidder Note that in https://arxiv.org/abs/2011.04917 (Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End) they are able to vary features one by one (as they generate the necessity and sufficiency metrics), right @amit-sharma?

I coded these metrics for myself, but when I tried them on the adult data I could not get results, because DiCE did not generate anything. (Note: I use the genetic algorithm. Maybe I should use something else, but I do not trust the random method too much :) )

orbidder commented 3 years ago

@Saladino93 Thanks for sharing that paper. It would be really useful if dice-ml could also provide a function for calculating feature Necessity and Sufficiency too. I think I'd use that a bunch in my work.

amit-sharma commented 3 years ago

Slightly late to this discussion, let me chime in.

  1. The original issue about supporting post-hoc filtering is a good one. It isn't required if the model supports categorical variables, but becomes a problem with one-hot encoded data as @orbidder mentions. However, rather than adding a solution for this specific filter, I wonder if we can think broadly about a general API for post-hoc filters. It will be good to enumerate the different kinds of filters that may be useful. I can imagine a list of helper functions that cater to the most common filters, but also allow users to create their own filter. @orbidder @Saladino93 do you have more post-hoc filters in mind?

  2. Sometimes, DiCE may not generate CFs using a single variable because doing so would require a big change in that feature's value for the input point. @Saladino93 can you try the same code for multiple data inputs in the Adult dataset? My guess is that it should be easier to generate counterfactuals for some inputs than for others.

  3. For necessity and sufficiency, random is actually the best algorithm for generating an unbiased measure of necessity/sufficiency, because the estimate is not influenced by the CF algorithm. It is, however, inefficient in time complexity. Great to hear your interest in these metrics. We are refining them in some follow-up work, and they will be implemented in a future release.
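As a starting point for the discussion in point 1, a general post-hoc filter API could be sketched as below. This is only an illustration of the shape, not an existing or planned dice-ml interface: every name (`CFFilter`, `changed_features`, `max_features_changed`, `apply_filters`) is hypothetical, and counterfactuals are assumed to be plain mappings with the same keys as the original input.

```python
from typing import Any, Callable, List, Mapping, Sequence

Row = Mapping[str, Any]
# A filter receives (original, counterfactual) and returns True to keep it.
CFFilter = Callable[[Row, Row], bool]


def changed_features(original: Row, cf: Row) -> List[str]:
    """Names of the features whose value differs from the original."""
    return [name for name in original if original[name] != cf[name]]


def max_features_changed(limit: int) -> CFFilter:
    """Helper for a common filter: cap the number of changed features."""
    def keep(original: Row, cf: Row) -> bool:
        return len(changed_features(original, cf)) <= limit
    return keep


def apply_filters(original: Row, cfs: Sequence[Row], filters: Sequence[CFFilter]) -> List[Row]:
    """Keep the counterfactuals accepted by every filter."""
    return [cf for cf in cfs if all(f(original, cf) for f in filters)]
```

Under this shape, the one-feature-at-a-time request is `max_features_changed(1)`, and a user-defined filter is any function with the same two-argument signature, for example `lambda original, cf: cf["age"] >= original["age"]`.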

Saladino93 commented 3 years ago

@amit-sharma Sure, will do this. So, just to understand, what you did in your paper was to use the random algorithm then? I always thought it is better to NOT use that as you might lose correlations of the original dataset.

Regarding the post-hoc filtering, what I did is the following:

Define a Filter class with some tabu_changes.

For example:

tabu_changes = {}
tabu_changes['education'] = [['Master', 'Bachelor']]

This means you cannot go from Master to Bachelor.

Then you apply the Filter class to your generated counterfactuals, example by example, and it gives you the minimum number of CFs for which each example respects the relations.

I also implemented tabu_changes for a couple of features taken together, like education and age, where you want to maintain certain relations between them. It can also accept functions (in fact, one could define functions for categorical features too, by defining some order).

This could be a nice feature to add to DiCE (although my understanding is that you are already working on it?). I can also show my implementation.
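In the meantime, a simplified sketch of the single-feature case looks roughly like this. It is a reduced illustration written for this thread rather than the full Filter class: `violates_tabu` and `filter_tabu` are hypothetical names, counterfactuals are assumed to be plain dicts, and the multi-feature relations and function-based rules are left out.

```python
from typing import Any, Dict, List, Mapping, Sequence

# Forbidden transitions per feature, as [from_value, to_value] pairs.
tabu_changes: Dict[str, List[List[str]]] = {}
tabu_changes["education"] = [["Master", "Bachelor"]]


def violates_tabu(
    original: Mapping[str, Any],
    cf: Mapping[str, Any],
    tabu: Mapping[str, Sequence[Sequence[Any]]],
) -> bool:
    """Return True if the counterfactual makes any forbidden transition."""
    for feature, pairs in tabu.items():
        transition = [original[feature], cf[feature]]
        if any(list(pair) == transition for pair in pairs):
            return True
    return False


def filter_tabu(
    original: Mapping[str, Any],
    cfs: Sequence[Mapping[str, Any]],
    tabu: Mapping[str, Sequence[Sequence[Any]]],
) -> List[Mapping[str, Any]]:
    """Drop the counterfactuals that contain a forbidden transition."""
    return [cf for cf in cfs if not violates_tabu(original, cf, tabu)]
```

Only the listed direction is blocked: with the dict above, Master to Bachelor is rejected, while Bachelor to Master is still allowed.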