interpretml / DiCE

Generate Diverse Counterfactual Explanations for any machine learning model.
https://interpretml.github.io/DiCE/
MIT License

High rate of incorrect predictions #12

Closed sina-salek closed 4 years ago

sina-salek commented 4 years ago

Hello,

I'm not sure whether I'm doing something wrong or whether you've encountered this before. Consider the following steps:

In the adult dataset I put aside a validation set, which neither my model nor DiCE has seen. From it, I select the samples with age = 31 and generate counterfactuals for them by varying every feature except age. Apart from that, I am not using any weights. I then use my model to get a prediction on the generated counterfactuals. In many cases, my model's prediction is not what DiCE says it should be.

Is there any way for me to increase DiCE's fidelity to my model?

Thanks

raam93 commented 4 years ago

That's not possible: DiCE explanations are always faithful to the ML model by definition. In other words, we simply tweak the input instance until we get a different prediction from the same ML model. It is difficult to say more without looking at your code. Perhaps you are missing a step when creating the validation set. For instance, are you normalizing the continuous features and one-hot encoding the categorical features in the validation data? You can use DiCE's data interface to create a validation set as follows:

dataset = helpers.load_adult_income_dataset()
d = dice_ml.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')
train, test = d.split_data(d.normalize_data(d.one_hot_encoded_data))
X_test = test.loc[:, test.columns != 'income']
y_test = test.loc[:, test.columns == 'income']
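
The normalization mentioned above can be sketched in plain Python, independent of DiCE. This is a minimal illustration of min-max scaling and its inverse, assuming the training-set range for age is 17 to 90 (an assumption consistent with the normalized value 0.1917808219178082 quoted in this thread; check `d.train_df['age']` for the real bounds). The function names are hypothetical and not part of the DiCE API.

```python
def min_max_normalize(value, lo, hi):
    """Scale a raw value from the range [lo, hi] to [0, 1]."""
    return (value - lo) / (hi - lo)


def min_max_denormalize(scaled, lo, hi):
    """Invert min_max_normalize: map a [0, 1] value back to [lo, hi]."""
    return scaled * (hi - lo) + lo


# Assumed training-set bounds for 'age'; replace with the real min and max.
AGE_MIN, AGE_MAX = 17, 90

normalized_age = min_max_normalize(31, AGE_MIN, AGE_MAX)
restored_age = min_max_denormalize(normalized_age, AGE_MIN, AGE_MAX)

print(normalized_age)          # the value a model trained on scaled data expects
print(round(restored_age, 6))  # back to the raw age
```

A model trained on scaled inputs needs the same bounds applied at prediction time, so a validation set scaled with different bounds (or left unscaled) would give predictions that do not match.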

For your reference, here is sample code that implements your logic and gave me valid results:

import dice_ml
from dice_ml.utils import helpers

import tensorflow as tf
from tensorflow import keras

print(tf.__version__) # 2.1.0

# creating a testing dataset - the inbuilt ML model in DiCE for adult data is trained only on the 'train' data below
dataset = helpers.load_adult_income_dataset()
d = dice_ml.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')
train, test = d.split_data(d.normalize_data(d.one_hot_encoded_data))
X_test = test.loc[:, test.columns != 'income']
y_test = test.loc[:, test.columns == 'income']

# get normalized age=31; this should equal 0.1917808219178082
normalized_age = (31 - d.train_df['age'].min()) / (d.train_df['age'].max() - d.train_df['age'].min())

# we can verify if the above number is correct using the following
# (normalized_age*(d.train_df['age'].max() - d.train_df['age'].min())) + d.train_df['age'].min() # should give you 31

my_test = X_test[X_test['age']==normalized_age]
print(my_test.shape) # (187,29)
# there are 187 test instances with age = 31; I'm choosing the first one below as an example.

# create a test instance dictionary
my_test_instance = {}
for feature in d.feature_names:
    if feature in d.continuous_feature_names:
        my_test_instance[feature] = (my_test[feature].iloc[0]*(d.train_df[feature].max() - d.train_df[feature].min())) + d.train_df[feature].min()
    else:
        encoded_features = [feat for feat in d.encoded_feature_names if feat.startswith(feature)]
        for encoded_feat in encoded_features:
            if my_test.iloc[0][encoded_feat] == 1.0:
                my_test_instance[feature] = encoded_feat.split(feature+'_')[1]

print(my_test_instance)
# {'age': 31.0,
#  'workclass': 'Private',
#  'education': 'HS-grad',
#  'marital_status': 'Single',
#  'occupation': 'Blue-Collar',
#  'race': 'White',
#  'gender': 'Female',
#  'hours_per_week': 40.0}

d = dice_ml.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')

backend = 'TF'+tf.__version__[0] # TF2
ML_modelpath = helpers.get_adult_income_modelpath(backend=backend)
m = dice_ml.Model(model_path= ML_modelpath, backend=backend)

exp = dice_ml.Dice(d, m)

# changing every feature except age
dice_exp = exp.generate_counterfactuals(my_test_instance, total_CFs=4, desired_class="opposite", features_to_vary=['workclass', 'education', 'marital_status', 'occupation', 'race', 'gender', 'hours_per_week'])

# visualize the results
dice_exp.visualize_as_list(show_only_changes=True) # prints the following
# Query instance (original outcome : 0)
# [31.0, 'Private', 'HS-grad', 'Single', 'Blue-Collar', 'White', 'Female', 40.0, 0.019464194774627686]
#
# Diverse Counterfactual set (new outcome : 1)
# ['-', 'Self-Employed', '-', 'Married', 'White-Collar', '-', '-', 48.0, 0.75]
# ['-', '-', 'Doctorate', 'Married', '-', '-', '-', 26.0, 0.697]
# ['-', '-', 'Masters', 'Married', '-', '-', '-', '-', 0.749]
# ['-', '-', 'Prof-school', 'Married', '-', '-', '-', 58.0, 0.858]

# To check that the predictions are indeed equal
for ix, cf in enumerate(exp.final_cfs):
    model_pred = exp.predict_fn(cf)
    cf_pred = exp.cfs_preds[ix]
    print(model_pred, cf_pred)
# prints the following
# [[0.75035185]] [[0.75035185]]
# [[0.69670826]] [[0.69670826]]
# [[0.73952144]] [[0.73952144]]
# [[0.85771877]] [[0.85771877]]
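
One fragile point in the sample above is the filter `X_test['age']==normalized_age`, which relies on exact floating-point equality. It worked here because both sides are computed the same way, but a tolerance-based match is more robust if the normalized values come from a different code path. The following is a minimal pure-Python sketch with toy values; the helper name and the rows are hypothetical.

```python
import math


def rows_matching(rows, column, target, abs_tol=1e-9):
    """Return the rows whose `column` value equals `target` within abs_tol."""
    return [
        row for row in rows
        if math.isclose(row[column], target, rel_tol=0.0, abs_tol=abs_tol)
    ]


# Toy rows: 0.1 + 0.2 is mathematically 0.3 but differs in floating point.
rows = [
    {"age": 0.3},
    {"age": 0.1 + 0.2},
    {"age": 0.5},
]
target = 0.3

exact_matches = [row for row in rows if row["age"] == target]
tolerant_matches = rows_matching(rows, "age", target)

print(len(exact_matches), len(tolerant_matches))
```

With pandas, a comparable filter would be `X_test[numpy.isclose(X_test['age'], normalized_age)]`.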

sina-salek commented 4 years ago

Thanks! It's very kind of you to provide the code. It helped me find my bug.