interpretml / DiCE

Generate Diverse Counterfactual Explanations for any machine learning model.
https://interpretml.github.io/DiCE/
MIT License
1.36k stars 188 forks

Saving the generated examples as a dataframe #119

Closed Arnims closed 3 years ago

Arnims commented 3 years ago

Hello everyone,

I am looking for a way to export the generated examples to something like a pandas dataframe. Maybe I am missing something, but I have been trying for some time and could not find a solution. I hope this is possible. The output of ".visualize_as_dataframe()" already looks like a dataframe, but I do not see a way to save the displayed examples. I also tried the "to_json()" method, but I could not figure out how to get from there to a dataframe.

I am grateful for any suggestion. Best regards, Arnim

gaugup commented 3 years ago

@Arnims,

Which class are you using: CounterfactualExamples or CounterfactualExplanations? CounterfactualExamples has a 'final_cfs_df' field that holds the counterfactual examples as a dataframe. Could you try the 'final_cfs_df' field of CounterfactualExamples?

Regards,

noureini commented 3 years ago

@gaugup I also have the same issue. The documentation on CounterfactualExamples is not clear.

To generate the counterfactuals:

genetic_lait = exp_genetic_lait.generate_counterfactuals(
    query_instances_lait,
    total_CFs=3,
    features_to_vary=features_vary,
    proximity_weight=2.5,
    diversity_weight=0.01,
    desired_range=[1, 2],
)
genetic_lait.visualize_as_dataframe(show_only_changes=True)

How can I save the generated counterfactuals?

Regards,

Noureini

amit-sharma commented 3 years ago

@noureini @Arnims In the above example, you can use genetic_lait.final_cfs_df to get the pandas dataframe (as mentioned by @gaugup). If you enabled the sparsity parameter (available for some methods), you can also access genetic_lait.final_cfs_df_sparse.

Thanks for raising this. We will be adding a property method for you to access the dataframe easily.

Arnims commented 3 years ago

@gaugup @noureini @amit-sharma

Thank you for your replies. I used the package to generate counterfactuals like this:

e1 = exp.generate_counterfactuals(X[0:1], total_CFs=2, features_to_vary='all')

The generated object e1 is then of type "CounterfactualExplanations", which does not seem to have the "final_cfs_df" attribute. I actually found a workaround that does not use the dice package at all, but I guess other users might benefit from a simple way to save the counterfactuals. So, thank you for your efforts.

noureini commented 3 years ago

@amit-sharma @Arnims @gaugup Thanks for your responses.

As pointed out by @Arnims, in the example cases e1 and genetic_lait are of type "CounterfactualExplanations", which does not have "final_cfs_df". Is there another way? Perhaps by converting the "CounterfactualExplanations" object to a "CounterfactualExamples" object?

@Arnims Can you please share the workaround you applied?

Regards

amit-sharma commented 3 years ago

Ah, in that case, the following should work:

e1.cf_examples_list[0].final_cfs_df

The CounterfactualExplanations class stores a list of CounterfactualExamples objects, one for each test input given.

Tested using this code:

e1 = exp.generate_counterfactuals(x_train[0:1], total_CFs=2, desired_class="opposite")
e1.visualize_as_dataframe(show_only_changes=True)
e1.cf_examples_list[0].final_cfs_df

About the index [0]: if you passed multiple input points to generate_counterfactuals, use the corresponding index (1, 2, and so on) to get the dataframe for the second input, third input, and so on.
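
For several query instances, the per-input dataframes can be stacked into a single one before saving. The following is only a minimal sketch: it relies on nothing beyond the cf_examples_list and final_cfs_df / final_cfs_df_sparse attributes named in this thread, while the helper name collect_counterfactuals and the query_index column are hypothetical and not part of DiCE.

```python
import pandas as pd


def collect_counterfactuals(explanations, prefer_sparse=False):
    """Stack the counterfactual dataframes of all query instances into one dataframe.

    Assumption: `explanations` exposes `cf_examples_list`, and each item carries
    `final_cfs_df` (and possibly `final_cfs_df_sparse`), as described in this thread.
    """
    frames = []
    for i, examples in enumerate(explanations.cf_examples_list):
        df = None
        if prefer_sparse:
            # Use the sparse version only if it is actually present.
            df = getattr(examples, "final_cfs_df_sparse", None)
        if df is None:
            df = getattr(examples, "final_cfs_df", None)
        if df is None:
            continue  # nothing stored for this input
        df = df.copy()
        # Hypothetical helper column: position of the query instance in the input.
        df.insert(0, "query_index", i)
        frames.append(df)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

With the e1 object from above, collect_counterfactuals(e1).to_csv("counterfactuals.csv", index=False) would then write all counterfactuals to one file (the file name is just an example).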

noureini commented 3 years ago

@amit-sharma Thanks, it works.

asha24choudhary commented 9 months ago

Hi @amit-sharma, I can save the df using the approach you mentioned earlier, but I was wondering whether I can save only the changes rather than the whole df?
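
One possible approach, sketched here with plain pandas rather than any DiCE feature, is to compare each counterfactual row with the original query instance and mask the values that did not change. The function name only_changes and the "-" marker are hypothetical choices. The query row has to be supplied by the caller, and columns missing from it (such as the predicted outcome) are left untouched.

```python
import pandas as pd


def only_changes(query_row, cfs_df, unchanged_marker="-"):
    """Return a copy of `cfs_df` with values equal to the query instance masked.

    `query_row` is a pandas Series holding the original feature values.
    Values are compared with exact equality, so float features that differ
    only by rounding still count as changed.
    """
    out = cfs_df.astype(object)  # object dtype so a string marker fits any column
    for col in cfs_df.columns:
        if col not in query_row.index:
            continue  # e.g. the outcome column, which the query row may lack
        same = cfs_df[col] == query_row[col]
        out.loc[same, col] = unchanged_marker
    return out
```

Assuming X is the pandas dataframe that was passed to generate_counterfactuals, only_changes(X.iloc[0], e1.cf_examples_list[0].final_cfs_df).to_csv("changes.csv", index=False) would save just the changed values for the first input.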