Contrastive Explanations

Anchors can explain any decision of a model, e.g. the label it predicted for an instance. However, the explained label does not have to be the one the model actually predicted; it can be chosen freely.

So we can ask for an explanation of a decision the model has not made. This would reveal what would motivate it to classify an instance differently, even though it didn't.
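To make the idea concrete, here is a minimal sketch that does not use the javaAnchorExplainer API at all. It assumes a hypothetical toy model over three binary features and a hand-picked candidate rule, and it measures the rule's precision by enumerating every perturbation instead of sampling. All names (`ContrastiveSketch`, `predict`, `precision`) are made up for illustration.

```java
class ContrastiveSketch {
    // Hypothetical toy model (an assumption, not part of the library):
    // label 1 if feature 0 is set and at least one of features 1 and 2 is set.
    static int predict(int[] x) {
        return (x[0] == 1 && (x[1] == 1 || x[2] == 1)) ? 1 : 0;
    }

    // Precision of a candidate rule for a freely chosen label: the fraction of
    // perturbations that keep the fixed features at the instance's values and
    // are classified as targetLabel. Enumerates all binary perturbations.
    static double precision(int[] instance, boolean[] fixed, int targetLabel) {
        int n = instance.length;
        int total = 0;
        int hits = 0;
        for (int mask = 0; mask < (1 << n); mask++) {
            int[] x = new int[n];
            boolean consistent = true;
            for (int i = 0; i < n; i++) {
                x[i] = (mask >> i) & 1;
                if (fixed[i] && x[i] != instance[i]) {
                    consistent = false;
                }
            }
            if (!consistent) {
                continue;
            }
            total++;
            if (predict(x) == targetLabel) {
                hits++;
            }
        }
        return (double) hits / total;
    }

    public static void main(String[] args) {
        int[] instance = {1, 0, 0};
        boolean[] rule = {true, false, false}; // candidate rule: "feature 0 = 1"
        System.out.println("predicted label:       " + predict(instance));
        System.out.println("precision for label 0: " + precision(instance, rule, 0));
        System.out.println("precision for label 1: " + precision(instance, rule, 1));
    }
}
```

For this toy setup the model predicts label 0, yet the rule "feature 0 = 1" has precision 0.25 for label 0 and 0.75 for label 1. In other words, that feature speaks for the label the model did not choose.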

I'd like to start a discussion about how this information could be used.

Visualization is surely one use case. It would be possible and helpful to show some sort of matrix for an explanation that displays which features voted for and which voted against the decision. Any more ideas?
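One possible shape for such a matrix, as a rough sketch only: rows are features, columns are labels, and each cell is the change in precision for that label when the feature alone is fixed to the instance's value, compared with fixing nothing. This scoring is a simple heuristic chosen for illustration, not what Anchors computes, and the toy model and all names (`VoteMatrixSketch`, `voteMatrix`) are hypothetical.

```java
import java.util.Locale;

class VoteMatrixSketch {
    // Hypothetical toy model over three binary features (an assumption).
    static int predict(int[] x) {
        return (x[0] == 1 && (x[1] == 1 || x[2] == 1)) ? 1 : 0;
    }

    // Fraction of all binary perturbations classified as label, optionally
    // keeping one feature at the instance's value (fixedFeature = -1: none).
    static double precision(int[] instance, int fixedFeature, int label) {
        int n = instance.length;
        int total = 0;
        int hits = 0;
        for (int mask = 0; mask < (1 << n); mask++) {
            int[] x = new int[n];
            for (int i = 0; i < n; i++) {
                x[i] = (mask >> i) & 1;
            }
            if (fixedFeature >= 0 && x[fixedFeature] != instance[fixedFeature]) {
                continue;
            }
            total++;
            if (predict(x) == label) {
                hits++;
            }
        }
        return (double) hits / total;
    }

    // votes[feature][label] > 0: the feature value votes for the label,
    // votes[feature][label] < 0: it votes against the label.
    static double[][] voteMatrix(int[] instance, int numLabels) {
        double[][] votes = new double[instance.length][numLabels];
        for (int label = 0; label < numLabels; label++) {
            double baseline = precision(instance, -1, label);
            for (int f = 0; f < instance.length; f++) {
                votes[f][label] = precision(instance, f, label) - baseline;
            }
        }
        return votes;
    }

    public static void main(String[] args) {
        int[] instance = {1, 0, 0};
        double[][] votes = voteMatrix(instance, 2);
        for (int f = 0; f < votes.length; f++) {
            System.out.printf(Locale.ROOT, "feature %d = %d: label 0 %+.3f, label 1 %+.3f%n",
                    f, instance[f], votes[f][0], votes[f][1]);
        }
    }
}
```

In this toy case feature 0 votes against the predicted label 0 (-0.375) and for label 1 (+0.375), while features 1 and 2 each vote for label 0 (+0.125). Rendered as a heat map, the column of the label that was not predicted would be the contrastive part of the picture.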

viadee / javaAnchorExplainer

Contrastive Explanations #14