christophM / interpretable-ml-book

Book about interpretable machine learning
https://christophm.github.io/interpretable-ml-book/

Certainty of model predictions and explainers are two different things #328

Open azqanadeem opened 2 years ago

azqanadeem commented 2 years ago

In Section 3.5, Properties of Individual Explanations, "Certainty" covers only the confidence of the ML model, not the confidence of the explainer itself. Both matter: recent research on adversarial attacks against XAI methods [1] has shown that the model prediction and the explanation can each be manipulated, individually or together, in whatever direction an adversary chooses. So in addition to reporting how confident a model is about a given prediction, it is also important to report how confident the explainer is about the explanation it produces.

[1] Dombrowski, A. K., Alber, M., Anders, C., Ackermann, M., Müller, K. R., & Kessel, P. (2019). Explanations can be manipulated and geometry is to blame. Advances in Neural Information Processing Systems, 32.
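One simple way to attach a confidence signal to an explanation (not something the book or the cited paper prescribes, just an illustrative sketch) is a SmoothGrad-style scheme: perturb the input with noise, recompute the attribution each time, and report the spread of the attributions alongside their mean. The toy model, the `sigma` and `n_samples` values, and the use of the standard deviation as the uncertainty measure are all assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy differentiable model: f(x) = w . tanh(x)  (illustrative only)
w = np.array([1.0, -2.0, 0.5])

def predict(x):
    return w @ np.tanh(x)

def saliency(x):
    # Analytic gradient of f w.r.t. x: w * (1 - tanh(x)^2),
    # i.e. a standard gradient-based attribution for this toy model
    return w * (1.0 - np.tanh(x) ** 2)

def explanation_with_uncertainty(x, n_samples=200, sigma=0.1):
    # SmoothGrad-style estimate: recompute the saliency map under
    # Gaussian input noise, then report the per-feature mean attribution
    # and its standard deviation as a rough explainer-confidence signal
    grads = np.array([
        saliency(x + rng.normal(0.0, sigma, size=x.shape))
        for _ in range(n_samples)
    ])
    return grads.mean(axis=0), grads.std(axis=0)

x = np.array([0.3, -0.1, 1.2])
mean_attr, std_attr = explanation_with_uncertainty(x)
print("attribution:", mean_attr)
print("uncertainty:", std_attr)
```

A large standard deviation on a feature's attribution would indicate the explanation is unstable in that region of input space, which is exactly the kind of fragility the Dombrowski et al. attacks exploit.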