id: "are-explanations-helpful-a-comparative-study-of-the-effects-of-explanations-in-ai-assisted-decision-making"
aliases:
- "Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making"
tags:
- "TSUNDOKU"
- "XAI"

Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making

This paper contributes to the growing literature in empirical evaluation of explainable AI (XAI) methods by presenting a comparison on the effects of a set of established XAI methods in AI-assisted decision making. Specifically, based on our review of previous literature, we highlight three desirable properties that ideal AI explanations should satisfy—improve people’s understanding of the AI model, help people recognize the model uncertainty, and support people’s calibrated trust in the model. Through randomized controlled experiments, we evaluate whether four types of common model-agnostic explainable AI methods satisfy these properties on two types of decision making tasks where people perceive themselves as having different levels of domain expertise in (i.e., recidivism prediction and forest cover prediction). Our results show that the effects of AI explanations are largely different on decision making tasks where people have varying levels of domain expertise in, and many AI explanations do not satisfy any of the desirable properties for tasks that people have little domain expertise in. Further, for decision making tasks that people are more knowledgeable, feature contribution explanation is shown to satisfy more desiderata of AI explanations, while the explanation that is considered to resemble how human explain decisions (i.e., counterfactual explanation) does not seem to improve calibrated trust. We conclude by discussing the implications of our study for improving the design of XAI methods to better support human decision making.

URL

https://dl.acm.org/doi/10.1145/3397481.3450650

What effects do explanations have in AI advice?

TITLE

Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making

In a nutshell

What effects do explanations have in AI advice?

Paper link

Authors / Affiliations

Wang, Xinru; Yin, Ming

Publication date (yyyy/MM/dd)

IUI2021

What is impressive compared with prior work?

Investigates how the methods used to explain AI advice affect human cognition.

Splits what people want from a machine's explanations into three desiderata:

Understanding: Explanations of an AI should improve people's understanding of it.
Uncertainty awareness: Explanations of an AI should help people recognize the uncertainty underlying an AI prediction and nudge people to rely on the model more on high confidence predictions when the model's confidence is calibrated.
Trust calibration: Explanations of an AI should support people's calibrated trust in the model.
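The second and third desiderata describe behavior that can be checked from logged decisions. The following is a minimal sketch of one way to do that, not necessarily the measures used in the paper: the `Trial` record, the 0.8 confidence threshold, and both gap functions are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One AI-assisted decision (hypothetical record layout)."""
    model_confidence: float  # model's confidence in its own prediction
    model_correct: bool      # whether the model's prediction was right
    human_agrees: bool       # whether the person's final decision matched the model


def agreement_rate(trials):
    """Fraction of trials in which the person followed the model."""
    if not trials:
        return float("nan")
    return sum(t.human_agrees for t in trials) / len(trials)


def uncertainty_awareness_gap(trials, threshold=0.8):
    """Agreement on high-confidence minus low-confidence predictions.

    A positive gap suggests people rely on the model more when it is confident.
    The threshold is an arbitrary illustrative choice.
    """
    high = [t for t in trials if t.model_confidence >= threshold]
    low = [t for t in trials if t.model_confidence < threshold]
    return agreement_rate(high) - agreement_rate(low)


def trust_calibration_gap(trials):
    """Agreement when the model is correct minus agreement when it is wrong.

    A positive gap suggests trust is calibrated to the model's actual correctness.
    """
    right = [t for t in trials if t.model_correct]
    wrong = [t for t in trials if not t.model_correct]
    return agreement_rate(right) - agreement_rate(wrong)
```

Comparing these gaps between a group that sees explanations and a control group that does not would indicate whether an explanation type helps with either desideratum.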

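What is the key to the technique/method?

How was it validated?

Two tasks were prepared: recidivism prediction on the COMPAS dataset and forest cover prediction.

Four types of AI explanation are used: a) feature importance, b) feature contribution, c) nearest-neighbor examples, d) counterfactual examples.

What people need from an explanation differs depending on how much knowledge they have of the target task.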
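To make the difference between the four explanation types concrete, here is a minimal sketch on a toy logistic model. Everything in it is an assumption for illustration: the feature names, weights, and the simple one-feature counterfactual search are hypothetical and are not the model or the explanation generators used in the paper.

```python
import math

# Hypothetical toy model: logistic regression with hand-picked weights.
WEIGHTS = {"prior_offenses": 0.8, "age": -0.05, "charge_severity": 0.5}
BIAS = -0.5


def predict_proba(x):
    """Probability of the positive class for one instance (a dict of features)."""
    z = BIAS + sum(WEIGHTS[f] * x[f] for f in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def feature_importance():
    """a) Global: features ranked by absolute weight, identical for every instance."""
    return sorted(WEIGHTS, key=lambda f: abs(WEIGHTS[f]), reverse=True)


def feature_contribution(x):
    """b) Local: each feature's signed contribution to this instance's logit."""
    return {f: WEIGHTS[f] * x[f] for f in WEIGHTS}


def nearest_neighbor(x, training_set):
    """c) Example-based: the training instance closest to x (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in WEIGHTS))
    return min(training_set, key=lambda row: dist(x, row))


def counterfactual(x, feature, step, max_steps=100):
    """d) Example-based: first change along one feature that flips the prediction.

    Returns the modified copy of x, or None if no flip occurs within max_steps.
    """
    original = predict_proba(x) >= 0.5
    candidate = dict(x)  # copy so the input instance is left untouched
    for _ in range(max_steps):
        candidate[feature] += step
        if (predict_proba(candidate) >= 0.5) != original:
            return candidate
    return None
```

Feature importance describes the model as a whole, feature contribution describes a single prediction, and the last two explain by pointing at other instances rather than at features.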

What to read next?

I want to look at domain-specific discussions.
There also seems to be related work on trust.
counterfactual:
- 12: [[counterfactuals-in-explainable-artificial-intelligence-xai-evidence-from-human-reasoning|Counterfactuals in Explainable Artificial Intelligence (XAI): Evidence from Human Reasoning]]
- 54: [[explanation-in-artificial-intelligence-insights-from-the-social-sciences|Explanation in artificial intelligence: Insights from the social sciences]]
XAI:
- 14: [[feature-based-explanations-dont-help-people-detect-misclassifications-of-online-toxicity|Feature-Based Explanations Don't Help People Detect Misclassifications of Online Toxicity]]
- 16: [[explaining-decision-making-algorithms-through-ui-strategies-to-help-non-expert-stakeholders|Explaining Decision-Making Algorithms through UI: Strategies to Help Non-Expert Stakeholders]]
- 41:
- 69: [[how-do-visual-explanations-foster-end-users-appropriate-trust-in-machine-learning|How do visual explanations foster end users' appropriate trust in machine learning?]]
- 17: [[are-visual-explanations-useful-a-case-study-in-model-in-the-loop-prediction|Are visual explanations useful? a case study in model-in-the-loop prediction]]
AI decision:
- 45: [[investigating-intelligibility-for-uncertain-context-aware-applications|Investigating intelligibility for uncertain context-aware applications]]
- 46: [[why-and-why-not-explanations-improve-the-intelligibility-of-context-aware-intelligent-systems|Why and why not explanations improve the intelligibility of context-aware intelligent systems]]
model explanation:
- 24: [[towards-a-rigorous-science-of-interpretable-machine-learning|Towards A Rigorous Science of Interpretable Machine Learning]]
- 59: [[manipulating-and-measuring-model-interpretability|Manipulating and Measuring Model Interpretability]]
Understanding:
- foo
  - 18: [[instance-level-explanations-for-fraud-detection-a-case-study|Instance-Level Explanations for Fraud Detection: A Case Study]]
  - 33: [[gamut-a-design-probe-to-understand-how-data-scientists-understand-machine-learning-models|Gamut: A design probe to understand how data scientists understand machine learning models]]
- bar
  - 16:
  - 18:
  - 29: [[peeking-inside-the-black-box-visualizing-statistical-learning-with-plots-of-individual-conditional-expectation|Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation]]
- foo
  - 16
  - 24:
  - 40:
  - 47: [[the-mythos-of-model-interpretability-in-machine-learning-the-concept-of-interpretability-is-both-important-and-slippery|The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery]]
  - 59:
- bar
  - 7: [[its-reducing-a-human-being-to-a-percentage-perceptions-of-justice-in-algorithmic-decisions|'It's Reducing a Human Being to a Percentage': Perceptions of Justice in Algorithmic Decisions]]
  - 16:
  - 24:
  - 33:
  - 54
Uncertainty:
- 71: [[effect-of-confidence-and-explanation-on-accuracy-and-trust-calibration-in-ai-assisted-decision-making|Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making]]
Trust calibration:
- 5: [[beyond-accuracy-the-role-of-mental-models-in-human-ai-team-performance|Beyond accuracy: The role of mental models in human-AI team performance]]
- 6:
- 11: [[proxy-tasks-and-subjective-measures-can-be-misleading-in-evaluating-explainable-ai-systems|Proxy tasks and subjective measures can be misleading in evaluating explainable ai systems]]
- 69
- 71
Social justice:
- 8
- 19
- 38

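On tasks where people have domain knowledge

On tasks where people lack expertise, many of the AI explanations were shown to satisfy none of the desirable properties.

On tasks where people have more expertise, the feature contribution explanation was shown to satisfy more of the desiderata than the other AI explanations.

Counterfactual-style explanations were not found to improve calibrated trust.

On the task where people lack domain knowledge, most of the four AI explanations satisfied none of the three desiderata.

On the task where people have domain knowledge (recidivism prediction), the feature contribution explanation came out best.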

Desiderata:

Understanding
Uncertainty Awareness
Trust Calibration

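The study found that a range of XAI methods are not effective at supporting human decision making on tasks where people have little expertise, so the discussion focuses on that point.

Without expertise, the information in an explanation is not helpful because it is mentally taxing. Could it be that the effort of understanding the explanation uses up processing capacity?
With expertise, people can weigh the explanation against their own knowledge; without it, that comparison is not possible.

It follows that, when attaching explanations, the explanation method should be chosen based on the knowledge the person already has. The method also needs to fit the user's goal: how an explanation should be provided can change depending on whether the aim is to increase understanding of the model or to support the user's decision.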

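The effects of model explanations differ dramatically across tasks where people have varying levels of expertise. Notably, on decision-making tasks that people are not familiar with, most of the established AI explanations satisfy none of the three desiderata. Feature importance explanations increase both objective and subjective understanding of the AI model.

Of the four types of explanation examined, the feature contribution explanation appears to satisfy more of the desiderata of AI explanations when people have some domain expertise in the decision-making task.
For the two example-based explanations in this study, only minimal evidence was found of their ability to support trust calibration. In particular, the counterfactual explanation, which is said to closely resemble how humans explain decisions, does help deepen people's understanding of the AI model, but only on tasks where people have some expertise. However, the improved model understanding brought by counterfactual explanations does not help people calibrate their trust in the model.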
