rushrukh / awesome-explainable-ai

A repository for summaries of recent explainable AI/Interpretable ML approaches

Recent Publications in Explainable AI

A repository containing recent explainable AI/Interpretable ML approaches

2015

Title Venue Year Code Keywords Summary
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission KDD 2015 N/A ``
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model arXiv 2015 N/A ``

2016

Title Venue Year Code Keywords Summary
Interpretable Decision Sets: A Joint Framework for Description and Prediction KDD 2016 N/A ``
"Why Should I Trust You?": Explaining the Predictions of Any Classifier KDD 2016 N/A ``
Towards A Rigorous Science of Interpretable Machine Learning arXiv 2017 N/A Review Paper

2017

Title Venue Year Code Keywords Summary
Transparency: Motivations and Challenges arXiv 2017 N/A Review Paper
A Unified Approach to Interpreting Model Predictions NeurIPS 2017 N/A ``
SmoothGrad: removing noise by adding noise ICML (Workshop) 2017 Github ``
Axiomatic Attribution for Deep Networks ICML 2017 N/A ``
Learning Important Features Through Propagating Activation Differences ICML 2017 N/A ``
Understanding Black-box Predictions via Influence Functions ICML 2017 N/A ``
Network Dissection: Quantifying Interpretability of Deep Visual Representations CVPR 2017 N/A ``

2018

Title Venue Year Code Keywords Summary
Explainable Prediction of Medical Codes from Clinical Text ACL 2018 N/A ``
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) ICML 2018 N/A ``
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR HJLT 2018 N/A ``
Sanity Checks for Saliency Maps NeurIPS 2018 N/A ``
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions AAAI 2018 N/A ``
The Mythos of Model Interpretability arXiv 2018 N/A Review Paper
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead Nature Machine Intelligence 2018 N/A ``

2019

Title Venue Year Code Keywords Summary
Human Evaluation of Models Built for Interpretability AAAI 2019 N/A Human in the loop
Data Shapley: Equitable Valuation of Data for Machine Learning ICML 2019 N/A ``
Attention is not Explanation ACL 2019 N/A ``
Actionable Recourse in Linear Classification FAccT 2019 N/A ``
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead Nature Machine Intelligence 2019 N/A ``
Explanations can be manipulated and geometry is to blame NeurIPS 2019 N/A ``
Learning Optimized Risk Scores JMLR 2019 N/A ``
Explain Yourself! Leveraging Language Models for Commonsense Reasoning ACL 2019 N/A ``
Deep Neural Networks Constrained by Decision Rules AAAI 2019 N/A ``
Towards Automatic Concept-based Explanations NeurIPS 2019 Github ``

2020

Title Venue Year Code Keywords Summary
Interpreting the Latent Space of GANs for Semantic Face Editing CVPR 2020 N/A ``
GANSpace: Discovering Interpretable GAN Controls NeurIPS 2020 N/A ``
Explainability for fair machine learning arXiv 2020 N/A ``
An Introduction to Circuits Distill 2020 N/A Tutorial
Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses NeurIPS 2020 N/A ``
Learning Model-Agnostic Counterfactual Explanations for Tabular Data WWW 2020 N/A ``
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods AIES (AAAI) 2020 N/A ``
Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning CHI 2020 N/A Review Paper
Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs arXiv 2020 N/A Review Paper
Human-Driven FOL Explanations of Deep Learning IJCAI 2020 N/A 'Logic Explanations'
A Constraint-Based Approach to Learning and Explanation AAAI 2020 N/A 'Mutual Information'

2021

Title Venue Year Code Keywords Summary
A Learning Theoretic Perspective on Local Explainability ICLR (Poster) 2021 N/A ``
Do Input Gradients Highlight Discriminative Features? NeurIPS 2021 N/A ``
Explaining by Removing: A Unified Framework for Model Explanation JMLR 2021 N/A ``
Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience PACMHCI 2021 N/A ``
Towards Robust and Reliable Algorithmic Recourse NeurIPS 2021 N/A ``
A Framework to Learn with Interpretation NeurIPS 2021 N/A ``
Algorithmic Recourse: from Counterfactual Explanations to Interventions FAccT 2021 N/A ``
Manipulating and Measuring Model Interpretability CHI 2021 N/A ``
Explainable Reinforcement Learning via Model Transforms NeurIPS 2021 N/A ``
Aligning Artificial Neural Networks and Ontologies towards Explainable AI AAAI 2021 N/A ``

2022

Title Venue Year Code Keywords Summary
GlanceNets: Interpretable, Leak-proof Concept-based Models CRL 2022 N/A ``
Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases Transformer Circuit Thread 2022 N/A Tutorial
Can language models learn from explanations in context? EMNLP 2022 N/A DeepMind
Interpreting Language Models with Contrastive Explanations EMNLP 2022 N/A ``
Acquisition of Chess Knowledge in AlphaZero PNAS 2022 N/A DeepMind GoogleBrain
What the DAAM: Interpreting Stable Diffusion Using Cross Attention arXiv 2022 Github ``
Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis AISTATS 2022 N/A ``
Use-Case-Grounded Simulations for Explanation Evaluation NeurIPS 2022 N/A ``
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective arXiv 2022 N/A ``
What Makes a Good Explanation?: A Harmonized View of Properties of Explanations arXiv 2022 N/A ``
NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights AAAI 2022 Github ``
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations AIES (AAAI) 2022 N/A ``
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Models arXiv 2022 Github ``
Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off NeurIPS 2022 Github CBM, CEM
Self-explaining deep models with logic rule reasoning NeurIPS 2022 N/A ``
What You See is What You Classify: Black Box Attributions NeurIPS 2022 N/A ``
Concept Activation Regions: A Generalized Framework For Concept-Based Explanations NeurIPS 2022 N/A ``
What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods NeurIPS 2022 N/A ``
Scalable Interpretability via Polynomials NeurIPS 2022 N/A ``
Learning to Scaffold: Optimizing Model Explanations for Teaching NeurIPS 2022 N/A ``
Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF NeurIPS 2022 N/A ``
WeightedSHAP: analyzing and improving Shapley based feature attribution NeurIPS 2022 N/A ``
Visual correspondence-based explanations improve AI robustness and human-AI team accuracy NeurIPS 2022 N/A ``
VICE: Variational Interpretable Concept Embeddings NeurIPS 2022 N/A ``
Robust Feature-Level Adversaries are Interpretability Tools NeurIPS 2022 N/A ``
ProtoX: Explaining a Reinforcement Learning Agent via Prototyping NeurIPS 2022 N/A ``
ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model NeurIPS 2022 N/A ``
Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability NeurIPS 2022 N/A ``
Neural Basis Models for Interpretability NeurIPS 2022 N/A ``
Implications of Model Indeterminacy for Explanations of Automated Decisions NeurIPS 2022 N/A ``
Explainability Via Causal Self-Talk NeurIPS 2022 N/A DeepMind
TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations NeurIPS 2022 N/A ``
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models NeurIPS 2022 N/A GoogleBrain
OpenXAI: Towards a Transparent Evaluation of Model Explanations NeurIPS 2022 N/A ``
Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations NeurIPS 2022 N/A ``
Foundations of Symbolic Languages for Model Interpretability NeurIPS 2022 N/A ``
The Utility of Explainable AI in Ad Hoc Human-Machine Teaming NeurIPS 2022 N/A ``
Addressing Leakage in Concept Bottleneck Models NeurIPS 2022 N/A ``
Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models EMNLP 2022 N/A ``
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations EMNLP 2022 N/A ``
MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure EMNLP 2022 N/A ``
Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework EMNLP 2022 N/A ``
Explainable Question Answering based on Semantic Graph by Global Differentiable Learning and Dynamic Adaptive Reasoning EMNLP 2022 N/A ``
Faithful Knowledge Graph Explanations in Commonsense Question Answering EMNLP 2022 N/A ``
Optimal Interpretable Clustering Using Oblique Decision Trees KDD 2022 N/A ``
ExMeshCNN: An Explainable Convolutional Neural Network Architecture for 3D Shape Analysis KDD 2022 N/A ``
Learning Differential Operators for Interpretable Time Series Modeling KDD 2022 N/A ``
Compute Like Humans: Interpretable Step-by-step Symbolic Computation with Deep Neural Network KDD 2022 N/A ``
Causal Attention for Interpretable and Generalizable Graph Classification KDD 2022 N/A ``
Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction KDD 2022 N/A ``
Label-Free Explainability for Unsupervised Models ICML 2022 N/A ``
Rethinking Attention-Model Explainability through Faithfulness Violation Test ICML 2022 N/A ``
Hierarchical Shrinkage: Improving the Accuracy and Interpretability of Tree-Based Methods ICML 2022 N/A ``
A Functional Information Perspective on Model Interpretation ICML 2022 N/A ``
Inducing Causal Structure for Interpretable Neural Networks ICML 2022 N/A ``
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder ICML 2022 N/A ``
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings ICML 2022 N/A ``
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism ICML 2022 N/A ``
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers ICML 2022 N/A ``
Robust Models Are More Interpretable Because Attributions Look Normal ICML 2022 N/A ``
Latent Diffusion Energy-Based Model for Interpretable Text Modelling ICML 2022 N/A ``
Crowd, Expert & AI: A Human-AI Interactive Approach Towards Natural Language Explanation based COVID-19 Misinformation Detection IJCAI 2022 N/A ``
AttExplainer: Explain Transformer via Attention by Reinforcement Learning IJCAI 2022 N/A ``
Investigating and explaining the frequency bias in classification IJCAI 2022 N/A ``
Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN IJCAI 2022 N/A ``
Axiomatic Foundations of Explainability IJCAI 2022 N/A ``
Explaining Soft-Goal Conflicts through Constraint Relaxations IJCAI 2022 N/A ``
Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation IJCAI 2022 N/A ``
Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering IJCAI 2022 N/A ``
Toward Policy Explanations for Multi-Agent Reinforcement Learning IJCAI 2022 N/A ``
“My nose is running.” “Are you also coughing?”: Building A Medical Diagnosis Agent with Interpretable Inquiry Logics IJCAI 2022 N/A ``
Model Stealing Defense against Exploiting Information Leak Through the Interpretation of Deep Neural Nets IJCAI 2022 N/A ``
Learning by Interpreting IJCAI 2022 N/A ``
Using Constraint Programming and Graph Representation Learning for Generating Interpretable Cloud Security Policies IJCAI 2022 N/A ``
Explanations for Negative Query Answers under Inconsistency-Tolerant Semantics IJCAI 2022 N/A ``
On Preferred Abductive Explanations for Decision Trees and Random Forests IJCAI 2022 N/A ``
Adversarial Explanations for Knowledge Graph Embeddings IJCAI 2022 N/A ``
Looking Inside the Black-Box: Logic-based Explanations for Neural Networks KR 2022 N/A ``
Entropy-Based Logic Explanations of Neural Networks AAAI 2022 N/A ``
Explainable Neural Rule Learning WWW 2022 N/A ``
Explainable Deep Learning: A Field Guide for the Uninitiated JAIR 2022 N/A ``

2023

Title Venue Year Code Keywords Summary
On the Privacy Risks of Algorithmic Recourse AISTATS 2023 N/A ``
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten ICML 2023 N/A ``
Tracr: Compiled Transformers as a Laboratory for Interpretability arXiv 2023 Github DeepMind
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse ICLR 2023 N/A ``
Concept-level Debugging of Part-Prototype Networks ICLR 2023 N/A ``
Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning ICLR 2023 N/A ``
Re-calibrating Feature Attributions for Model Interpretation ICLR 2023 N/A ``
Post-hoc Concept Bottleneck Models ICLR 2023 N/A ``
Quantifying Memorization Across Neural Language Models ICLR 2023 N/A ``
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark ICLR 2023 N/A ``
PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification CVPR 2023 N/A ``
EVAL: Explainable Video Anomaly Localization CVPR 2023 N/A ``
Overlooked Factors in Concept-based Explanations: Dataset Choice, Concept Learnability, and Human Capability CVPR 2023 Github ``
Spatial-Temporal Concept Based Explanation of 3D ConvNets CVPR 2023 Github ``
Adversarial Counterfactual Visual Explanations CVPR 2023 N/A ``
Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification CVPR 2023 N/A ``
Explaining Image Classifiers With Multiscale Directional Image Representation CVPR 2023 N/A ``
CRAFT: Concept Recursive Activation FacTorization for Explainability CVPR 2023 N/A ``
SketchXAI: A First Look at Explainability for Human Sketches CVPR 2023 N/A ``
Don't Lie to Me! Robust and Efficient Explainability With Verified Perturbation Analysis CVPR 2023 N/A ``
Gradient-Based Uncertainty Attribution for Explainable Bayesian Deep Learning CVPR 2023 N/A ``
Learning Bottleneck Concepts in Image Classification CVPR 2023 N/A ``
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification CVPR 2023 N/A ``
Interpretable Neural-Symbolic Concept Reasoning ICML 2023 Github ``
Identifying Interpretable Subspaces in Image Representations ICML 2023 N/A ``
Explainability as statistical inference ICML 2023 N/A ``
On the Impact of Knowledge Distillation for Model Interpretability ICML 2023 N/A ``
NA2Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning ICML 2023 N/A ``
Explaining Reinforcement Learning with Shapley Values ICML 2023 N/A ``
Explainable Data-Driven Optimization: From Context to Decision and Back Again ICML 2023 N/A ``
Causal Proxy Models for Concept-based Model Explanations ICML 2023 N/A ``
Learning Perturbations to Explain Time Series Predictions ICML 2023 N/A ``
Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching ICML 2023 N/A ``
Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat ICML 2023 Github ``
Representer Point Selection for Explaining Regularized High-dimensional Models ICML 2023 N/A ``
Towards Explaining Distribution Shifts ICML 2023 N/A ``
Relevant Walk Search for Explaining Graph Neural Networks ICML 2023 Github ``
Concept-based Explanations for Out-of-Distribution Detectors ICML 2023 N/A ``
GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations ICML 2023 Github ``
Robust Explanation for Free or At the Cost of Faithfulness ICML 2023 N/A ``
Learn to Accumulate Evidence from All Training Samples: Theory and Practice ICML 2023 N/A ``
Towards Trustworthy Explanation: On Causal Rationalization ICML 2023 N/A ``
Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables ICML 2023 N/A ``
Probabilistic Concept Bottleneck Models ICML 2023 N/A ``
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective ICML 2023 N/A ``
Towards credible visual model interpretation with path attribution ICML 2023 N/A ``
Trainability, Expressivity and Interpretability in Gated Neural ODEs ICML 2023 N/A ``
Discover and Cure: Concept-aware Mitigation of Spurious Correlation ICML 2023 N/A ``
PWSHAP: A Path-Wise Explanation Model for Targeted Variables ICML 2023 N/A ``
A Closer Look at the Intervention Procedure of Concept Bottleneck Models ICML 2023 N/A ``
Counterfactual Analysis in Dynamic Latent-State Models ICML 2023 N/A ``
Tackling Shortcut Learning in Deep Neural Networks: An Iterative Approach with Interpretable Models ICML Workshop 2023 N/A ``
Rethinking Interpretation: Input-Agnostic Saliency Mapping of Deep Visual Classifiers AAAI 2023 N/A ``
TopicFM: Robust and Interpretable Topic-Assisted Feature Matching AAAI 2023 N/A ``
Solving Explainability Queries with Quantification: The Case of Feature Relevancy AAAI 2023 N/A ``
PEN: Prediction-Explanation Network to Forecast Stock Price Movement with Better Explainability AAAI 2023 N/A ``
KerPrint: Local-Global Knowledge Graph Enhanced Diagnosis Prediction for Retrospective and Prospective Interpretations AAAI 2023 N/A ``
Beyond Graph Convolutional Network: An Interpretable Regularizer-Centered Optimization Framework AAAI 2023 N/A ``
Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling AAAI 2023 N/A ``
Learning Interpretable Temporal Properties from Positive Examples Only AAAI 2023 N/A ``
Symbolic Metamodels for Interpreting Black-Boxes Using Primitive Functions AAAI 2023 N/A ``
Towards More Robust Interpretation via Local Gradient Alignment AAAI 2023 N/A ``
Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network AAAI 2023 N/A ``
XClusters: Explainability-First Clustering AAAI 2023 N/A ``
Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis AAAI 2023 N/A ``
Fairness and Explainability: Bridging the Gap towards Fair Model Explanations AAAI 2023 N/A ``
Explaining Model Confidence Using Counterfactuals AAAI 2023 N/A ``
SEAT: Stable and Explainable Attention AAAI 2023 N/A ``
Factual and Informative Review Generation for Explainable Recommendation AAAI 2023 N/A ``
Improving Interpretability via Explicit Word Interaction Graph Layer AAAI 2023 N/A ``
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing AAAI 2023 N/A ``
Improving Interpretability of Deep Sequential Knowledge Tracing Models with Question-centric Cognitive Representations AAAI 2023 N/A ``
Targeted Knowledge Infusion To Make Conversational AI Explainable and Safe AAAI 2023 N/A ``
eForecaster: Unifying Electricity Forecasting with Robust, Flexible, and Explainable Machine Learning Algorithms AAAI 2023 N/A ``
SolderNet: Towards Trustworthy Visual Inspection of Solder Joints in Electronics Manufacturing Using Explainable Artificial Intelligence AAAI 2023 N/A ``
Xaitk-Saliency: An Open Source Explainable AI Toolkit for Saliency AAAI 2023 N/A ``
Ripple: Concept-Based Interpretation for Raw Time Series Models in Education AAAI 2023 N/A ``
Semantics, Ontology and Explanation arXiv 2023 N/A Ontological Unpacking
Post Hoc Explanations of Language Models Can Improve Language Models arXiv 2023 N/A ``
Multi-Aspect Explainable Inductive Relation Prediction by Sentence Transformer AAAI 2023 N/A ``
Unfooling Perturbation-Based Post Hoc Explainers AAAI 2023 N/A ``
Very Fast, Approximate Counterfactual Explanations for Decision Forests AAAI 2023 N/A ``
Local Explanations for Reinforcement Learning AAAI 2023 N/A ``
ConceptX: A Framework for Latent Concept Analysis AAAI 2023 N/A ``
Explaining Random Forests Using Bipolar Argumentation and Markov Networks AAAI 2023 N/A ``
XRand: Differentially Private Defense against Explanation-Guided Attacks AAAI 2023 N/A ``
Unsupervised Explanation Generation via Correct Instantiations AAAI 2023 N/A ``
Disentangled CVAEs with Contrastive Learning for Explainable Recommendation AAAI 2023 N/A ``
Interpretable Chirality-Aware Graph Neural Network for Quantitative Structure Activity Relationship Modeling in Drug Discovery AAAI 2023 N/A ``
Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap AAAI 2023 N/A ``
Interactive Concept Bottleneck Models AAAI 2023 N/A ``
Data-Efficient and Interpretable Tabular Anomaly Detection KDD 2023 N/A ``
Counterfactual Learning on Heterogeneous Graphs with Greedy Perturbation KDD 2023 N/A ``
Hands-on Tutorial: "Explanations in AI: Methods, Stakeholders and Pitfalls" KDD 2023 N/A ``
Feature-based Learning for Diverse and Privacy-Preserving Counterfactual Explanations KDD 2023 N/A ``
Generative AI meets Responsible AI: Practical Challenges and Opportunities KDD 2023 N/A ``
Empower Post-hoc Graph Explanations with Information Bottleneck: A Pre-training and Fine-tuning Perspective KDD 2023 N/A ``
MixupExplainer: Generalizing Explanations for Graph Neural Networks with Data Augmentation KDD 2023 N/A ``
CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations KDD 2023 N/A ``
Fire: An Optimization Approach for Fast Interpretable Rule Extraction KDD 2023 N/A ``
ESSA: Explanation Iterative Supervision via Saliency-guided Data Augmentation KDD 2023 N/A ``
A Causality Inspired Framework for Model Interpretation KDD 2023 N/A ``
Path-Specific Counterfactual Fairness for Recommender Systems KDD 2023 N/A ``
SURE: Robust, Explainable, and Fair Classification without Sensitive Attributes KDD 2023 N/A ``
Learning for Counterfactual Fairness from Observational Data KDD 2023 N/A ``
Interpretable Sparsification of Brain Graphs: Better Practices and Effective Designs for Graph Neural Networks KDD 2023 N/A ``
ExplainableFold: Understanding AlphaFold Prediction with Explainable AI KDD 2023 N/A ``
FLAMES2Graph: An Interpretable Federated Multivariate Time Series Classification Framework KDD 2023 N/A ``
Counterfactual Explanations and Model Multiplicity: a Relational Verification View Proceedings of KR 2023 N/A ``
Explainable Representations for Relation Prediction in Knowledge Graphs Proceedings of KR 2023 N/A ``
Region-based Saliency Explanations on the Recognition of Facial Genetic Syndromes PMLR 2023 N/A ``
FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods arXiv 2023 N/A ``
Diffusion-based Visual Counterfactual Explanations - Towards Systematic Quantitative Evaluation arXiv 2023 N/A ``
Testing methods of neural systems understanding Cognitive Systems Research 2023 N/A ``
Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning arXiv 2023 N/A ``
An Explainable Federated Learning and Blockchain based Secure Credit Modeling Method EJOR 2023 N/A ``
i-Align: an interpretable knowledge graph alignment model DMKD 2023 N/A ``
Goodhart’s Law Applies to NLP’s Explanation Benchmarks arXiv 2023 N/A ``
DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture Instantaneous and Long-term Effects in Time Series arXiv 2023 N/A ``
Beyond Discriminative Regions: Saliency Maps as Alternatives to CAMs for Weakly Supervised Semantic Segmentation arXiv 2023 N/A ``
SEA: Shareable and Explainable Attribution for Query-based Black-box Attacks arXiv 2023 N/A ``
Sparse Linear Concept Discovery Models arXiv 2023 N/A ``
Revisiting the Performance-Explainability Trade-Off in Explainable Artificial Intelligence (XAI) arXiv 2023 N/A ``
KGTN: Knowledge Graph Transformer Network for explainable multi-category item recommendation KBS 2023 N/A ``
SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving Systems arXiv 2023 N/A ``
Explainable Multi-Agent Reinforcement Learning for Temporal Queries IJCAI 2023 N/A ``
Advancing Post-Hoc Case-Based Explanation with Feature Highlighting IJCAI 2023 N/A ``
Explanation-Guided Reward Alignment IJCAI 2023 N/A ``
FEAMOE: Fair, Explainable and Adaptive Mixture of Experts IJCAI 2023 N/A ``
Statistically Significant Concept-based Explanation of Image Classifiers via Model Knockoffs IJCAI 2023 N/A ``
Learning Prototype Classifiers for Long-Tailed Recognition IJCAI 2023 N/A ``
On Translations between ML Models for XAI Purposes IJCAI 2023 N/A ``
The Parameterized Complexity of Finding Concise Local Explanations IJCAI 2023 N/A ``
Neuro-Symbolic Class Expression Learning IJCAI 2023 N/A ``
A Logic-based Approach to Contrastive Explainability for Neurosymbolic Visual Question Answering IJCAI 2023 N/A ``
Cardinality-Minimal Explanations for Monotonic Neural Networks IJCAI 2023 N/A ``
Unveiling Concepts Learned by a World-Class Chess-Playing Agent IJCAI 2023 N/A ``
Explainable Text Classification via Attentive and Targeted Mixing Data Augmentation IJCAI 2023 N/A ``
On the Complexity of Counterfactual Reasoning IJCAI 2023 N/A ``
Interpretable Local Concept-based Explanation with Human Feedback to Predict All-cause Mortality (Extended Abstract) IJCAI 2023 N/A ``
Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing arXiv 2023 N/A ``
Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse arXiv 2023 N/A ``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations CIKM 2023 N/A ``
A Function Interpretation Benchmark for Evaluating Interpretability Methods arXiv 2023 N/A ``
Explaining through Transformer Input Sampling arXiv 2023 N/A ``
Backtracking Counterfactuals CLeaR 2023 N/A ``
Text2Concept: Concept Activation Vectors Directly from Text CVPR Workshop 2023 N/A ``
A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation arXiv 2023 N/A ``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance NeurIPS 2023 Github ``
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks ICLR 2023 Github ``
Label-free Concept Bottleneck Models ICLR 2023 N/A ``
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes ICLR 2023 N/A ``
Information Maximization Perspective of Orthogonal Matching Pursuit with Applications to Explainable AI NeurIPS 2023 N/A ``
Explaining Predictive Uncertainty with Information Theoretic Shapley Values NeurIPS 2023 N/A ``
REASONER: An Explainable Recommendation Dataset with Comprehensive Labeling Ground Truths NeurIPS 2023 N/A ``
Explain Any Concept: Segment Anything Meets Concept-Based Explanation NeurIPS 2023 N/A ``
VeriX: Towards Verified Explainability of Deep Neural Networks NeurIPS 2023 N/A ``
Explainable and Efficient Randomized Voting Rules NeurIPS 2023 N/A ``
TempME: Towards the Explainability of Temporal Graph Neural Networks via Motif Discovery NeurIPS 2023 N/A ``
Explaining the Uncertain: Stochastic Shapley Values for Gaussian Process Models NeurIPS 2023 N/A ``
V-InFoR: A Robust Graph Neural Networks Explainer for Structurally Corrupted Graphs NeurIPS 2023 N/A ``
Explainable Brain Age Prediction using coVariance Neural Networks NeurIPS 2023 N/A ``
D4Explainer: In-distribution Explanations of Graph Neural Network via Discrete Denoising Diffusion NeurIPS 2023 N/A ``
StateMask: Explaining Deep Reinforcement Learning through State Mask NeurIPS 2023 N/A ``
LICO: Explainable Models with Language-Image COnsistency NeurIPS 2023 N/A ``
On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective NeurIPS 2023 N/A ``
Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction NeurIPS 2023 N/A ``
Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability NeurIPS 2023 N/A ``
Train Once and Explain Everywhere: Pre-training Interpretable Graph Neural Networks NeurIPS 2023 N/A ``
Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples NeurIPS 2023 N/A ``
HiBug: On Human-Interpretable Model Debug NeurIPS 2023 N/A ``
Towards Self-Interpretable Graph-Level Anomaly Detection NeurIPS 2023 N/A ``
Interpretable Graph Networks Formulate Universal Algebra Conjectures NeurIPS 2023 N/A ``
Towards Automated Circuit Discovery for Mechanistic Interpretability NeurIPS 2023 N/A ``
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach NeurIPS 2023 N/A ``
DISCOVER: Making Vision Networks Interpretable via Competition and Dissection NeurIPS 2023 N/A ``
MultiMoDN—Multimodal, Multi-Task, Interpretable Modular Networks NeurIPS 2023 N/A ``
Causal Interpretation of Self-Attention in Pre-Trained Transformers NeurIPS 2023 N/A ``
Tracr: Compiled Transformers as a Laboratory for Interpretability NeurIPS 2023 N/A ``
Learning Interpretable Low-dimensional Representation via Physical Symmetry NeurIPS 2023 N/A ``
Scale Alone Does not Improve Mechanistic Interpretability in Vision Models NeurIPS 2023 N/A ``
Transitivity Recovering Decompositions: Interpretable and Robust Fine-Grained Relationships NeurIPS 2023 N/A ``
GRAND-SLAMIN’ Interpretable Additive Modeling with Structural Constraints NeurIPS 2023 N/A ``
Interpreting Unsupervised Anomaly Detection in Security via Rule Extraction NeurIPS 2023 N/A ``
GPEX, A Framework For Interpreting Artificial Neural Networks NeurIPS 2023 N/A ``
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers NeurIPS 2023 N/A ``
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP NeurIPS 2023 N/A ``
On the Identifiability and Interpretability of Gaussian Process Models NeurIPS 2023 N/A ``
BasisFormer: Attention-based Time Series Forecasting with Learnable and Interpretable Basis NeurIPS 2023 N/A ``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance NeurIPS 2023 N/A ``
Evaluating Neuron Interpretation Methods of NLP Models NeurIPS 2023 N/A ``
FIND: A Function Description Benchmark for Evaluating Interpretability Methods NeurIPS 2023 N/A ``
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model NeurIPS 2023 N/A ``
Interpretable Prototype-based Graph Information Bottleneck NeurIPS 2023 N/A ``
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca NeurIPS 2023 N/A ``
M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models NeurIPS 2023 N/A ``
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning EMNLP 2023 N/A ``
Towards Explainable and Accessible AI EMNLP 2023 N/A ``
KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing EMNLP 2023 N/A ``
INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback EMNLP 2023 N/A ``
Goal-Driven Explainable Clustering via Language Descriptions EMNLP 2023 N/A ``
VECHR: A Dataset for Explainable and Robust Classification of Vulnerability Type in the European Court of Human Rights EMNLP 2023 N/A ``
COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation EMNLP 2023 N/A ``
Hop, Union, Generate: Explainable Multi-hop Reasoning without Rationale Supervision EMNLP 2023 N/A ``
GenEx: A Commonsense-aware Unified Generative Framework for Explainable Cyberbullying Detection EMNLP 2023 N/A ``
DRGCoder: Explainable Clinical Coding for the Early Prediction of Diagnostic-Related Groups EMNLP 2023 N/A ``
LLM4Vis: Explainable Visualization Recommendation using ChatGPT EMNLP 2023 N/A ``
Harnessing LLMs for Temporal Data - A Study on Explainable Financial Time Series Forecasting EMNLP 2023 N/A ``
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning EMNLP 2023 N/A ``
Distilling ChatGPT for Explainable Automated Student Answer Assessment EMNLP 2023 N/A ``
Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models EMNLP 2023 N/A ``
Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning EMNLP 2023 N/A ``
Deep Integrated Explanations CIKM 2023 N/A ``
KG4Ex: An Explainable Knowledge Graph-Based Approach for Exercise Recommendation CIKM 2023 N/A ``
Interpretable Fake News Detection with Graph Evidence CIKM 2023 N/A ``
PriSHAP: Prior-guided Shapley Value Explanations for Correlated Features CIKM 2023 N/A ``
A Model-Agnostic Method to Interpret Link Prediction Evaluation of Knowledge Graph Embeddings CIKM 2023 N/A ``
ACGAN-GNNExplainer: Auxiliary Conditional Generative Explainer for Graph Neural Networks CIKM 2023 N/A ``
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries CIKM 2023 N/A ``
Explainable Spatio-Temporal Graph Neural Networks CIKM 2023 N/A ``
Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction CIKM 2023 N/A ``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations CIKM 2023 N/A ``
NOVO: Learnable and Interpretable Document Identifiers for Model-Based IR CIKM 2023 N/A ``
Counterfactual Monotonic Knowledge Tracing for Assessing Students' Dynamic Mastery of Knowledge Concepts CIKM 2023 N/A ``
Contrastive Counterfactual Learning for Causality-aware Interpretable Recommender Systems CIKM 2023 N/A ``

2024

Title Venue Year Code Keywords Summary
Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models AAAI 2024 N/A ``
Evaluating Pre-trial Programs Using Interpretable Machine Learning Matching Algorithms for Causal Inference AAAI 2024 N/A ``
On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods AAAI 2024 N/A ``
A Framework for Data-Driven Explainability in Mathematical Optimization AAAI 2024 N/A ``
Q-SENN: Quantized Self-Explaining Neural Networks AAAI 2024 N/A ``
LR-XFL: Logical Reasoning-Based Explainable Federated Learning AAAI 2024 N/A ``
Trade-Offs in Fine-Tuned Diffusion Models between Accuracy and Interpretability AAAI 2024 N/A ``
π-Light: Programmatic Interpretable Reinforcement Learning for Resource-Limited Traffic Signal Control AAAI 2024 N/A ``
Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations AAAI 2024 N/A ``
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention AAAI 2024 N/A ``
LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack AAAI 2024 N/A ``
Learning Robust Rationales for Model Explainability: A Guidance-Based Approach AAAI 2024 N/A ``
Explaining Generalization Power of a DNN Using Interactive Concepts AAAI 2024 N/A ``
Federated Causality Learning with Explainable Adaptive Optimization AAAI 2024 N/A ``
Learning Performance Maximizing Ensembles with Explainability Guarantees AAAI 2024 N/A ``
Towards Modeling Uncertainties of Self-Explaining Neural Networks via Conformal Prediction AAAI 2024 N/A ``
Towards Learning and Explaining Indirect Causal Effects in Neural Networks AAAI 2024 N/A ``
GINN-LP: A Growing Interpretable Neural Network for Discovering Multivariate Laurent Polynomial Equations AAAI 2024 N/A ``
Pantypes: Diverse Representatives for Self-Explainable Models AAAI 2024 N/A ``
Factorized Explainer for Graph Neural Networks AAAI 2024 N/A ``
Self-Interpretable Graph Learning with Sufficient and Necessary Explanations AAAI 2024 N/A ``
Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning AAAI 2024 N/A ``
A General Theoretical Framework for Learning Smallest Interpretable Models AAAI 2024 N/A ``
Knowledge-Aware Explainable Reciprocal Recommendation AAAI 2024 N/A ``
Fine-Tuning Large Language Model Based Explainable Recommendation with Explainable Quality Reward AAAI 2024 N/A ``
Finding Interpretable Class-Specific Patterns through Efficient Neural Search AAAI 2024 N/A ``
Enhance Sketch Recognition’s Explainability via Semantic Component-Level Parsing AAAI 2024 N/A ``
B-spine: Learning B-spline Curve Representation for Robust and Interpretable Spinal Curvature Estimation AAAI 2024 N/A ``
A Convolutional Neural Network Interpretable Framework for Human Ventral Visual Pathway Representation AAAI 2024 N/A ``
NeSyFOLD: A Framework for Interpretable Image Classification AAAI 2024 N/A ``
Knowledge-Aware Neuron Interpretation for Scene Classification AAAI 2024 N/A ``
MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment AAAI 2024 N/A ``
Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds AAAI 2024 N/A ``
Unifying Interpretability and Explainability for Alzheimer's Disease Progression Prediction arXiv 2024 Code ``
A Brain-Inspired Way of Reducing the Network Complexity via Concept-Regularized Coding for Emotion Recognition AAAI 2024 N/A ``
Visual Chain-of-Thought Prompting for Knowledge-Based Visual Reasoning AAAI 2024 N/A ``
Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning AAAI 2024 N/A ``
PICNN: A Pathway towards Interpretable Convolutional Neural Networks AAAI 2024 N/A ``
MagiCapture: High-Resolution Multi-Concept Portrait Customization AAAI 2024 N/A ``
AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion AAAI 2024 N/A ``
Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA AAAI 2024 N/A ``
ViTree: Single-Path Neural Tree for Step-Wise Interpretable Fine-Grained Visual Categorization AAAI 2024 N/A ``
Text-to-Image Generation for Abstract Concepts AAAI 2024 N/A ``
Boosting Multiple Instance Learning Models for Whole Slide Image Classification: A Model-Agnostic Framework Based on Counterfactual Inference AAAI 2024 N/A ``
Set Prediction Guided by Semantic Concepts for Diverse Video Captioning AAAI 2024 N/A ``
Understanding the Role of the Projector in Knowledge Distillation AAAI 2024 N/A ``
Concept-Guided Prompt Learning for Generalization in Vision-Language Models AAAI 2024 N/A ``
Automatic Core-Guided Reformulation via Constraint Explanation and Condition Learning AAAI 2024 N/A ``
Learning to Pivot as a Smart Expert AAAI 2024 N/A ``
Explainable Origin-Destination Crowd Flow Interpolation via Variational Multi-Modal Recurrent Graph Auto-Encoder AAAI 2024 N/A ``
Explaining Reinforcement Learning Agents through Counterfactual Action Outcomes AAAI 2024 N/A ``
Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision AAAI 2024 N/A ``
Unsupervised Object Interaction Learning with Counterfactual Dynamics Models AAAI 2024 N/A ``
NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions in Diffusion Models CVPR 2024 N/A ``
ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations CVPR 2024 N/A ``
Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding CVPR 2024 N/A ``
Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions CVPR 2024 N/A ``
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models CVPR 2024 N/A ``
Link-Context Learning for Multimodal LLMs CVPR 2024 N/A ``
Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users CVPR 2024 N/A ``
Learning Structure-from-Motion with Graph Attention Networks CVPR 2024 N/A ``
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding CVPR 2024 N/A ``
Building Optimal Neural Architectures using Interpretable Knowledge CVPR 2024 N/A ``
Understanding Video Transformers via Universal Concept Discovery CVPR 2024 N/A ``
A Unified and Interpretable Emotion Representation and Expression Generation CVPR 2024 N/A ``
Data Poisoning based Backdoor Attacks to Contrastive Learning CVPR 2024 N/A ``
Are Logistic Models Really Interpretable? IJCAI 2024 N/A ``
Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability IJCAI 2024 N/A ``
SGDCL: Semantic-Guided Dynamic Correlation Learning for Explainable Autonomous Driving IJCAI 2024 N/A ``
ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for Interpretable Image Recognition IJCAI 2024 N/A ``
Concept-Level Causal Explanation Method for Brain Function Network Classification IJCAI 2024 N/A ``
Capturing Knowledge Graphs and Rules with Octagon Embeddings IJCAI 2024 N/A ``
Constructive Interpolation and Concept-Based Beth Definability for Description Logics via Sequents IJCAI 2024 N/A ``
NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning IJCAI 2024 N/A ``
Cutting the Black Box: Conceptual Interpretation of a Deep Neural Net with Multi-Modal Embeddings and Multi-Criteria Decision Aid IJCAI 2024 N/A ``
Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification IJCAI 2024 N/A ``
Learning Causally Disentangled Representations via the Principle of Independent Causal Mechanisms IJCAI 2024 N/A ``
EMOTE: An Explainable Architecture for Modelling the Other through Empathy IJCAI 2024 N/A ``
SEMANTIFY: Unveiling Memes with Robust Interpretability beyond Input Attribution IJCAI 2024 N/A ``
Learning Label Dependencies for Visual Information Extraction IJCAI 2024 N/A ``
Unsupervised Concept Discovery Mitigates Spurious Correlations ICML 2024 N/A ``
Contextualized Policy Recovery: Modeling and Interpreting Medical Decisions with Adaptive Imitation Learning ICML 2024 N/A ``
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation ICML 2024 N/A ``
An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning ICML 2024 N/A ``
Understanding Inter-Concept Relationships in Concept-Based Models ICML 2024 N/A ``
Towards Compositionality in Concept Learning ICML 2024 N/A ``
Learning to Intervene on Concept Bottlenecks ICML 2024 N/A ``
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models ICML 2024 N/A ``
Position: Explain to Question not to Justify ICML 2024 N/A ``
Explaining Graph Neural Networks via Structure-aware Interaction Index ICML 2024 N/A ``
How Interpretable Are Interpretable Graph Neural Networks? ICML 2024 N/A ``
SelfIE: Self-Interpretation of Large Language Model Embeddings ICML 2024 N/A ``
On Mechanistic Knowledge Localization in Text-to-Image Generative Models ICML 2024 N/A ``
Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective ICML 2024 N/A ``
How Learning by Reconstruction Produces Uninformative Features For Perception ICML 2024 N/A ``
Generating In-Distribution Proxy Graphs for Explaining Graph Neural Networks ICML 2024 N/A ``
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations ICML 2024 N/A ``
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation ICML 2024 N/A ``
Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments ICML 2024 N/A ``
Explaining Probabilistic Models with Distributional Values ICML 2024 N/A ``
Neuro-Visualizer: A Novel Auto-Encoder-Based Loss Landscape Visualization Method With an Application in Knowledge-Guided Machine Learning ICML 2024 N/A ``
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning ICML 2024 N/A ``
Interpretability Illusions in the Generalization of Simplified Models ICML 2024 N/A ``
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning ICML 2024 N/A ``
Improving Interpretation Faithfulness for Vision Transformers ICML 2024 N/A ``
Human-like Category Learning by Injecting Ecological Priors from Large Language Models into Neural Networks ICML 2024 N/A ``
Understanding the Learning Dynamics of Alignment with Human Feedback ICML 2024 N/A ``
An Information-Theoretic Analysis of In-Context Learning ICML 2024 N/A ``
Learning to Infer Generative Template Programs for Visual Concepts ICML 2024 N/A ``
Learning Decision Trees and Forests with Algorithmic Recourse ICML 2024 N/A ``
From Neurons to Neutrons: A Case Study in Interpretability ICML 2024 N/A ``
Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition ICML 2024 N/A ``
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning ICML 2024 N/A ``
Attention Meets Post-hoc Interpretability: A Mathematical Perspective ICML 2024 N/A ``
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations ICML 2024 N/A ``
Finding NEM-U: Explaining unsupervised representation learning through neural network generated explanation masks ICML 2024 N/A ``
Learning High-Order Relationships of Brain Regions ICML 2024 N/A ``
Mechanistic Neural Networks for Scientific Machine Learning ICML 2024 N/A ``
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input ICML 2024 N/A ``
Codebook Features: Sparse and Discrete Interpretability for Neural Networks ICML 2024 N/A ``
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features ICML 2024 N/A ``
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions ICML 2024 N/A ``
Explain Temporal Black-Box Models via Functional Decomposition ICML 2024 N/A ``
Analysis for Abductive Learning and Neural-Symbolic Reasoning Shortcuts ICML 2024 N/A ``
Learning Causal Dynamics Models in Object-Oriented Environments ICML 2024 N/A ``
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos ICML 2024 N/A ``
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions ICLR 2024 N/A ``
The mechanistic basis of data dependence and abrupt learning in an in-context classification task ICLR 2024 N/A ``
Provable Compositional Generalization for Object-Centric Learning ICLR 2024 N/A ``
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts ICLR 2024 N/A ``
"What Data Benefits My Classifier?" Enhancing Model Performance and Interpretability through Influence-Based Data Selection ICLR 2024 N/A ``
Vision Transformers Need Registers ICLR 2024 N/A ``
Robust agents learn causal world models ICLR 2024 N/A ``
Detecting, Explaining, and Mitigating Memorization in Diffusion Models ICLR 2024 N/A ``
LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models ICLR 2024 N/A ``
Interpreting CLIP's Image Representation via Text-Based Decomposition ICLR 2024 N/A ``
Spot Check Equivalence: An Interpretable Metric for Information Elicitation Mechanisms WWW 2024 N/A ``
EXGC: Bridging Efficiency and Explainability in Graph Condensation WWW 2024 N/A ``
Adversarial Mask Explainer for Graph Neural Networks WWW 2024 N/A ``
Globally Interpretable Graph Learning via Distribution Matching WWW 2024 N/A ``
Back to the Future: Towards Explainable Temporal Reasoning with Large Language Models WWW 2024 N/A ``
A Method for Assessing Inference Patterns Captured by Embedding Models in Knowledge Graphs WWW 2024 N/A ``
Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models WWW 2024 N/A ``
NETEVOLVE: Social Network Forecasting using Multi-Agent Reinforcement Learning with Interpretable Features WWW 2024 N/A ``
Invariant Graph Learning for Causal Effect Estimation WWW 2024 N/A ``
Interpretable Knowledge Tracing with Multiscale State Representation WWW 2024 N/A ``
Towards the Identifiability and Explainability for Personalized Learner Modeling: An Inductive Paradigm WWW 2024 N/A ``
A Counterfactual Framework for Learning and Evaluating Explanations for Recommender Systems WWW 2024 N/A ``
Learning Audio Concepts from Counterfactual Natural Language ICASSP 2024 N/A ``
An Explainable Proxy Model for Multilabel Audio Segmentation ICASSP 2024 N/A ``
Learning Ontology Informed Representations with Constraints for Acoustic Event Detection ICASSP 2024 N/A ``
Predict and Interpret Health Risk Using EHR Through Typical Patients ICASSP 2024 N/A ``
Learning a Convex Patch-Based Synthesis Model via Deep Equilibrium ICASSP 2024 N/A ``
Implicit-Knowledge-Guided Align Before Understanding for KB-VQA ICASSP 2024 N/A ``
Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks ICASSP 2024 N/A ``
Improved Image Captioning Via Knowledge Graph-Augmented Models ICASSP 2024 N/A ``
Interpretable Multimodal Out-of-Context Detection with Soft Logic Regularization ICASSP 2024 N/A ``