Recent Publications in Explainable AI

A repository of recent explainable AI and interpretable ML approaches

2015

Title	Venue	Year	Code	Keywords
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission	KDD	2015	N/A	``
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model	arXiv	2015	N/A	``

2016

Title	Venue	Year	Code	Keywords
Interpretable Decision Sets: A Joint Framework for Description and Prediction	KDD	2016	N/A	``
"Why Should I Trust You?": Explaining the Predictions of Any Classifier	KDD	2016	N/A	``
Towards A Rigorous Science of Interpretable Machine Learning	arXiv	2017	N/A	`Review Paper`

2017

Title	Venue	Year	Code	Keywords
Transparency: Motivations and Challenges	arXiv	2017	N/A	`Review Paper`
A Unified Approach to Interpreting Model Predictions	NeurIPS	2017	N/A	``
SmoothGrad: removing noise by adding noise	ICML (Workshop)	2017	Github	``
Axiomatic Attribution for Deep Networks	ICML	2017	N/A	``
Learning Important Features Through Propagating Activation Differences	ICML	2017	N/A	``
Understanding Black-box Predictions via Influence Functions	ICML	2017	N/A	``
Network Dissection: Quantifying Interpretability of Deep Visual Representations	CVPR	2017	N/A	``

2018

Title	Venue	Year	Code	Keywords
Explainable Prediction of Medical Codes from Clinical Text	NAACL	2018	N/A	``
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)	ICML	2018	N/A	``
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR	Harvard Journal of Law & Technology	2018	N/A	``
Sanity Checks for Saliency Maps	NeurIPS	2018	N/A	``
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions	AAAI	2018	N/A	``
The Mythos of Model Interpretability	arXiv	2018	N/A	`Review Paper`
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead	Nature Machine Intelligence	2018	N/A	``

2019

Title	Venue	Year	Code	Keywords
Human Evaluation of Models Built for Interpretability	AAAI	2019	N/A	`Human in the loop`
Data Shapley: Equitable Valuation of Data for Machine Learning	ICML	2019	N/A	``
Attention is not Explanation	NAACL	2019	N/A	``
Actionable Recourse in Linear Classification	FAccT	2019	N/A	``
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead	Nature Machine Intelligence	2019	N/A	``
Explanations can be manipulated and geometry is to blame	NeurIPS	2019	N/A	``
Learning Optimized Risk Scores	JMLR	2019	N/A	``
Explain Yourself! Leveraging Language Models for Commonsense Reasoning	ACL	2019	N/A	``
Deep Neural Networks Constrained by Decision Rules	AAAI	2019	N/A	``
Towards Automatic Concept-based Explanations	NeurIPS	2019	Github	``

2020

Title	Venue	Year	Code	Keywords
Interpreting the Latent Space of GANs for Semantic Face Editing	CVPR	2020	N/A	``
GANSpace: Discovering Interpretable GAN Controls	NeurIPS	2020	N/A	``
Explainability for fair machine learning	arXiv	2020	N/A	``
An Introduction to Circuits	Distill	2020	N/A	`Tutorial`
Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses	NeurIPS	2020	N/A	``
Learning Model-Agnostic Counterfactual Explanations for Tabular Data	WWW	2020	N/A	``
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods	AIES (AAAI)	2020	N/A	``
Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning	CHI	2020	N/A	`Review Paper`
Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs	arXiv	2020	N/A	`Review Paper`
Human-Driven FOL Explanations of Deep Learning	IJCAI	2020	N/A	`Logic Explanations`
A Constraint-Based Approach to Learning and Explanation	AAAI	2020	N/A	`Mutual Information`

2021

Title	Venue	Year	Code	Keywords
A Learning Theoretic Perspective on Local Explainability	ICLR (Poster)	2021	N/A	``
Do Input Gradients Highlight Discriminative Features?	NeurIPS	2021	N/A	``
Explaining by Removing: A Unified Framework for Model Explanation	JMLR	2021	N/A	``
Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience	PACMHCI	2021	N/A	``
Towards Robust and Reliable Algorithmic Recourse	NeurIPS	2021	N/A	``
A Framework to Learn with Interpretation	NeurIPS	2021	N/A	``
Algorithmic Recourse: from Counterfactual Explanations to Interventions	FAccT	2021	N/A	``
Manipulating and Measuring Model Interpretability	CHI	2021	N/A	``
Explainable Reinforcement Learning via Model Transforms	NeurIPS	2021	N/A	``
Aligning Artificial Neural Networks and Ontologies towards Explainable AI	AAAI	2021	N/A	``

2022

Title	Venue	Year	Code	Keywords
GlanceNets: Interpretabile, Leak-proof Concept-based Models	CRL	2022	N/A	``
Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases	Transformer Circuit Thread	2022	N/A	`Tutorial`
Can language models learn from explanations in context?	EMNLP	2022	N/A	`DeepMind`
Interpreting Language Models with Contrastive Explanations	EMNLP	2022	N/A	``
Acquisition of Chess Knowledge in AlphaZero	PNAS	2022	N/A	`DeepMind` `GoogleBrain`
What the DAAM: Interpreting Stable Diffusion Using Cross Attention	arXiv	2022	Github	``
Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis	AISTATS	2022	N/A	``
Use-Case-Grounded Simulations for Explanation Evaluation	NeurIPS	2022	N/A	``
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective	arXiv	2022	N/A	``
What Makes a Good Explanation?: A Harmonized View of Properties of Explanations	arXiv	2022	N/A	``
NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights	AAAI	2022	Github	``
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations	AIES (AAAI)	2022	N/A	``
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Models	arXiv	2022	Github	``
Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off	NeurIPS	2022	Github	`CBM`, `CEM`
Self-explaining deep models with logic rule reasoning	NeurIPS	2022	N/A	``
What You See is What You Classify: Black Box Attributions	NeurIPS	2022	N/A	``
Concept Activation Regions: A Generalized Framework For Concept-Based Explanations	NeurIPS	2022	N/A	``
What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods	NeurIPS	2022	N/A	``
Scalable Interpretability via Polynomials	NeurIPS	2022	N/A	``
Learning to Scaffold: Optimizing Model Explanations for Teaching	NeurIPS	2022	N/A	``
Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF	NeurIPS	2022	N/A	``
WeightedSHAP: analyzing and improving Shapley based feature attribution	NeurIPS	2022	N/A	``
Visual correspondence-based explanations improve AI robustness and human-AI team accuracy	NeurIPS	2022	N/A	``
VICE: Variational Interpretable Concept Embeddings	NeurIPS	2022	N/A	``
Robust Feature-Level Adversaries are Interpretability Tools	NeurIPS	2022	N/A	``
ProtoX: Explaining a Reinforcement Learning Agent via Prototyping	NeurIPS	2022	N/A	``
ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model	NeurIPS	2022	N/A	``
Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability	NeurIPS	2022	N/A	``
Neural Basis Models for Interpretability	NeurIPS	2022	N/A	``
Implications of Model Indeterminacy for Explanations of Automated Decisions	NeurIPS	2022	N/A	``
Explainability Via Causal Self-Talk	NeurIPS	2022	N/A	`DeepMind`
TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations	NeurIPS	2022	N/A	``
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	NeurIPS	2022	N/A	`GoogleBrain`
OpenXAI: Towards a Transparent Evaluation of Model Explanations	NeurIPS	2022	N/A	``
Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations	NeurIPS	2022	N/A	``
Foundations of Symbolic Languages for Model Interpretability	NeurIPS	2022	N/A	``
The Utility of Explainable AI in Ad Hoc Human-Machine Teaming	NeurIPS	2022	N/A	``
Addressing Leakage in Concept Bottleneck Models	NeurIPS	2022	N/A	``
Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models	EMNLP	2022	N/A	``
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations	EMNLP	2022	N/A	``
MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure	EMNLP	2022	N/A	``
Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework	EMNLP	2022	N/A	``
Explainable Question Answering based on Semantic Graph by Global Differentiable Learning and Dynamic Adaptive Reasoning	EMNLP	2022	N/A	``
Faithful Knowledge Graph Explanations in Commonsense Question Answering	EMNLP	2022	N/A	``
Optimal Interpretable Clustering Using Oblique Decision Trees	KDD	2022	N/A	``
ExMeshCNN: An Explainable Convolutional Neural Network Architecture for 3D Shape Analysis	KDD	2022	N/A	``
Learning Differential Operators for Interpretable Time Series Modeling	KDD	2022	N/A	``
Compute Like Humans: Interpretable Step-by-step Symbolic Computation with Deep Neural Network	KDD	2022	N/A	``
Causal Attention for Interpretable and Generalizable Graph Classification	KDD	2022	N/A	``
Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction	KDD	2022	N/A	``
Label-Free Explainability for Unsupervised Models	ICML	2022	N/A	``
Rethinking Attention-Model Explainability through Faithfulness Violation Test	ICML	2022	N/A	``
Hierarchical Shrinkage: Improving the Accuracy and Interpretability of Tree-Based Methods	ICML	2022	N/A	``
A Functional Information Perspective on Model Interpretation	ICML	2022	N/A	``
Inducing Causal Structure for Interpretable Neural Networks	ICML	2022	N/A	``
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder	ICML	2022	N/A	``
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings	ICML	2022	N/A	``
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism	ICML	2022	N/A	``
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers	ICML	2022	N/A	``
Robust Models Are More Interpretable Because Attributions Look Normal	ICML	2022	N/A	``
Latent Diffusion Energy-Based Model for Interpretable Text Modelling	ICML	2022	N/A	``
Crowd, Expert & AI: A Human-AI Interactive Approach Towards Natural Language Explanation based COVID-19 Misinformation Detection	IJCAI	2022	N/A	``
AttExplainer: Explain Transformer via Attention by Reinforcement Learning	IJCAI	2022	N/A	``
Investigating and explaining the frequency bias in classification	IJCAI	2022	N/A	``
Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN	IJCAI	2022	N/A	``
Axiomatic Foundations of Explainability	IJCAI	2022	N/A	``
Explaining Soft-Goal Conflicts through Constraint Relaxations	IJCAI	2022	N/A	``
Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation	IJCAI	2022	N/A	``
Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering	IJCAI	2022	N/A	``
Toward Policy Explanations for Multi-Agent Reinforcement Learning	IJCAI	2022	N/A	``
“My nose is running.” “Are you also coughing?”: Building A Medical Diagnosis Agent with Interpretable Inquiry Logics	IJCAI	2022	N/A	``
Model Stealing Defense against Exploiting Information Leak Through the Interpretation of Deep Neural Nets	IJCAI	2022	N/A	``
Learning by Interpreting	IJCAI	2022	N/A	``
Using Constraint Programming and Graph Representation Learning for Generating Interpretable Cloud Security Policies	IJCAI	2022	N/A	``
Explanations for Negative Query Answers under Inconsistency-Tolerant Semantics	IJCAI	2022	N/A	``
On Preferred Abductive Explanations for Decision Trees and Random Forests	IJCAI	2022	N/A	``
Adversarial Explanations for Knowledge Graph Embeddings	IJCAI	2022	N/A	``
Looking Inside the Black-Box: Logic-based Explanations for Neural Networks	KR	2022	N/A	``
Entropy-Based Logic Explanations of Neural Networks	AAAI	2022	N/A	``
Explainable Neural Rule Learning	WWW	2022	N/A	``
Explainable Deep Learning: A Field Guide for the Uninitiated	JAIR	2022	N/A	``

2023

Title	Venue	Year	Code	Keywords
On the Privacy Risks of Algorithmic Recourse	AISTATS	2023	N/A	``
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten	ICML	2023	N/A	``
Tracr: Compiled Transformers as a Laboratory for Interpretability	arXiv	2023	Github	`DeepMind`
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse	ICLR	2023	N/A	``
Concept-level Debugging of Part-Prototype Networks	ICLR	2023	N/A	``
Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning	ICLR	2023	N/A	``
Re-calibrating Feature Attributions for Model Interpretation	ICLR	2023	N/A	``
Post-hoc Concept Bottleneck Models	ICLR	2023	N/A	``
Quantifying Memorization Across Neural Language Models	ICLR	2023	N/A	``
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark	ICLR	2023	N/A	``
PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification	CVPR	2023	N/A	``
EVAL: Explainable Video Anomaly Localization	CVPR	2023	N/A	``
Overlooked Factors in Concept-based Explanations: Dataset Choice, Concept Learnability, and Human Capability	CVPR	2023	Github	``
Spatial-Temporal Concept Based Explanation of 3D ConvNets	CVPR	2023	Github	``
Adversarial Counterfactual Visual Explanations	CVPR	2023	N/A	``
Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification	CVPR	2023	N/A	``
Explaining Image Classifiers With Multiscale Directional Image Representation	CVPR	2023	N/A	``
CRAFT: Concept Recursive Activation FacTorization for Explainability	CVPR	2023	N/A	``
SketchXAI: A First Look at Explainability for Human Sketches	CVPR	2023	N/A	``
Don't Lie to Me! Robust and Efficient Explainability With Verified Perturbation Analysis	CVPR	2023	N/A	``
Gradient-Based Uncertainty Attribution for Explainable Bayesian Deep Learning	CVPR	2023	N/A	``
Learning Bottleneck Concepts in Image Classification	CVPR	2023	N/A	``
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification	CVPR	2023	N/A	``
Interpretable Neural-Symbolic Concept Reasoning	ICML	2023	Github	``
Identifying Interpretable Subspaces in Image Representations	ICML	2023	N/A	``
Explainability as statistical inference	ICML	2023	N/A	``
On the Impact of Knowledge Distillation for Model Interpretability	ICML	2023	N/A	``
NA2Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning	ICML	2023	N/A	``
Explaining Reinforcement Learning with Shapley Values	ICML	2023	N/A	``
Explainable Data-Driven Optimization: From Context to Decision and Back Again	ICML	2023	N/A	``
Causal Proxy Models for Concept-based Model Explanations	ICML	2023	N/A	``
Learning Perturbations to Explain Time Series Predictions	ICML	2023	N/A	``
Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching	ICML	2023	N/A	``
Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat	ICML	2023	Github	``
Representer Point Selection for Explaining Regularized High-dimensional Models	ICML	2023	N/A	``
Towards Explaining Distribution Shifts	ICML	2023	N/A	``
Relevant Walk Search for Explaining Graph Neural Networks	ICML	2023	Github	``
Concept-based Explanations for Out-of-Distribution Detectors	ICML	2023	N/A	``
GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations	ICML	2023	Github	``
Robust Explanation for Free or At the Cost of Faithfulness	ICML	2023	N/A	``
Learn to Accumulate Evidence from All Training Samples: Theory and Practice	ICML	2023	N/A	``
Towards Trustworthy Explanation: On Causal Rationalization	ICML	2023	N/A	``
Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables	ICML	2023	N/A	``
Probabilistic Concept Bottleneck Models	ICML	2023	N/A	``
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective	ICML	2023	N/A	``
Towards credible visual model interpretation with path attribution	ICML	2023	N/A	``
Trainability, Expressivity and Interpretability in Gated Neural ODEs	ICML	2023	N/A	``
Discover and Cure: Concept-aware Mitigation of Spurious Correlation	ICML	2023	N/A	``
PWSHAP: A Path-Wise Explanation Model for Targeted Variables	ICML	2023	N/A	``
A Closer Look at the Intervention Procedure of Concept Bottleneck Models	ICML	2023	N/A	``
Counterfactual Analysis in Dynamic Latent-State Models	ICML	2023	N/A	``
Tackling Shortcut Learning in Deep Neural Networks: An Iterative Approach with Interpretable Models	ICML Workshop	2023	N/A	``
Rethinking Interpretation: Input-Agnostic Saliency Mapping of Deep Visual Classifiers	AAAI	2023	N/A	``
TopicFM: Robust and Interpretable Topic-Assisted Feature Matching	AAAI	2023	N/A	``
Solving Explainability Queries with Quantification: The Case of Feature Relevancy	AAAI	2023	N/A	``
PEN: Prediction-Explanation Network to Forecast Stock Price Movement with Better Explainability	AAAI	2023	N/A	``
KerPrint: Local-Global Knowledge Graph Enhanced Diagnosis Prediction for Retrospective and Prospective Interpretations	AAAI	2023	N/A	``
Beyond Graph Convolutional Network: An Interpretable Regularizer-Centered Optimization Framework	AAAI	2023	N/A	``
Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling	AAAI	2023	N/A	``
Learning Interpretable Temporal Properties from Positive Examples Only	AAAI	2023	N/A	``
Symbolic Metamodels for Interpreting Black-Boxes Using Primitive Functions	AAAI	2023	N/A	``
Towards More Robust Interpretation via Local Gradient Alignment	AAAI	2023	N/A	``
Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network	AAAI	2023	N/A	``
XClusters: Explainability-First Clustering	AAAI	2023	N/A	``
Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis	AAAI	2023	N/A	``
Fairness and Explainability: Bridging the Gap towards Fair Model Explanations	AAAI	2023	N/A	``
Explaining Model Confidence Using Counterfactuals	AAAI	2023	N/A	``
SEAT: Stable and Explainable Attention	AAAI	2023	N/A	``
Factual and Informative Review Generation for Explainable Recommendation	AAAI	2023	N/A	``
Improving Interpretability via Explicit Word Interaction Graph Layer	AAAI	2023	N/A	``
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing	AAAI	2023	N/A	``
Improving Interpretability of Deep Sequential Knowledge Tracing Models with Question-centric Cognitive Representations	AAAI	2023	N/A	``
Targeted Knowledge Infusion To Make Conversational AI Explainable and Safe	AAAI	2023	N/A	``
eForecaster: Unifying Electricity Forecasting with Robust, Flexible, and Explainable Machine Learning Algorithms	AAAI	2023	N/A	``
SolderNet: Towards Trustworthy Visual Inspection of Solder Joints in Electronics Manufacturing Using Explainable Artificial Intelligence	AAAI	2023	N/A	``
Xaitk-Saliency: An Open Source Explainable AI Toolkit for Saliency	AAAI	2023	N/A	``
Ripple: Concept-Based Interpretation for Raw Time Series Models in Education	AAAI	2023	N/A	``
Semantics, Ontology and Explanation	arXiv	2023	N/A	`Ontological Unpacking`
Post Hoc Explanations of Language Models Can Improve Language Models	arXiv	2023	N/A	``
Multi-Aspect Explainable Inductive Relation Prediction by Sentence Transformer	AAAI	2023	N/A	``
Unfooling Perturbation-Based Post Hoc Explainers	AAAI	2023	N/A	``
Very Fast, Approximate Counterfactual Explanations for Decision Forests	AAAI	2023	N/A	``
Local Explanations for Reinforcement Learning	AAAI	2023	N/A	``
ConceptX: A Framework for Latent Concept Analysis	AAAI	2023	N/A	``
Explaining Random Forests Using Bipolar Argumentation and Markov Networks	AAAI	2023	N/A	``
XRand: Differentially Private Defense against Explanation-Guided Attacks	AAAI	2023	N/A	``
Unsupervised Explanation Generation via Correct Instantiations	AAAI	2023	N/A	``
Disentangled CVAEs with Contrastive Learning for Explainable Recommendation	AAAI	2023	N/A	``
Interpretable Chirality-Aware Graph Neural Network for Quantitative Structure Activity Relationship Modeling in Drug Discovery	AAAI	2023	N/A	``
Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap	AAAI	2023	N/A	``
Interactive Concept Bottleneck Models	AAAI	2023	N/A	``
Data-Efficient and Interpretable Tabular Anomaly Detection	KDD	2023	N/A	``
Counterfactual Learning on Heterogeneous Graphs with Greedy Perturbation	KDD	2023	N/A	``
Hands-on Tutorial: "Explanations in AI: Methods, Stakeholders and Pitfalls"	KDD	2023	N/A	``
Feature-based Learning for Diverse and Privacy-Preserving Counterfactual Explanations	KDD	2023	N/A	``
Generative AI meets Responsible AI: Practical Challenges and Opportunities	KDD	2023	N/A	``
Empower Post-hoc Graph Explanations with Information Bottleneck: A Pre-training and Fine-tuning Perspective	KDD	2023	N/A	``
MixupExplainer: Generalizing Explanations for Graph Neural Networks with Data Augmentation	KDD	2023	N/A	``
CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations	KDD	2023	N/A	``
Fire: An Optimization Approach for Fast Interpretable Rule Extraction	KDD	2023	N/A	``
ESSA: Explanation Iterative Supervision via Saliency-guided Data Augmentation	KDD	2023	N/A	``
A Causality Inspired Framework for Model Interpretation	KDD	2023	N/A	``
Path-Specific Counterfactual Fairness for Recommender Systems	KDD	2023	N/A	``
SURE: Robust, Explainable, and Fair Classification without Sensitive Attributes	KDD	2023	N/A	``
Learning for Counterfactual Fairness from Observational Data	KDD	2023	N/A	``
Interpretable Sparsification of Brain Graphs: Better Practices and Effective Designs for Graph Neural Networks	KDD	2023	N/A	``
ExplainableFold: Understanding AlphaFold Prediction with Explainable AI	KDD	2023	N/A	``
FLAMES2Graph: An Interpretable Federated Multivariate Time Series Classification Framework	KDD	2023	N/A	``
Counterfactual Explanations and Model Multiplicity: a Relational Verification View	KR	2023	N/A	``
Explainable Representations for Relation Prediction in Knowledge Graphs	KR	2023	N/A	``
Region-based Saliency Explanations on the Recognition of Facial Genetic Syndromes	PMLR	2023	N/A	``
FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods	arXiv	2023	N/A	``
Diffusion-based Visual Counterfactual Explanations - Towards Systematic Quantitative Evaluation	arXiv	2023	N/A	``
Testing methods of neural systems understanding	Cognitive Systems Research	2023	N/A	``
Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning	arXiv	2023	N/A	``
An Explainable Federated Learning and Blockchain based Secure Credit Modeling Method	EJOR	2023	N/A	``
i-Align: an interpretable knowledge graph alignment model	DMKD	2023	N/A	``
Goodhart’s Law Applies to NLP’s Explanation Benchmarks	arXiv	2023	N/A	``
DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture Instantaneous and Long-term Effects in Time Series	arXiv	2023	N/A	``
Beyond Discriminative Regions: Saliency Maps as Alternatives to CAMs for Weakly Supervised Semantic Segmentation	arXiv	2023	N/A	``
SEA: Shareable and Explainable Attribution for Query-based Black-box Attacks	arXiv	2023	N/A	``
Sparse Linear Concept Discovery Models	arXiv	2023	N/A	``
Revisiting the Performance-Explainability Trade-Off in Explainable Artificial Intelligence (XAI)	arXiv	2023	N/A	``
KGTN: Knowledge Graph Transformer Network for explainable multi-category item recommendation	KBS	2023	N/A	``
SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving Systems	arXiv	2023	N/A	``
Explainable Multi-Agent Reinforcement Learning for Temporal Queries	IJCAI	2023	N/A	``
Advancing Post-Hoc Case-Based Explanation with Feature Highlighting	IJCAI	2023	N/A	``
Explanation-Guided Reward Alignment	IJCAI	2023	N/A	``
FEAMOE: Fair, Explainable and Adaptive Mixture of Experts	IJCAI	2023	N/A	``
Statistically Significant Concept-based Explanation of Image Classifiers via Model Knockoffs	IJCAI	2023	N/A	``
Learning Prototype Classifiers for Long-Tailed Recognition	IJCAI	2023	N/A	``
On Translations between ML Models for XAI Purposes	IJCAI	2023	N/A	``
The Parameterized Complexity of Finding Concise Local Explanations	IJCAI	2023	N/A	``
Neuro-Symbolic Class Expression Learning	IJCAI	2023	N/A	``
A Logic-based Approach to Contrastive Explainability for Neurosymbolic Visual Question Answering	IJCAI	2023	N/A	``
Cardinality-Minimal Explanations for Monotonic Neural Networks	IJCAI	2023	N/A	``
Unveiling Concepts Learned by a World-Class Chess-Playing Agent	IJCAI	2023	N/A	``
Explainable Text Classification via Attentive and Targeted Mixing Data Augmentation	IJCAI	2023	N/A	``
On the Complexity of Counterfactual Reasoning	IJCAI	2023	N/A	``
Interpretable Local Concept-based Explanation with Human Feedback to Predict All-cause Mortality (Extended Abstract)	IJCAI	2023	N/A	``
Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing	arXiv	2023	N/A	``
Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse	arXiv	2023	N/A	``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations	CIKM	2023	N/A	``
A Function Interpretation Benchmark for Evaluating Interpretability Methods	arXiv	2023	N/A	``
Explaining through Transformer Input Sampling	arXiv	2023	N/A	``
Backtracking Counterfactuals	CLeaR	2023	N/A	``
Text2Concept: Concept Activation Vectors Directly from Text	CVPR Workshop	2023	N/A	``
A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation	arXiv	2023	N/A	``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance	NeurIPS	2023	Github	``
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks	ICLR	2023	Github	``
Label-free Concept Bottleneck Models	ICLR	2023	N/A	``
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes	ICLR	2023	N/A	``
Information Maximization Perspective of Orthogonal Matching Pursuit with Applications to Explainable AI	NeurIPS	2023	N/A	``
Explaining Predictive Uncertainty with Information Theoretic Shapley Values	NeurIPS	2023	N/A	``
REASONER: An Explainable Recommendation Dataset with Comprehensive Labeling Ground Truths	NeurIPS	2023	N/A	``
Explain Any Concept: Segment Anything Meets Concept-Based Explanation	NeurIPS	2023	N/A	``
VeriX: Towards Verified Explainability of Deep Neural Networks	NeurIPS	2023	N/A	``
Explainable and Efficient Randomized Voting Rules	NeurIPS	2023	N/A	``
TempME: Towards the Explainability of Temporal Graph Neural Networks via Motif Discovery	NeurIPS	2023	N/A	``
Explaining the Uncertain: Stochastic Shapley Values for Gaussian Process Models	NeurIPS	2023	N/A	``
V-InFoR: A Robust Graph Neural Networks Explainer for Structurally Corrupted Graphs	NeurIPS	2023	N/A	``
Explainable Brain Age Prediction using coVariance Neural Networks	NeurIPS	2023	N/A	``
D4Explainer: In-distribution Explanations of Graph Neural Network via Discrete Denoising Diffusion	NeurIPS	2023	N/A	``
StateMask: Explaining Deep Reinforcement Learning through State Mask	NeurIPS	2023	N/A	``
LICO: Explainable Models with Language-Image COnsistency	NeurIPS	2023	N/A	``
On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective	NeurIPS	2023	N/A	``
Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction	NeurIPS	2023	N/A	``
Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability	NeurIPS	2023	N/A	``
Train Once and Explain Everywhere: Pre-training Interpretable Graph Neural Networks	NeurIPS	2023	N/A	``
Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples	NeurIPS	2023	N/A	``
HiBug: On Human-Interpretable Model Debug	NeurIPS	2023	N/A	``
Towards Self-Interpretable Graph-Level Anomaly Detection	NeurIPS	2023	N/A	``
Interpretable Graph Networks Formulate Universal Algebra Conjectures	NeurIPS	2023	N/A	``
Towards Automated Circuit Discovery for Mechanistic Interpretability	NeurIPS	2023	N/A	``
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach	NeurIPS	2023	N/A	``
DISCOVER: Making Vision Networks Interpretable via Competition and Dissection	NeurIPS	2023	N/A	``
MultiMoDN—Multimodal, Multi-Task, Interpretable Modular Networks	NeurIPS	2023	N/A	``
Causal Interpretation of Self-Attention in Pre-Trained Transformers	NeurIPS	2023	N/A	``
Tracr: Compiled Transformers as a Laboratory for Interpretability	NeurIPS	2023	N/A	``
Learning Interpretable Low-dimensional Representation via Physical Symmetry	NeurIPS	2023	N/A	``
Scale Alone Does not Improve Mechanistic Interpretability in Vision Models	NeurIPS	2023	N/A	``
Transitivity Recovering Decompositions: Interpretable and Robust Fine-Grained Relationships	NeurIPS	2023	N/A	``
GRAND-SLAMIN’ Interpretable Additive Modeling with Structural Constraints	NeurIPS	2023	N/A	``
Interpreting Unsupervised Anomaly Detection in Security via Rule Extraction	NeurIPS	2023	N/A	``
GPEX, A Framework For Interpreting Artificial Neural Networks	NeurIPS	2023	N/A	``
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers	NeurIPS	2023	N/A	``
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP	NeurIPS	2023	N/A	``
On the Identifiability and Interpretability of Gaussian Process Models	NeurIPS	2023	N/A	``
BasisFormer: Attention-based Time Series Forecasting with Learnable and Interpretable Basis	NeurIPS	2023	N/A	``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance	NeurIPS	2023	N/A	``
Evaluating Neuron Interpretation Methods of NLP Models	NeurIPS	2023	N/A	``
FIND: A Function Description Benchmark for Evaluating Interpretability Methods	NeurIPS	2023	N/A	``
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model	NeurIPS	2023	N/A	``
Interpretable Prototype-based Graph Information Bottleneck	NeurIPS	2023	N/A	``
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca	NeurIPS	2023	N/A	``
M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models	NeurIPS	2023	N/A	``
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning	EMNLP	2023	N/A	``
Towards Explainable and Accessible AI	EMNLP	2023	N/A	``
KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing	EMNLP	2023	N/A	``
INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback	EMNLP	2023	N/A	``
Goal-Driven Explainable Clustering via Language Descriptions	EMNLP	2023	N/A	``
VECHR: A Dataset for Explainable and Robust Classification of Vulnerability Type in the European Court of Human Rights	EMNLP	2023	N/A	``
COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation	EMNLP	2023	N/A	``
Hop, Union, Generate: Explainable Multi-hop Reasoning without Rationale Supervision	EMNLP	2023	N/A	``
GenEx: A Commonsense-aware Unified Generative Framework for Explainable Cyberbullying Detection	EMNLP	2023	N/A	``
DRGCoder: Explainable Clinical Coding for the Early Prediction of Diagnostic-Related Groups	EMNLP	2023	N/A	``
LLM4Vis: Explainable Visualization Recommendation using ChatGPT	EMNLP	2023	N/A	``
Harnessing LLMs for Temporal Data - A Study on Explainable Financial Time Series Forecasting	EMNLP	2023	N/A	``
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning	EMNLP	2023	N/A	``
Distilling ChatGPT for Explainable Automated Student Answer Assessment	EMNLP	2023	N/A	``
Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models	EMNLP	2023	N/A	``
Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning	EMNLP	2023	N/A	``
Deep Integrated Explanations	CIKM	2023	N/A	``
KG4Ex: An Explainable Knowledge Graph-Based Approach for Exercise Recommendation	CIKM	2023	N/A	``
Interpretable Fake News Detection with Graph Evidence	CIKM	2023	N/A	``
PriSHAP: Prior-guided Shapley Value Explanations for Correlated Features	CIKM	2023	N/A	``
A Model-Agnostic Method to Interpret Link Prediction Evaluation of Knowledge Graph Embeddings	CIKM	2023	N/A	``
ACGAN-GNNExplainer: Auxiliary Conditional Generative Explainer for Graph Neural Networks	CIKM	2023	N/A	``
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries	CIKM	2023	N/A	``
Explainable Spatio-Temporal Graph Neural Networks	CIKM	2023	N/A	``
Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction	CIKM	2023	N/A	``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations	CIKM	2023	N/A	``
NOVO: Learnable and Interpretable Document Identifiers for Model-Based IR	CIKM	2023	N/A	``
Counterfactual Monotonic Knowledge Tracing for Assessing Students' Dynamic Mastery of Knowledge Concepts	CIKM	2023	N/A	``
Contrastive Counterfactual Learning for Causality-aware Interpretable Recommender Systems	CIKM	2023	N/A	``

2024

Title	Venue	Year	Code	Keywords
Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models	AAAI	2024	N/A	``
Evaluating Pre-trial Programs Using Interpretable Machine Learning Matching Algorithms for Causal Inference	AAAI	2024	N/A	``
On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods	AAAI	2024	N/A	``
A Framework for Data-Driven Explainability in Mathematical Optimization	AAAI	2024	N/A	``
Q-SENN: Quantized Self-Explaining Neural Networks	AAAI	2024	N/A	``
LR-XFL: Logical Reasoning-Based Explainable Federated Learning	AAAI	2024	N/A	``
Trade-Offs in Fine-Tuned Diffusion Models between Accuracy and Interpretability	AAAI	2024	N/A	``
π-Light: Programmatic Interpretable Reinforcement Learning for Resource-Limited Traffic Signal Control	AAAI	2024	N/A	``
Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations	AAAI	2024	N/A	``
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention	AAAI	2024	N/A	``
LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack	AAAI	2024	N/A	``
Learning Robust Rationales for Model Explainability: A Guidance-Based Approach	AAAI	2024	N/A	``
Explaining Generalization Power of a DNN Using Interactive Concepts	AAAI	2024	N/A	``
Federated Causality Learning with Explainable Adaptive Optimization	AAAI	2024	N/A	``
Learning Performance Maximizing Ensembles with Explainability Guarantees	AAAI	2024	N/A	``
Towards Modeling Uncertainties of Self-Explaining Neural Networks via Conformal Prediction	AAAI	2024	N/A	``
Towards Learning and Explaining Indirect Causal Effects in Neural Networks	AAAI	2024	N/A	``
GINN-LP: A Growing Interpretable Neural Network for Discovering Multivariate Laurent Polynomial Equations	AAAI	2024	N/A	``
Pantypes: Diverse Representatives for Self-Explainable Models	AAAI	2024	N/A	``
Factorized Explainer for Graph Neural Networks	AAAI	2024	N/A	``
Self-Interpretable Graph Learning with Sufficient and Necessary Explanations	AAAI	2024	N/A	``
Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning	AAAI	2024	N/A	``
A General Theoretical Framework for Learning Smallest Interpretable Models	AAAI	2024	N/A	``
Knowledge-Aware Explainable Reciprocal Recommendation	AAAI	2024	N/A	``
Fine-Tuning Large Language Model Based Explainable Recommendation with Explainable Quality Reward	AAAI	2024	N/A	``
Finding Interpretable Class-Specific Patterns through Efficient Neural Search	AAAI	2024	N/A	``
Enhance Sketch Recognition’s Explainability via Semantic Component-Level Parsing	AAAI	2024	N/A	``
B-spine: Learning B-spline Curve Representation for Robust and Interpretable Spinal Curvature Estimation	AAAI	2024	N/A	``
A Convolutional Neural Network Interpretable Framework for Human Ventral Visual Pathway Representation	AAAI	2024	N/A	``
NeSyFOLD: A Framework for Interpretable Image Classification	AAAI	2024	N/A	``
Knowledge-Aware Neuron Interpretation for Scene Classification	AAAI	2024	N/A	``
MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment	AAAI	2024	N/A	``
Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds	AAAI	2024	N/A	``
Unifying Interpretability and Explainability for Alzheimer's Disease Progression Prediction	arXiv	2024	Code	``
A Brain-Inspired Way of Reducing the Network Complexity via Concept-Regularized Coding for Emotion Recognition	AAAI	2024	N/A	``
Visual Chain-of-Thought Prompting for Knowledge-Based Visual Reasoning	AAAI	2024	N/A	``
Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning	AAAI	2024	N/A	``
PICNN: A Pathway towards Interpretable Convolutional Neural Networks	AAAI	2024	N/A	``
MagiCapture: High-Resolution Multi-Concept Portrait Customization	AAAI	2024	N/A	``
AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion	AAAI	2024	N/A	``
Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA	AAAI	2024	N/A	``
ViTree: Single-Path Neural Tree for Step-Wise Interpretable Fine-Grained Visual Categorization	AAAI	2024	N/A	``
Text-to-Image Generation for Abstract Concepts	AAAI	2024	N/A	``
Boosting Multiple Instance Learning Models for Whole Slide Image Classification: A Model-Agnostic Framework Based on Counterfactual Inference	AAAI	2024	N/A	``
Set Prediction Guided by Semantic Concepts for Diverse Video Captioning	AAAI	2024	N/A	``
Understanding the Role of the Projector in Knowledge Distillation	AAAI	2024	N/A	``
Concept-Guided Prompt Learning for Generalization in Vision-Language Models	AAAI	2024	N/A	``
Automatic Core-Guided Reformulation via Constraint Explanation and Condition Learning	AAAI	2024	N/A	``
Learning to Pivot as a Smart Expert	AAAI	2024	N/A	``
Explainable Origin-Destination Crowd Flow Interpolation via Variational Multi-Modal Recurrent Graph Auto-Encoder	AAAI	2024	N/A	``
Explaining Reinforcement Learning Agents through Counterfactual Action Outcomes	AAAI	2024	N/A	``
Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision	AAAI	2024	N/A	``
Unsupervised Object Interaction Learning with Counterfactual Dynamics Models	AAAI	2024	N/A	``
NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions in Diffusion Models	CVPR	2024	N/A	``
ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations	CVPR	2024	N/A	``
Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding	CVPR	2024	N/A	``
Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions	CVPR	2024	N/A	``
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models	CVPR	2024	N/A	``
Link-Context Learning for Multimodal LLMs	CVPR	2024	N/A	``
Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users	CVPR	2024	N/A	``
Learning Structure-from-Motion with Graph Attention Networks	CVPR	2024	N/A	``
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding	CVPR	2024	N/A	``
Building Optimal Neural Architectures using Interpretable Knowledge	CVPR	2024	N/A	``
Understanding Video Transformers via Universal Concept Discovery	CVPR	2024	N/A	``
A Unified and Interpretable Emotion Representation and Expression Generation	CVPR	2024	N/A	``
Data Poisoning based Backdoor Attacks to Contrastive Learning	CVPR	2024	N/A	``
Are Logistic Models Really Interpretable?	IJCAI	2024	N/A	``
Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability	IJCAI	2024	N/A	``
SGDCL: Semantic-Guided Dynamic Correlation Learning for Explainable Autonomous Driving	IJCAI	2024	N/A	``
ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for Interpretable Image Recognition	IJCAI	2024	N/A	``
Concept-Level Causal Explanation Method for Brain Function Network Classification	IJCAI	2024	N/A	``
Capturing Knowledge Graphs and Rules with Octagon Embeddings	IJCAI	2024	N/A	``
Constructive Interpolation and Concept-Based Beth Definability for Description Logics via Sequents	IJCAI	2024	N/A	``
NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning	IJCAI	2024	N/A	``
Cutting the Black Box: Conceptual Interpretation of a Deep Neural Net with Multi-Modal Embeddings and Multi-Criteria Decision Aid	IJCAI	2024	N/A	``
Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification	IJCAI	2024	N/A	``
Learning Causally Disentangled Representations via the Principle of Independent Causal Mechanisms	IJCAI	2024	N/A	``
EMOTE: An Explainable Architecture for Modelling the Other through Empathy	IJCAI	2024	N/A	``
SEMANTIFY: Unveiling Memes with Robust Interpretability beyond Input Attribution	IJCAI	2024	N/A	``
Learning Label Dependencies for Visual Information Extraction	IJCAI	2024	N/A	``
Unsupervised Concept Discovery Mitigates Spurious Correlations	ICML	2024	N/A	``
Contextualized Policy Recovery: Modeling and Interpreting Medical Decisions with Adaptive Imitation Learning	ICML	2024	N/A	``
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation	ICML	2024	N/A	``
An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning	ICML	2024	N/A	``
Understanding Inter-Concept Relationships in Concept-Based Models	ICML	2024	N/A	``
Towards Compositionality in Concept Learning	ICML	2024	N/A	``
Learning to Intervene on Concept Bottlenecks	ICML	2024	N/A	``
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models	ICML	2024	N/A	``
Position: Explain to Question not to Justify	ICML	2024	N/A	``
Explaining Graph Neural Networks via Structure-aware Interaction Index	ICML	2024	N/A	``
How Interpretable Are Interpretable Graph Neural Networks?	ICML	2024	N/A	``
SelfIE: Self-Interpretation of Large Language Model Embeddings	ICML	2024	N/A	``
On Mechanistic Knowledge Localization in Text-to-Image Generative Models	ICML	2024	N/A	``
Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective	ICML	2024	N/A	``
How Learning by Reconstruction Produces Uninformative Features For Perception	ICML	2024	N/A	``
Generating In-Distribution Proxy Graphs for Explaining Graph Neural Networks	ICML	2024	N/A	``
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations	ICML	2024	N/A	``
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation	ICML	2024	N/A	``
Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments	ICML	2024	N/A	``
Explaining Probabilistic Models with Distributional Values	ICML	2024	N/A	``
Neuro-Visualizer: A Novel Auto-Encoder-Based Loss Landscape Visualization Method With an Application in Knowledge-Guided Machine Learning	ICML	2024	N/A	``
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning	ICML	2024	N/A	``
Interpretability Illusions in the Generalization of Simplified Models	ICML	2024	N/A	``
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning	ICML	2024	N/A	``
Improving Interpretation Faithfulness for Vision Transformers	ICML	2024	N/A	``
Human-like Category Learning by Injecting Ecological Priors from Large Language Models into Neural Networks	ICML	2024	N/A	``
Understanding the Learning Dynamics of Alignment with Human Feedback	ICML	2024	N/A	``
An Information-Theoretic Analysis of In-Context Learning	ICML	2024	N/A	``
Learning to Infer Generative Template Programs for Visual Concepts	ICML	2024	N/A	``
Learning Decision Trees and Forests with Algorithmic Recourse	ICML	2024	N/A	``
From Neurons to Neutrons: A Case Study in Interpretability	ICML	2024	N/A	``
Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition	ICML	2024	N/A	``
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning	ICML	2024	N/A	``
Attention Meets Post-hoc Interpretability: A Mathematical Perspective	ICML	2024	N/A	``
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations	ICML	2024	N/A	``
Finding NEM-U: Explaining unsupervised representation learning through neural network generated explanation masks	ICML	2024	N/A	``
Learning High-Order Relationships of Brain Regions	ICML	2024	N/A	``
Mechanistic Neural Networks for Scientific Machine Learning	ICML	2024	N/A	``
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input	ICML	2024	N/A	``
Codebook Features: Sparse and Discrete Interpretability for Neural Networks	ICML	2024	N/A	``
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features	ICML	2024	N/A	``
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions	ICML	2024	N/A	``
Explain Temporal Black-Box Models via Functional Decomposition	ICML	2024	N/A	``
Analysis for Abductive Learning and Neural-Symbolic Reasoning Shortcuts	ICML	2024	N/A	``
Learning Causal Dynamics Models in Object-Oriented Environments	ICML	2024	N/A	``
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos	ICML	2024	N/A	``
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions	ICLR	2024	N/A	``
The mechanistic basis of data dependence and abrupt learning in an in-context classification task	ICLR	2024	N/A	``
Provable Compositional Generalization for Object-Centric Learning	ICLR	2024	N/A	``
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts	ICLR	2024	N/A	``
"What Data Benefits My Classifier?" Enhancing Model Performance and Interpretability through Influence-Based Data Selection	ICLR	2024	N/A	``
Vision Transformers Need Registers	ICLR	2024	N/A	``
Robust agents learn causal world models	ICLR	2024	N/A	``
Detecting, Explaining, and Mitigating Memorization in Diffusion Models	ICLR	2024	N/A	``
LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models	ICLR	2024	N/A	``
Interpreting CLIP's Image Representation via Text-Based Decomposition	ICLR	2024	N/A	``
Spot Check Equivalence: An Interpretable Metric for Information Elicitation Mechanisms	WWW	2024	N/A	``
EXGC: Bridging Efficiency and Explainability in Graph Condensation	WWW	2024	N/A	``
Adversarial Mask Explainer for Graph Neural Networks	WWW	2024	N/A	``
Globally Interpretable Graph Learning via Distribution Matching	WWW	2024	N/A	``
Back to the Future: Towards Explainable Temporal Reasoning with Large Language Models	WWW	2024	N/A	``
A Method for Assessing Inference Patterns Captured by Embedding Models in Knowledge Graphs	WWW	2024	N/A	``
Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models	WWW	2024	N/A	``
NETEVOLVE: Social Network Forecasting using Multi-Agent Reinforcement Learning with Interpretable Features	WWW	2024	N/A	``
Invariant Graph Learning for Causal Effect Estimation	WWW	2024	N/A	``
Interpretable Knowledge Tracing with Multiscale State Representation	WWW	2024	N/A	``
Towards the Identifiability and Explainability for Personalized Learner Modeling: An Inductive Paradigm	WWW	2024	N/A	``
A Counterfactual Framework for Learning and Evaluating Explanations for Recommender Systems	WWW	2024	N/A	``
Learning Audio Concepts from Counterfactual Natural Language	ICASSP	2024	N/A	``
An Explainable Proxy Model for Multilabel Audio Segmentation	ICASSP	2024	N/A	``
Learning Ontology Informed Representations with Constraints for Acoustic Event Detection	ICASSP	2024	N/A	``
Predict and Interpret Health Risk Using EHR Through Typical Patients	ICASSP	2024	N/A	``
Learning a Convex Patch-Based Synthesis Model via Deep Equilibrium	ICASSP	2024	N/A	``
Implicit-Knowledge-Guided Align Before Understanding for KB-VQA	ICASSP	2024	N/A	``
Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks	ICASSP	2024	N/A	``
Improved Image Captioning Via Knowledge Graph-Augmented Models	ICASSP	2024	N/A	``
Interpretable Multimodal Out-of-Context Detection with Soft Logic Regularization	ICASSP	2024	N/A	``
