The tutorial was held over Zoom on November 19th, 2020. The presenters were Eric Wallace, Matt Gardner, and Sameer Singh.
The PDF version of the slides is available here. The Google Drive version is here. Feel free to reuse any of our slides for your own purposes.
The video is available here.
Although neural NLP models are highly expressive and empirically successful, they also systematically fail in counterintuitive ways and are opaque in their decision-making process. This tutorial will provide a background on interpretation techniques, i.e., methods for explaining the predictions of NLP models. We will first situate example-specific interpretations in the context of other ways to understand models (e.g., probing, dataset analyses). Next, we will present a thorough study of example-specific interpretations, including saliency maps, input perturbations (e.g., LIME, input reduction), adversarial attacks, and influence functions. Alongside these descriptions, we will walk through source code that creates and visualizes interpretations for a diverse set of NLP tasks. Finally, we will discuss open problems in the field, e.g., evaluating, extending, and improving interpretation methods.
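To make the idea of a saliency map concrete, below is a minimal, self-contained sketch of a gradient-based saliency computation for a toy text classifier. The `ToyClassifier` model, vocabulary size, and token ids are illustrative placeholders for this sketch, not the code walked through in the tutorial; the approach shown (token importance via the gradient of the predicted score with respect to the input embeddings) is one of the simplest saliency methods covered.

```python
# Minimal sketch of a gradient-based saliency map for a toy text classifier.
# The model and token ids below are hypothetical, for illustration only.
import torch
import torch.nn as nn

class ToyClassifier(nn.Module):
    def __init__(self, vocab_size=100, emb_dim=16, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.linear = nn.Linear(emb_dim, num_classes)

    def forward(self, token_ids):
        embeds = self.embed(token_ids)            # (seq_len, emb_dim)
        embeds.retain_grad()                      # keep gradients for saliency
        logits = self.linear(embeds.mean(dim=0))  # simple bag-of-embeddings model
        return logits, embeds

model = ToyClassifier()
token_ids = torch.tensor([4, 17, 42, 8])          # hypothetical input tokens

logits, embeds = model(token_ids)
predicted_class = logits.argmax()
logits[predicted_class].backward()                # gradient of the predicted class score

# Saliency of each token = L2 norm of the gradient of its embedding.
saliency = embeds.grad.norm(dim=1)
print(saliency / saliency.sum())                  # normalized importance per token
```

Real interpretation toolkits apply the same idea to full NLP models and add visualization on top; the tutorial's code walk-throughs cover those richer settings.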
You can find our tutorial overview paper in the conference proceedings.
If you'd like to cite our tutorial, you can use the following citation:
@inproceedings{wallace2020interpreting,
title={Interpreting Predictions of {NLP} Models},
author={Wallace, Eric and Gardner, Matt and Singh, Sameer},
booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts},
pages={20--23},
year={2020}
}