ferret is a Python library that streamlines the use and benchmarking of interpretability techniques on Transformers-based models.
ferret integrates seamlessly with 🤗 transformers models; among them, it currently supports text models only.

ACL Anthology Bibkey: attanasio-etal-2023-ferret

We provide the following tutorials:

- All-around tutorial (covering all explainers, evaluation metrics, and the interface with XAI datasets): Colab
- Text Classification
For the default installation, which does not include the dependencies for the speech XAI functionalities, run:
pip install -U ferret-xai
Our main dependencies are 🤗 transformers and datasets.
If the speech XAI functionalities are needed, install with:
pip install -U ferret-xai[speech]
At the moment, the speech XAI dependencies are the only extra ones, so installing with ferret-xai[speech] or ferret-xai[all] is equivalent.
Important: some of our dependencies may still require the deprecated sklearn package name instead of scikit-learn, which breaks the ferret installation.
If your pip install command fails, try:
SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True pip install -U ferret-xai
This is hopefully a temporary situation!
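Once installed, you can verify that the package imports correctly. A minimal sketch, assuming ferret exposes a __version__ attribute (as packages generated from the cookiecutter-pypackage template typically do):

import ferret

# Print the installed version to confirm the installation succeeded
# (the __version__ attribute is assumed to be defined by the package)
print(ferret.__version__)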
The code below provides a minimal example to run all the feature-attribution explainers supported by ferret and benchmark them on faithfulness metrics.
We start from a common text classification setup:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

# Load a sentiment classification model and its tokenizer from the Hugging Face Hub
name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)
Using ferret is as simple as:
bench = Benchmark(model, tokenizer)

# Explain a single instance with all supported explainers and evaluate the explanations
explanations = bench.explain("You look stunning!", target=1)
evaluations = bench.evaluate_explanations(explanations, target=1)
bench.show_evaluation_table(evaluations)
Be sure to run the code in a Jupyter Notebook or Colab: the cell above will produce a nicely formatted table for analyzing the saliency maps.
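You can also inspect the explanations programmatically. A minimal sketch, assuming each returned explanation object exposes explainer, tokens, and scores attributes (attribute names may differ across ferret versions):

# Iterate over the explanations produced by each explainer and
# print the per-token attribution scores (attribute names are assumed)
for explanation in explanations:
    print(explanation.explainer)
    print(list(zip(explanation.tokens, explanation.scores)))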
ferret offers a painless integration with Hugging Face models and naming conventions. If you are already using the transformers library, you immediately get access to our Explanation and Evaluation API.
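If you only need a subset of explainers, you can pass explainer instances to the Benchmark constructor. A sketch under the assumption that SHAPExplainer and LIMEExplainer are importable from the package root, take (model, tokenizer) as arguments, and that Benchmark accepts an explainers argument, as in recent ferret versions:

from ferret import Benchmark, LIMEExplainer, SHAPExplainer

# Restrict the benchmark to two explainers instead of the full default set
# (constructor signatures are assumptions based on recent ferret versions)
explainers = [SHAPExplainer(model, tokenizer), LIMEExplainer(model, tokenizer)]
bench = Benchmark(model, tokenizer, explainers=explainers)
explanations = bench.explain("You look stunning!", target=1)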
ferret evaluates explanations along two complementary axes, faithfulness measures and plausibility measures; see our paper for details.
The Benchmark class exposes easy-to-use table visualization methods (e.g., within Jupyter Notebooks):
bench = Benchmark(model, tokenizer)
# Pretty-print feature attribution scores by all supported explainers
explanations = bench.explain("You look stunning!")
bench.show_table(explanations)
# Pretty-print all the supported evaluation metrics
evaluations = bench.evaluate_explanations(explanations)
bench.show_evaluation_table(evaluations)
The Benchmark class has a handy method to compute and average our evaluation metrics across multiple samples from a dataset:
import numpy as np
bench = Benchmark(model, tokenizer)
# Compute and average evaluation scores on one of the supported datasets
samples = np.arange(20)
hatexdata = bench.load_dataset("hatexplain")
sample_evaluations = bench.evaluate_samples(hatexdata, samples)
# Pretty-print the results
bench.show_samples_evaluation_table(sample_evaluations)
See the changelog file for further details.
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
Logo and graphical assets made by Luca Attanasio.
If you are using ferret for your work, please consider citing us!
@inproceedings{attanasio-etal-2023-ferret,
    title = "ferret: a Framework for Benchmarking Explainers on Transformers",
    author = "Attanasio, Giuseppe and Pastor, Eliana and Di Bonaventura, Chiara and Nozza, Debora",
    booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations",
    month = may,
    year = "2023",
    publisher = "Association for Computational Linguistics",
}