ferret

A Python package for benchmarking interpretability techniques on Transformers.
https://ferret.readthedocs.io
MIT License

Ferret circular logo with the name to the right


ferret is a Python library that streamlines the use and benchmarking of interpretability techniques on Transformers models.

ferret is meant to integrate seamlessly with 🤗 transformers models, among which it currently supports text models only. We provide post-hoc explainers, faithfulness and plausibility evaluation metrics, and visualization utilities (see Features below).

ACL Anthology Bibkey: attanasio-etal-2023-ferret

📝 Examples

All-around tutorial (covering all explainers, evaluation metrics, and the interface with XAI datasets): Colab

Text Classification

Getting Started

Installation

For the default installation, which does not include the dependencies for the speech XAI functionalities, run:

pip install -U ferret-xai

Our main dependencies are 🤗 transformers and datasets.
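To check that the installation succeeded, you can import the Benchmark class used throughout this README:

python -c "from ferret import Benchmark; print('ferret is ready')"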

If the speech XAI functionalities are needed, run:

pip install -U "ferret-xai[speech]"

(Quoting the requirement prevents shells such as zsh from misinterpreting the square brackets.)

At the moment, the speech XAI-related dependencies are the only extra ones, so installing with ferret-xai[speech] or ferret-xai[all] is equivalent.

Important: some of our dependencies might pull in scikit-learn through its deprecated sklearn package name, and that breaks the ferret installation. If your pip install command fails, try:

SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True pip install -U ferret-xai
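Equivalently, you can set the variable for the whole session before installing (on Windows cmd, use set instead of export):

export SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True
pip install -U ferret-xai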

This is hopefully a temporary situation!

Explain & Benchmark

The code below provides a minimal example to run all the feature-attribution explainers supported by ferret and benchmark them on faithfulness metrics.

We start from a common text classification pipeline:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

# Any 🤗 sequence classification checkpoint works; here, a multilingual sentiment model
name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

Using ferret is as simple as:

bench = Benchmark(model, tokenizer)

# target is the index of the class to explain
explanations = bench.explain("You look stunning!", target=1)
evaluations = bench.evaluate_explanations(explanations, target=1)

bench.show_evaluation_table(evaluations)

Be sure to run the code in a Jupyter Notebook/Colab: the cell above will produce a nicely-formatted table to analyze the saliency maps.
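If you need raw values rather than the rendered table, each explanation also carries the tokenized input and the per-token attribution scores. A minimal sketch, assuming the explainer, tokens, and scores attributes of ferret's Explanation objects (check the API reference of your installed version):

# Iterate over the saliency maps produced by bench.explain
for explanation in explanations:
    # Name of the explainer that produced this saliency map
    print(explanation.explainer)
    # Attribution scores are aligned with the tokenized input
    for token, score in zip(explanation.tokens, explanation.scores):
        print(f"  {token}\t{score:+.3f}")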

Features

ferret offers a painless integration with Hugging Face models and naming conventions. If you are already using the transformers library, you immediately get access to our Explanation and Evaluation API.

Post-Hoc Explainers

- Gradient, plain or multiplied by the input embeddings
- Integrated Gradient, plain or multiplied by the input embeddings
- SHAP, via the Partition SHAP approximation of Shapley values
- LIME
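By default, Benchmark runs all supported explainers. To benchmark only a subset, you can instantiate the explainer classes yourself and pass them in. The sketch below assumes the top-level SHAPExplainer and LIMEExplainer classes and the explainers argument of Benchmark; check the documentation of your installed version:

from ferret import Benchmark, SHAPExplainer, LIMEExplainer

# Instantiate only the explainers you want to benchmark (assumed signatures)
shap = SHAPExplainer(model, tokenizer)
lime = LIMEExplainer(model, tokenizer)

bench = Benchmark(model, tokenizer, explainers=[shap, lime])
explanations = bench.explain("You look stunning!", target=1)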

Evaluation Metrics

Faithfulness measures:

- AOPC Comprehensiveness
- AOPC Sufficiency
- Kendall's Tau correlation with Leave-One-Out token removal

Plausibility measures:

- Area Under the Precision-Recall Curve (AUPRC, soft score)
- Token F1 (hard score)
- Token Intersection Over Union (hard score)

See our paper for details.

Visualization

The Benchmark class exposes easy-to-use table visualization methods (e.g., within Jupyter Notebooks):

bench = Benchmark(model, tokenizer)

# Pretty-print feature attribution scores by all supported explainers
explanations = bench.explain("You look stunning!")
bench.show_table(explanations)

# Pretty-print all the supported evaluation metrics
evaluations = bench.evaluate_explanations(explanations)
bench.show_evaluation_table(evaluations)

Dataset Evaluations

The Benchmark class has a handy method to compute and average our evaluation metrics across multiple samples from a dataset.

import numpy as np
bench = Benchmark(model, tokenizer)

# Compute and average evaluation scores on one of the supported datasets
samples = np.arange(20)
hatexdata = bench.load_dataset("hatexplain")
sample_evaluations = bench.evaluate_samples(hatexdata, samples)

# Pretty-print the results
bench.show_samples_evaluation_table(sample_evaluations)

Planned Development

See the changelog file for further details.

Authors

- Giuseppe Attanasio
- Eliana Pastor
- Chiara Di Bonaventura
- Debora Nozza

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Logo and graphical assets made by Luca Attanasio.

If you are using ferret for your work, please consider citing us!

@inproceedings{attanasio-etal-2023-ferret,
    title = "ferret: a Framework for Benchmarking Explainers on Transformers",
    author = "Attanasio, Giuseppe and Pastor, Eliana and Di Bonaventura, Chiara and Nozza, Debora",
    booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations",
    month = may,
    year = "2023",
    publisher = "Association for Computational Linguistics",
}