We introduce MAFALDA, a benchmark for fallacy classification that unites previous datasets. It comes with a taxonomy of fallacies that aligns, refines, and unifies previous classifications. We further provide a manual annotation of the dataset together with manual explanations for each annotation. We propose a new annotation scheme tailored for subjective NLP tasks, and a new evaluation method designed to handle subjectivity.
We then evaluate several language models under a zero-shot learning setting, as well as human performance, on MAFALDA to assess their fallacy detection and classification capabilities.
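As a rough illustration of what such a zero-shot query can look like, the sketch below asks an instruction-tuned model to pick a fallacy label for a short text. The model name, prompt wording, and label list are placeholders chosen for the example, not the exact setup used in the paper's experiments (see the run scripts below for the actual pipeline).

```python
# Illustrative zero-shot sketch only: the model, prompt, and label set below are
# assumptions for demonstration, not MAFALDA's experimental configuration.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

text = "Everyone I know loves this policy, so it must be the right one."
labels = ["appeal to popularity", "hasty generalization", "ad hominem", "no fallacy"]

prompt = (
    "Which fallacy, if any, does the following text commit? "
    f"Answer with one of: {', '.join(labels)}.\n\n"
    f"Text: {text}\nAnswer:"
)

# Greedy decoding of a short answer; the generated text is then inspected manually.
output = generator(prompt, max_new_tokens=20, do_sample=False)
print(output[0]["generated_text"])
```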
To install MAFALDA:

```bash
git clone https://github.com/ChadiHelwe/MAFALDA.git
cd MAFALDA
pip install -r requirements.txt
```
To run the experiments and the evaluation, use the provided scripts:

```bash
./run_dummy.sh        # dummy run
./run_with_gpu.sh     # run models on a GPU
./run_with_openai.sh  # run OpenAI models
./run_eval.sh         # run the evaluation
```
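The last script presumably computes the evaluation scores reported in the paper. Purely as a toy, self-contained illustration of the idea of crediting a prediction when it matches any label that annotators found acceptable (one simple way to accommodate subjectivity), consider the sketch below; it is not the evaluation implemented in this repository, and the data is invented.

```python
# Toy illustration only: score predictions when each example may have several
# acceptable gold labels (e.g., because annotators legitimately disagree).
# This is NOT the evaluation method implemented in run_eval.sh; the data is made up.

# Each gold entry is the set of labels considered acceptable for that example.
gold = [
    {"appeal to popularity", "hasty generalization"},
    {"ad hominem"},
    {"no fallacy"},
]
predictions = ["hasty generalization", "strawman", "no fallacy"]

# A prediction counts as correct if it matches any acceptable label.
correct = sum(pred in acceptable for pred, acceptable in zip(predictions, gold))
accuracy = correct / len(predictions)
print(f"Accuracy with multiple acceptable labels: {accuracy:.2f}")  # 2/3 here
```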
If you want to cite MAFALDA, please cite the publication at the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024):
```bibtex
@inproceedings{helwe2023mafalda,
  title={MAFALDA: A Benchmark and Comprehensive Study of Fallacy Detection and Classification},
  author={Helwe, Chadi and Calamai, Tom and Paris, Pierre-Henri and Clavel, Chlo{\'e} and Suchanek, Fabian},
  booktitle={Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
  year={2024}
}
```
This work was partially funded by the NoRDF project (ANR-20-CHIA-0012-01), the SINNet project (ANR-23-CE23-0033-01), and Amundi Technology.