Masahiro Kaneko, Danushka Bollegala
Code and debiased word embeddings for the paper "Debiasing Pre-trained Contextualised Embeddings" (EACL 2021). If you use any part of this work, please include the following citation:
```bibtex
@inproceedings{kaneko-bollegala-2021-context,
    title = {Debiasing Pre-trained Contextualised Embeddings},
    author = {Masahiro Kaneko and Danushka Bollegala},
    booktitle = {Proc. of the 16th European Chapter of the Association for Computational Linguistics (EACL)},
    year = {2021}
}
```
Install the modified `transformers` library bundled with this repository:

```sh
cd transformers
pip install .
```
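As a quick sanity check that the installation works, you can load a model and inspect its contextualised embeddings (a minimal sketch using the standard Hugging Face `AutoModel`/`AutoTokenizer` API, which the bundled version is assumed to expose):

```python
# Sanity check for the installed library (assumes the bundled
# transformers exposes the standard AutoTokenizer/AutoModel API).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("This is a test sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs[0] holds the final-layer contextualised embeddings,
# shape: (batch_size, sequence_length, hidden_size).
print(outputs[0].shape)
```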
Download and unpack the News Commentary v15 English corpus used for debiasing:

```sh
curl -L -o data/news-commentary-v15.en.gz https://data.statmt.org/news-commentary/v15/training-monolingual/news-commentary-v15.en.gz
gunzip data/news-commentary-v15.en.gz
```
Then preprocess the corpus and run the debiasing script. The first argument selects the pre-trained model to debias, and `gpu_id` is the ID of the GPU to run on:

```sh
cd script
./preprocess.sh [bert/roberta/albert/dbert/electra] ../data/news-commentary-v15.en
./debias.sh [bert/roberta/albert/dbert/electra] gpu_id
```
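For orientation, the objective optimised by `debias.sh` can be sketched as follows (a schematic only, not the actual implementation; the tensor names, shapes, and `alpha`/`beta` weights are illustrative). The contextualised embeddings of target words are pushed to be orthogonal to attribute-word embeddings across layers, while a regulariser keeps the fine-tuned embeddings close to their pre-trained values:

```python
# Schematic sketch of the debiasing objective described in the paper;
# see the training code invoked by debias.sh for the real implementation.
import torch

def debias_loss(target_hidden,  # (layers, tokens, hidden): contextualised target-token embeddings
                attr_emb,       # (attrs, hidden): attribute-word embeddings (e.g., gendered words)
                new_emb,        # fine-tuned embeddings of regularised words
                orig_emb,       # corresponding pre-trained embeddings (frozen)
                alpha=1.0, beta=1.0):  # illustrative weights
    # Bias term: squared inner products between target-token embeddings
    # and attribute directions, summed over layers, tokens, and attributes.
    inner = torch.einsum("lth,ah->lta", target_hidden, attr_emb)
    bias_term = (inner ** 2).sum()
    # Regulariser: keep embeddings close to the pre-trained ones so that
    # semantic information is preserved.
    reg_term = ((new_emb - orig_emb) ** 2).sum()
    return alpha * bias_term + beta * reg_term
```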
Alternatively, you can directly download our all-token debiased contextualised embeddings.
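A downloaded checkpoint can then be loaded like any other pre-trained model (a minimal sketch; the directory name below is a placeholder for wherever you unpack the weights, not an actual release path):

```python
# Load a downloaded debiased model with the standard transformers API.
from transformers import AutoModel, AutoTokenizer

path = "path/to/debiased-bert"  # hypothetical: directory holding the downloaded weights
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModel.from_pretrained(path)
```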
See the LICENSE file.