This repository contains the code for our paper:

[$m^4$Adapter: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter](https://aclanthology.org/2022.findings-emnlp.315) (Findings of EMNLP 2022)

If you use this code, please cite our paper and consider giving this repository a star.
## Requirements

- Copy `adapters/m2m.py` to the adapter-transformer path (see the sketch after this list).
- The data preprocessing scripts are in `scripts/preprocess`.
- The scripts for running the baseline systems are in `Baselines`.
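Below is a minimal setup sketch for the copy step, assuming adapter-transformers is installed via pip and that `m2m.py` replaces the M2M-100 module inside the installed package; the exact target directory is an assumption, so adjust it to your environment.

```bash
# Sketch only: install adapter-transformers and copy the patched M2M-100 module.
pip install adapter-transformers

# Locate the installed package (adapter-transformers installs under the name "transformers").
SITE=$(python -c "import transformers, os; print(os.path.dirname(transformers.__file__))")

# Assumed target directory for m2m.py; adjust if your layout differs.
cp adapters/m2m.py "$SITE/models/m2m_100/"
```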
## Run scripts in `meta-train`
```bash
deepspeed --num_gpus=8 main_deepspeed.py \
  --domains EUbookshop,KDE4,OpenSubtitles,QED,TED2020,Ubuntu \
  --model_name_or_path facebook/m2m100_418M \
  --data_path YOUR_DATA_PATH \
  --per_device_train_batch_size 4 --per_device_eval_batch_size 4 \
  --cuda --log_interval 20 --temp 5.0
```
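Note that the `deepspeed` launcher spawns one training process per GPU, so `--num_gpus=8` assumes eight local GPUs; lower it to match your hardware. `YOUR_DATA_PATH` is a placeholder for your preprocessed data directory.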
## Run scripts in `meta-adapt`
```bash
nohup python meta_adapt.py \
  --device_id 6 \
  --domain EUbookshop \
  --finetune_scripts finetune_meta_adapter_unseen.py \
  --generate_scripts generate_base_adapter.py \
  --data_path YOUR_DATA_PATH \
  --adapter_path ADAPTER_PATH \
  --save_path MODEL_PATH \
  --generate_path GENERATE_PATH \
  > log.txt &
```
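Because the adaptation job is detached with `nohup`, you can follow its progress through the redirected log file, for example:

```bash
# Follow the meta-adaptation run; stdout/stderr were redirected to log.txt above.
tail -f log.txt
```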
## Citation

```bibtex
@inproceedings{lai-etal-2022-4,
title = "m{\textasciicircum}4 Adapter: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter",
author = "Lai, Wen and
Chronopoulou, Alexandra and
Fraser, Alexander",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-emnlp.315",
pages = "4282--4296",
abstract = "Multilingual neural machine translation models (MNMT) yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time. However, when a MNMT model is used to translate under domain shift or to a new language pair, performance drops dramatically. We consider a very challenging scenario: adapting the MNMT model both to a new domain and to a new language pair at the same time. In this paper, we propose m{\textasciicircum}4Adapter (Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter), which combines domain and language knowledge using meta-learning with adapters. We present results showing that our approach is a parameter-efficient solution which effectively adapts a model to both a new language pair and a new domain, while outperforming other adapter methods. An ablation study also shows that our approach more effectively transfers domain knowledge across different languages and language information across different domains.",
}
```