❓ Questions and Help

I want to finetune the XLM-R language model on my additional monolingual dataset. After some research, I think my steps are:

1. `prepare-iwslt17-multilingual.sh` to encode my additional dataset with XLM-R's learned sentencepiece model
2. `fairseq-preprocess` with XLM-R's `dict.txt` to binarize the dataset from the .bpe files
3. `fairseq-train`

What is your question?

The preprocessing part seems to work correctly, but for training I'm really confused about which task and model architecture to choose.

I see `masked_lm`, `language_modeling` and `multilingual_masked_lm`. In addition, the XLM-R README mentions:

`xlmr.large` | XLM-R using the BERT-large architecture

But I also see that XLM-R is a subclass of RoBERTa, so which `--arch` should I use: `roberta_large`, `bert_large` or `xlm`?

Thank you in advance.
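For reference, a minimal version of steps 1–2 could look like the sketch below. It is only illustrative: it calls fairseq's `scripts/spm_encode.py` directly instead of adapting `prepare-iwslt17-multilingual.sh`, the paths and the `en` language code are placeholders, and the per-language sub-directory under `data-bin` is an assumption about how the `multilingual_masked_lm` task discovers languages. `sentencepiece.bpe.model` and `dict.txt` come from the downloaded `xlmr.large` archive.

```bash
# Encode raw monolingual text with XLM-R's own sentencepiece model
# (shipped as sentencepiece.bpe.model in the xlmr.large download).
python scripts/spm_encode.py \
    --model /path/to/xlmr.large/sentencepiece.bpe.model \
    --inputs data/train.txt data/valid.txt \
    --outputs data/train.bpe data/valid.bpe

# Binarize with XLM-R's dictionary so token ids match the checkpoint.
# The trailing /en is a placeholder language sub-directory (assumption:
# the multilingual_masked_lm task looks for one folder per language).
fairseq-preprocess \
    --only-source \
    --srcdict /path/to/xlmr.large/dict.txt \
    --trainpref data/train.bpe \
    --validpref data/valid.bpe \
    --destdir data-bin/xlmr-mono/en \
    --workers 8
```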
Is your monolingual data in one of `xlmr`'s 100 languages?

Yes, it is.

Sorry, it's a bit confusing. Use `--arch roberta_large` and `--task multilingual_masked_lm`.
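Putting the suggested flags together, a training command could look roughly like the sketch below. It is not a verified recipe: continuing from the released checkpoint via `--restore-file` follows fairseq's usual finetuning pattern, the hyperparameters are borrowed from the RoBERTa pretraining example and are only illustrative, and all paths are placeholders.

```bash
# Continue masked-LM training from the released XLM-R checkpoint on the
# binarized monolingual data (data-bin/xlmr-mono contains one folder per language).
fairseq-train data-bin/xlmr-mono \
    --task multilingual_masked_lm --criterion masked_lm \
    --arch roberta_large \
    --restore-file /path/to/xlmr.large/model.pt \
    --reset-optimizer --reset-dataloader --reset-meters \
    --sample-break-mode complete --tokens-per-sample 512 \
    --optimizer adam --adam-betas '(0.9,0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr 0.0001 \
    --warmup-updates 1000 --total-num-update 100000 --max-update 100000 \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --max-tokens 2048 --update-freq 16 \
    --log-format simple --log-interval 100
```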