
Confused when fine-tuning XLM-R as a language model on a monolingual dataset #1724

Closed Luvata closed 4 years ago

Luvata commented 4 years ago

❓ Questions and Help

I want to fine-tune the XLM-R language model on my additional monolingual dataset. After some research, I think my steps are:

What is your question?

The preprocessing part seems to work correctly, but for training I'm really confused about which task and model architecture to choose.
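For context, this is roughly the kind of preprocessing I mean (a minimal sketch, assuming the sentencepiece.bpe.model and dict.txt that ship with the xlmr.large download; paths are placeholders):

```python
# Sketch only: encode raw text with XLM-R's sentencepiece model,
# then binarize it with fairseq-preprocess using the released dictionary.
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("xlmr.large/sentencepiece.bpe.model")

with open("corpus.raw.txt") as fin, open("corpus.spm.txt", "w") as fout:
    for line in fin:
        fout.write(" ".join(sp.EncodeAsPieces(line.strip())) + "\n")

# Afterwards, e.g.:
#   fairseq-preprocess --only-source \
#       --srcdict xlmr.large/dict.txt \
#       --trainpref corpus.spm.txt \
#       --destdir data-bin/my_monolingual \
#       --workers 8
```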

I see masked_lm, language_modeling and multilingual_masked_lm. In addition, the XLM-R README mentions that

xlmr.large | XLM-R using the BERT-large architecture

But I also see that XLM-R is a subclass of RoBERTa, so which --arch should I use: roberta_large, bert_large, or xlm?
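For what it's worth, a quick check (assuming a recent fairseq install) does suggest the XLM-R model class is built on top of RobertaModel:

```python
# Confirm that fairseq's XLM-R model class derives from RobertaModel.
from fairseq.models.roberta import RobertaModel, XLMRModel

print(issubclass(XLMRModel, RobertaModel))  # expected: True
```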

Thank you in advance

ngoyal2707 commented 4 years ago

Is your monolingual data in one of XLM-R's 100 languages?

Luvata commented 4 years ago

Yes, it is.

ngoyal2707 commented 4 years ago

Sorry, it's a bit confusing. Use --arch roberta_large and --task multilingual_masked_lm.
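Roughly something like the following should work; treat it as a minimal sketch rather than a tuned recipe (the data path, checkpoint location, batch size, and schedule are placeholders), warm-starting from the released xlmr.large checkpoint:

```python
# Sketch of the fine-tuning invocation (equivalent to calling fairseq-train
# from the command line); all paths and hyperparameters are placeholders.
import sys
from fairseq_cli import train

sys.argv = [
    "fairseq-train", "data-bin/my_monolingual",
    # Note: if I remember correctly, multilingual_masked_lm expects the
    # binarized data under per-language subdirectories of this path.
    "--task", "multilingual_masked_lm",
    "--arch", "roberta_large",
    "--criterion", "masked_lm",
    "--restore-file", "xlmr.large/model.pt",
    "--reset-optimizer", "--reset-dataloader", "--reset-meters",
    "--tokens-per-sample", "512",
    "--optimizer", "adam",
    "--lr", "0.0001",
    "--lr-scheduler", "polynomial_decay",
    "--warmup-updates", "1000",
    "--total-num-update", "100000",
    "--max-update", "100000",
    "--max-sentences", "8",
    "--update-freq", "4",
]
train.cli_main()
```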