facebookresearch / XLM

PyTorch original implementation of Cross-lingual Language Model Pretraining.

XLM-R fine-tune in MLQA dataset #312

Open ztl-35 opened 4 years ago

ztl-35 commented 4 years ago

Hi, I am working on the MLQA task (https://arxiv.org/abs/1910.07475) and would like to know how to fine-tune your XLM-R model on it. The Google Research BERT repository has a source file (run_squad.py: https://github.com/google-research/bert/blob/master/run_squad.py) that shows how to fine-tune for an extractive MRC task. Your XLM-R paper gives three links to code, but none of them show how to fine-tune for MLQA specifically, and the code is all encapsulated in toolkits (fairseq, pytext). Thanks!
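
For reference, here is a minimal sketch of what extractive-QA fine-tuning of XLM-R could look like with the HuggingFace transformers library (this is not the authors' code; the model name "xlm-roberta-base" and the dummy answer span below are assumptions for illustration):

```python
# Minimal sketch: fine-tuning XLM-R for extractive QA (SQuAD/MLQA-style)
# with HuggingFace transformers, analogous in spirit to BERT's run_squad.py.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForQuestionAnswering.from_pretrained("xlm-roberta-base")

question = "Where is the Eiffel Tower?"
context = "The Eiffel Tower is located in Paris, France."
inputs = tokenizer(question, context, return_tensors="pt")

# During fine-tuning, the start/end token positions of the gold answer span
# supply the cross-entropy targets; these are dummy values for illustration.
start_positions = torch.tensor([10])
end_positions = torch.tensor([11])

outputs = model(**inputs, start_positions=start_positions, end_positions=end_positions)
loss = outputs.loss   # average of the start- and end-position losses
loss.backward()       # an optimizer step (e.g. AdamW) would follow in a real loop
```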

tnq177 commented 2 years ago

@ztl-35 I'm looking for this too. I think the paper only does zero-shot evaluation: the XLM-R model is fine-tuned on the English SQuAD dataset and then evaluated on the rest of the languages. They mention following the XLT procedure from https://arxiv.org/pdf/1910.07475.pdf. That paper in turn links to https://aclanthology.org/N19-1423.pdf (see the SQuAD section), which says "We fine-tune for 3 epochs with a learning rate of 5e-5 and a batch size of 32". For the other tasks, which require translated versions of SQuAD, I think the data is included here: https://github.com/facebookresearch/mlqa#translate-train-and-translate-test-data
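
A rough sketch of that zero-shot XLT recipe, using the HuggingFace Trainer API as a stand-in (an assumption; the paper itself used fairseq). The hyperparameters are the ones quoted above: 3 epochs, learning rate 5e-5, batch size 32:

```python
from transformers import TrainingArguments

# Step 1: fine-tune XLM-R on English SQuAD v1.1 with these settings.
training_args = TrainingArguments(
    output_dir="xlmr-squad-en",
    num_train_epochs=3,
    learning_rate=5e-5,
    per_device_train_batch_size=32,
)

# Step 2: without any further training, run inference on the MLQA dev/test
# sets for the other languages (zero-shot cross-lingual transfer, "XLT").
```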

tnq177 commented 2 years ago

I just used HuggingFace to get XLM-R and tried to reproduce the zero-shot results. The zh results were way off; it turns out the evaluation needs to be tweaked: https://github.com/huggingface/transformers/issues/3510
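
The reason the zh scores look off with the stock SQuAD metric is that its F1 splits answers on whitespace, so an entire Chinese answer collapses into a single "token". The official MLQA evaluation script (mlqa_evaluation_v1.py in facebookresearch/mlqa) instead segments per language; the sketch below only mimics the idea by treating each CJK character as a token, and is an illustration rather than the official implementation:

```python
import re

def naive_squad_tokens(text):
    # Whitespace-only split, as in the original SQuAD F1 -- fails for Chinese.
    return text.split()

def char_level_tokens(text):
    # Treat each CJK character as its own token; whitespace-split everything else.
    tokens = []
    for piece in re.split(r"([\u4e00-\u9fff])", text):
        if re.match(r"[\u4e00-\u9fff]", piece):
            tokens.append(piece)
        else:
            tokens.extend(piece.split())
    return tokens

print(naive_squad_tokens("埃菲尔铁塔在巴黎"))  # ['埃菲尔铁塔在巴黎'] -- one token, F1 is all-or-nothing
print(char_level_tokens("埃菲尔铁塔在巴黎"))   # ['埃', '菲', '尔', ...] -- per-character overlap
```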