facebookresearch / XLM

PyTorch original implementation of Cross-lingual Language Model Pretraining.

Why batch of same language? #311


hwijeen commented 4 years ago

Hi, can I ask a question that is not necessarily code-related? I've been thinking about this for a long time but still haven't found an answer. I would appreciate your help!

It is mentioned in the XLM paper:

> At each iteration, a batch is composed of sentences coming from the same language, which is sampled from the distribution `{q_i}_{i=1...N}` above, with α = 0.7.
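For concreteness, here is a minimal sketch of what that sampling looks like, assuming the paper's definitions (`p_i = n_i / sum_k n_k` is each language's share of the total corpus, and `q_i = p_i^α / sum_j p_j^α` rescales it to upsample low-resource languages). This is not the repository's actual code, and the sentence counts are made up:

```python
import numpy as np

# Hypothetical per-language sentence counts, not the actual XLM data.
sentence_counts = {"en": 10_000_000, "fr": 2_000_000, "ne": 50_000}
alpha = 0.7

langs = list(sentence_counts)
p = np.array([sentence_counts[l] for l in langs], dtype=np.float64)
p /= p.sum()        # p_i = n_i / sum_k n_k
q = p ** alpha
q /= q.sum()        # q_i = p_i^alpha / sum_j p_j^alpha

rng = np.random.default_rng(0)
for step in range(3):
    # One language is sampled per iteration; the whole batch
    # is then drawn from that language's monolingual corpus.
    lang = rng.choice(langs, p=q)
    print(step, lang)
```

With α < 1, `q` is flatter than `p`, so low-resource languages like "ne" are sampled more often than their raw corpus share would suggest.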

Was there a theoretical or practical reason for doing this? A naive alternative would be to concatenate the data of all languages and randomly sample each batch, so that a single batch contains samples from different languages. Thank you in advance!