Open · tnq177 opened this issue 5 years ago
Do you plan to release the pretrained models from Section 4.4, "Low-resource language modeling"? If not, how can this set of experiments be reproduced?

@tnq177: sorry for the late reply. No, we don't plan to release this model, but it basically requires 1) downloading the Nepali Wikipedia through the bash scripts provided in the repo, and 2) training a "CLM" model with the help of the README. That should lead to the same results.
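For later readers, here is roughly what that two-step recipe might look like as commands. This is a sketch assembled from the CLM/MLM examples in this repo's README, not a confirmed reproduction script: the `get-data-wiki.sh` script name, the data paths, the model sizes, and the validation metric name are my assumptions, and the paper's exact Section 4.4 hyperparameters are not given here.

```bash
# 1) Download and preprocess the Nepali Wikipedia
#    (assuming get-data-wiki.sh is the bash script referred to above):
./get-data-wiki.sh ne

# 2) Train a causal LM on Nepali. --clm_steps selects the CLM objective;
#    the metric name follows the README's "_valid_<lang>_<task>_ppl"
#    pattern, where the leading "_" marks it as lower-is-better.
#    Model sizes and epoch_size below are placeholders, not the paper's.
python train.py \
    --exp_name clm_ne \
    --dump_path ./dumped/ \
    --data_path ./data/wiki/ne \
    --lgs 'ne' \
    --clm_steps 'ne' \
    --mlm_steps '' \
    --emb_dim 512 \
    --n_layers 6 \
    --n_heads 8 \
    --dropout 0.1 \
    --attention_dropout 0.1 \
    --gelu_activation true \
    --batch_size 32 \
    --bptt 256 \
    --optimizer adam,lr=0.0001 \
    --epoch_size 200000 \
    --validation_metrics _valid_ne_clm_ppl \
    --stopping_criterion _valid_ne_clm_ppl,10

# For the "+ Hindi" setting, presumably the same command with both
# languages preprocessed and --lgs 'hi-ne' --clm_steps 'hi,ne'.
```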
@aconneau Thanks for your reply. I did exactly that. If I understood the paper correctly, you also used BPE for that set of experiments, right? Using your provided multilingual BPE codes, I got a dev perplexity of ~19 (Nepali only) and ~16.5 (when adding Hindi data). That differs from your reported perplexities of 157 and 115. The latter sound like word-level perplexities, don't they? Thanks.
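If the discrepancy is indeed BPE-level vs. word-level perplexity, the two are easy to relate: the corpus's total negative log-likelihood is the same under either tokenization, so the perplexities differ only through the token counts. A sketch of the conversion follows; the 1.7 tokens-per-word ratio is a made-up placeholder, and the real ratio would have to be measured on the dev set:

```latex
% Same total NLL under both tokenizations, so:
%   ppl_word = exp(NLL / N_word) = ppl_bpe^(N_bpe / N_word)
% With a hypothetical ratio N_bpe / N_word = 1.7:
%   19^{1.7}   ~ 149   (reported: 157)
%   16.5^{1.7} ~ 117   (reported: 115)
\mathrm{ppl}_{\mathrm{word}}
  = \exp\!\left(\frac{\mathrm{NLL}}{N_{\mathrm{word}}}\right)
  = \mathrm{ppl}_{\mathrm{BPE}}^{\;N_{\mathrm{BPE}} / N_{\mathrm{word}}}
```

Under that placeholder ratio the converted numbers land in the same ballpark as the reported word-level figures, which is consistent with the guess above.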