CZWin32768 / XNLG

AAAI-20 paper: Cross-Lingual Natural Language Generation via Pre-Training
https://arxiv.org/abs/1909.10481

Code implementation for XLM-E #19

Open aloka-fernando opened 1 year ago

aloka-fernando commented 1 year ago

I need to run pre-training with the objectives used by XLM-E. Could you please share the pre-processing and pre-training commands? Thank you.

CZWin32768 commented 1 year ago

Thank you for your comments and interest in our work. As the work was done during my internship at Microsoft, I am not able to share the code without authorization due to company policy. Nonetheless, I would like to answer your questions if you have specific problems on pre-processing or pre-training.

aloka-fernando commented 1 year ago

Thanks a lot for your response. Do you think I could recreate your work starting from ELECTRA (Clark et al., 2020), or is there someone I could contact about getting access to the codebase?

I would also like to confirm the differences between XLM-E and ELECTRA. Other than XLM-E being multilingual and using parallel data with the Translation Replaced Token Detection (TRTD) objective, what are the major differences? For reference, my rough understanding of how a TRTD-style objective could be built on top of ELECTRA is sketched below.
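
This is only a minimal sketch of my understanding, not your implementation: a translation pair is concatenated into one sequence, an MLM generator corrupts some tokens, and a discriminator predicts per token whether it was replaced. The model names, masking rate, and the frozen generator are placeholders/assumptions on my side.

```python
# Minimal ELECTRA-style TRTD sketch (illustrative only, not the XLM-E code).
# Assumptions: Hugging Face XLM-R checkpoints as generator/discriminator,
# 15% masking, generator frozen (ELECTRA trains it jointly with an MLM loss).
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    AutoModelForTokenClassification,
)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
generator = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")
discriminator = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2  # 0 = original token, 1 = replaced token
)

def trtd_loss(src_sentence: str, tgt_sentence: str, mask_prob: float = 0.15):
    # 1) Concatenate the translation pair into a single input sequence.
    enc = tokenizer(src_sentence, tgt_sentence, return_tensors="pt")
    input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]

    # 2) Randomly mask a fraction of the non-special tokens.
    special = torch.tensor(
        tokenizer.get_special_tokens_mask(
            input_ids[0].tolist(), already_has_special_tokens=True
        ),
        dtype=torch.bool,
    ).unsqueeze(0)
    mask = (torch.rand_like(input_ids, dtype=torch.float) < mask_prob) & ~special
    masked_ids = input_ids.clone()
    masked_ids[mask] = tokenizer.mask_token_id

    # 3) Sample replacements for the masked positions from the generator.
    with torch.no_grad():
        gen_logits = generator(input_ids=masked_ids,
                               attention_mask=attention_mask).logits
    sampled = torch.multinomial(gen_logits.softmax(dim=-1)[0],
                                num_samples=1).squeeze(-1).unsqueeze(0)
    corrupted_ids = input_ids.clone()
    corrupted_ids[mask] = sampled[mask]

    # 4) The discriminator predicts, per token, whether it was replaced.
    labels = (corrupted_ids != input_ids).long()
    out = discriminator(input_ids=corrupted_ids,
                        attention_mask=attention_mask, labels=labels)
    return out.loss

loss = trtd_loss("How are you?", "Comment allez-vous ?")
```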

In my research, I am using parallel data to improve the cross-lingual representations of multilingual pre-trained models, with a focus on low-resource languages. In what ways do you think the work in the XLM-E paper could be extended while still making use of parallel data? Thank you.