fe1ixxu / ALMA

State-of-the-art LLM-based translation models.
MIT License

About the interleave probability selections #4

Closed by Aniruddha-JU 1 year ago

Aniruddha-JU commented 1 year ago

Congrats on the great work, and thanks for sharing such a nice GitHub repo!

I have one question: how did you decide on the interleave probabilities? Did you follow any rules or previous work?

Aniruddha-JU commented 1 year ago

I am asking about the stage 1 pre-training.

fe1ixxu commented 1 year ago

Thanks for your interest! The reasons behind the interleave probability selection for stage 1 are given in Appendix D.1 of the paper.
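
For context, an interleave probability is the chance that the next training example is drawn from a given dataset when several datasets are mixed into one stream. The following is a minimal sketch of that idea in plain Python, not the ALMA implementation: the function name `interleave`, the language codes, and the probabilities are all hypothetical, and the actual stage 1 values are the ones discussed in Appendix D.1.

```python
import random
from typing import Dict, Iterator, List


def interleave(
    sources: Dict[str, List[str]],
    probs: Dict[str, float],
    seed: int = 0,
) -> Iterator[str]:
    """Yield examples by choosing a source at each step according to `probs`.

    Hypothetical helper for illustration only. It stops as soon as the
    chosen source has no examples left (a "first exhausted" strategy).
    """
    rng = random.Random(seed)
    names = list(sources)
    weights = [probs[name] for name in names]
    iterators = {name: iter(sources[name]) for name in names}
    while True:
        # Pick which dataset supplies the next example.
        name = rng.choices(names, weights=weights, k=1)[0]
        try:
            yield next(iterators[name])
        except StopIteration:
            return


# Made-up corpora and probabilities, purely for demonstration.
corpora = {
    "en": [f"en-{i}" for i in range(1000)],
    "de": [f"de-{i}" for i in range(1000)],
    "cs": [f"cs-{i}" for i in range(1000)],
}
example_probs = {"en": 0.5, "de": 0.3, "cs": 0.2}

mixed = list(interleave(corpora, example_probs, seed=42))
share = {
    lang: sum(item.startswith(lang) for item in mixed) / len(mixed)
    for lang in corpora
}
print(share)  # observed shares should land near the requested probabilities
```

Raising a dataset's probability increases its share of the mixed stream, so the choice of probabilities effectively decides how much each language contributes to training.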