Closed Aniruddha-JU closed 1 year ago
Congrats on the great work and thanks for sharing the nice GitHub repo!
I have one question: how did you decide on the interleave probability percentage? Did you follow any rules or previous work?
I am asking about stage 1 pre-training.
Thanks for your interest! The reasons for the stage 1 interleave probability selection are given in Appendix D.1 of the paper.