Closed DaehanKim closed 1 year ago
Hi! Thanks for the interest! Our latest results show that when applied to a 350M model, ReLoRA can perform just as well as regular training. We are currently scaling ReLoRA up to test it on Pythia-1B with EleutherAI. You can see the 1B code in the dev branch.
Hey~ what were the results of the Pythia-1B run?
The 6B model does not seem to work.
Hi! Thank you for this insightful work. I wonder how your research on large language models is going. Is your 350M run finished? I'm also curious how good billion-scale models can be. Do you have any plans to scale up your experiments?