princeton-nlp / MeZO

[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
MIT License

MeZO on continued pre-training #13

Open shan23chen opened 1 year ago

shan23chen commented 1 year ago

Hi!

Really nice work! I am wondering whether anyone has tried MeZO for continued pre-training of an LLM. If so, I would love to know what the performance and insights are. Thanks!

Shan

gaotianyu1350 commented 1 year ago

Hi Shan,

Thanks for your interest in our work! We haven't tried continued pre-training with MeZO, but it should work in that setting as well. Would love to know the results if you try it out!
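For reference, here is a minimal sketch of how a single MeZO (zeroth-order SPSA) step could be applied to a causal language-modeling loss for continued pre-training, following the two-forward-pass procedure from the paper (https://arxiv.org/abs/2305.17333). It assumes PyTorch and Hugging Face Transformers; the function names, model choice, and hyperparameters are illustrative and are not the repository's actual API.

```python
# Sketch of one MeZO step on a causal-LM loss (continued pre-training objective).
# Hyperparameters and names here are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

def perturb(model, eps, seed, scale):
    # Regenerate the same Gaussian direction z from `seed` and add scale * eps * z in place.
    gen = torch.Generator().manual_seed(seed)
    for p in model.parameters():
        z = torch.randn(p.shape, generator=gen, dtype=p.dtype)
        p.data.add_(scale * eps * z)

@torch.no_grad()
def mezo_step(model, batch, eps=1e-3, lr=1e-6):
    seed = torch.randint(0, 2**31 - 1, (1,)).item()

    perturb(model, eps, seed, +1)   # theta + eps * z
    loss_plus = model(**batch, labels=batch["input_ids"]).loss.item()

    perturb(model, eps, seed, -2)   # theta - eps * z
    loss_minus = model(**batch, labels=batch["input_ids"]).loss.item()

    perturb(model, eps, seed, +1)   # restore theta

    grad_est = (loss_plus - loss_minus) / (2 * eps)  # projected gradient estimate

    # SGD-style update along the same direction z, regenerated from the seed.
    gen = torch.Generator().manual_seed(seed)
    for p in model.parameters():
        z = torch.randn(p.shape, generator=gen, dtype=p.dtype)
        p.data.add_(-lr * grad_est * z)
    return loss_plus

# Usage: raw text with labels = input_ids gives the standard LM loss used in pre-training.
batch = tokenizer("MeZO needs only forward passes.", return_tensors="pt")
print(mezo_step(model, batch))
```

The only change relative to the fine-tuning setup is the objective: instead of a task-specific prompt/label format, the loss is the next-token prediction loss on unlabeled text, so the same memory-efficient two-forward-pass update applies unchanged.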