Cornell-RelaxML / quip-sharp


e2e finetuning requirement #79

Open dorsa-zeinali opened 5 days ago

dorsa-zeinali commented 5 days ago

Hi, what context size and devset size do you think are reasonable for the e2e finetuning step, given that I have 1 GPU with 48GB? Thank you so much.

tsengalb99 commented 5 days ago

I would use the context length of the original model. IIRC the devset is stored in CPU memory, so that shouldn't be an issue for GPU memory. You may want to look at my comment in the other thread; you should be able to rewrite the e2e fine-tuning script to be more memory efficient fairly easily. Also, if you're quantizing to 4 bits, e2e doesn't really do anything, so I wouldn't bother with it there.
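
For anyone landing here later, this is roughly the pattern being described: keep the devset in CPU memory and stream one small batch at a time to the GPU, so GPU memory only has to hold the model, one batch of activations, and optimizer state. A minimal sketch, assuming an HF-style causal LM and a standard PyTorch optimizer; the names (`finetune_epoch`, `devset_cpu`) are hypothetical and this is not the actual quip-sharp fine-tuning script:

```python
import torch

def finetune_epoch(model, devset_cpu, optimizer, batch_size=1, device="cuda"):
    # devset_cpu: LongTensor of token ids, shape (num_seqs, seqlen),
    # deliberately kept on the CPU for the whole epoch.
    model.train()
    for i in range(0, devset_cpu.size(0), batch_size):
        # Only this one batch is ever resident on the GPU.
        batch = devset_cpu[i : i + batch_size].to(device, non_blocking=True)
        loss = model(batch, labels=batch).loss  # HF-style causal-LM loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
        del batch, loss  # free this batch's GPU tensors before the next one
```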