I created a PR for this purpose. One problem is that the LoraConfig is different for each LM; my current one works with the GPT-2 model (the demo given in the README), so there should be a more general way to adapt it.
For the experiment, I tested on a small dataset with distilgpt2. One noticeable thing is that memory usage is reduced by about 50% (with the current r setting in LoraConfig), and training time per epoch is reduced by about 25%. I haven't tested the table quality yet. To be continued.
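For reference, a minimal sketch of what wrapping a GPT-2-family model with peft's LoRA looks like (the r/alpha values here are illustrative, not necessarily the ones in the PR; `target_modules=["c_attn"]` is the GPT-2-specific part that would need generalizing):

```python
# Minimal sketch (not the exact PR code): applying LoRA to distilgpt2 via peft.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# GPT-2-family models fuse their attention projections into a module named
# "c_attn"; other LMs name these layers differently, which is why a single
# hard-coded LoraConfig does not generalize across models.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                       # rank of the LoRA update matrices (illustrative)
    lora_alpha=32,              # scaling factor (illustrative)
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2-specific module name
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trained
```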
Hi @zhao-zilong,
Thank you for your PR!
I apologize for the late response. I had my PhD defense and then a lot of moving between countries and bars.
Regarding your PR, my vision is that LoRA will be an optional feature: if a user wants to speed up fine-tuning/training, they can enable it, but by default the normal training will be used.
I'm happy to accept your pull request if you make these changes. Something like this:

```python
model = GReaT(llm='distilgpt2', batch_size=32, epochs=50, efficient_finetuning='lora')
model.fit(data)
synthetic_data = model.sample(n_samples=100)
```
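For illustration, a hedged sketch of how such an opt-in flag could be handled internally (the function and parameter names here are hypothetical, not the actual great.py code):

```python
# Hypothetical sketch of the opt-in branch; names are illustrative,
# not the merged great.py implementation.
def _maybe_wrap_with_lora(model, efficient_finetuning: str):
    if efficient_finetuning == "lora":
        from peft import LoraConfig, get_peft_model, TaskType
        config = LoraConfig(
            task_type=TaskType.CAUSAL_LM,
            r=16,
            lora_alpha=32,
            lora_dropout=0.05,
            target_modules=["c_attn"],  # works for GPT-2-family LLMs
        )
        return get_peft_model(model, config)
    # Default: return the model unchanged, i.e. normal full fine-tuning.
    return model
```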
Hi @unnir, I updated the great.py code in the PR, please have a look.
And I forgot to say: congratulations!
Thank you, Zhao :)
Approved and merged, thank you for your contribution! I will run some tests :)
Incorporating the LoRA technique would speed up fine-tuning and decrease computational demands. Contributions are welcome.
More about LoRA: https://www.philschmid.de/fine-tune-flan-t5-peft