Question about using peft (LoRA)

uclaml / SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

https://uclaml.github.io/SPIN/

Apache License 2.0


Question about using peft (LoRA) #29

Closed JasonJiaxiangLi closed 7 months ago

JasonJiaxiangLi commented 7 months ago

Hi, thanks for the great project and repo. I'm wondering whether you have tried any PEFT techniques such as LoRA?

I added the following three lines to config.yaml:

use_peft: true
lora_r: 16
lora_alpha: 16

and tried to run finetune.sh, but received the following error message:

Please let me know if you have any comments or have seen the same error. (It's also possible that something is wrong with my setup, or that I modified something.) Thanks!

JasonJiaxiangLi commented 7 months ago

Apparently it's because of this line in run_spin.py https://github.com/uclaml/SPIN/blob/e84b7be111b41b388367e591bdc23e327725c869/spin/run_spin.py#L144

It works if I comment that line out, or if I add a check wherever ref_model is called in trainer.py.
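For anyone hitting the same problem, here is a rough sketch of what such a check could look like. It assumes (the thread does not confirm this) that the failure comes from the reference model being None when use_peft is enabled. ToyPeftModel and reference_forward are hypothetical names made up for illustration, not code from the SPIN repo; the toy disable_adapter only imitates the idea of peft's PeftModel.disable_adapter() context manager, which temporarily falls back to the frozen base weights.

```python
from contextlib import contextmanager


class ToyPeftModel:
    """Hypothetical stand-in for a LoRA-wrapped model (illustration only)."""

    def __init__(self, base_weight, lora_delta):
        self.base_weight = base_weight   # frozen base parameter
        self.lora_delta = lora_delta     # trainable adapter contribution
        self.adapter_enabled = True

    @contextmanager
    def disable_adapter(self):
        # Imitates the idea of PeftModel.disable_adapter():
        # temporarily ignore the adapter, then restore it.
        self.adapter_enabled = False
        try:
            yield
        finally:
            self.adapter_enabled = True

    def __call__(self, x):
        weight = self.base_weight
        if self.adapter_enabled:
            weight += self.lora_delta
        return weight * x


def reference_forward(model, ref_model, x):
    """Return the reference output, tolerating ref_model=None (assumed PEFT case)."""
    if ref_model is not None:
        # Full fine-tuning: a separate frozen reference model exists.
        return ref_model(x)
    # PEFT: reuse the policy model with its adapter switched off.
    with model.disable_adapter():
        return model(x)


policy = ToyPeftModel(base_weight=2.0, lora_delta=0.5)
policy_out = policy(3.0)                              # adapter active
reference_out = reference_forward(policy, None, 3.0)  # adapter disabled
```

The point of this design is that the base weights stay frozen under LoRA, so the policy model with its adapter disabled can serve as the reference model and no second copy has to be loaded. Whether this fits the SPIN trainer as-is would need to be checked against trainer.py.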