huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.2k stars, 357 forks

Is QLoRA better than finetuning? #98

Open normster opened 6 months ago

normster commented 6 months ago

The results reported in https://github.com/huggingface/alignment-handbook/pull/88 suggest that QLoRA is better than full fine-tuning for both SFT and DPO. Is this accurate, and has anyone seen the same thing in other settings?