huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft

merge_and_unload docs do not clarify behaviour for quantized base models #2105

Open RonanKMcGovern opened 3 days ago

RonanKMcGovern commented 3 days ago

System Info

NA

Who can help?

@BenjaminBossan could you add a note to the docs explaining the default behaviour of `merge_and_unload` when the base model is quantized, and also any workarounds for best performance (e.g. loading the base model dequantized, loading the adapter on top of it, and then merging)? Thanks

https://huggingface.co/docs/peft/main/en/package_reference/lora#peft.LoraModel.merge_and_unload
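For reference, a minimal sketch of the workaround described above, assuming a LoRA adapter (e.g. trained with QLoRA on a quantized base) that you want to merge into a full-precision copy of the base model; the model id and adapter path below are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"  # placeholder base model id
adapter_id = "path/to/lora-adapter"   # placeholder adapter path

# Load the base model in (half) precision rather than quantized form,
# so the LoRA deltas can be merged into unquantized weights.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Attach the trained LoRA adapter to the full-precision base.
model = PeftModel.from_pretrained(base, adapter_id)

# Merge the adapter weights into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("merged-model")
```

Merging into dequantized weights avoids the rounding error that merging into quantized weights would otherwise introduce; the merged model can then be re-quantized afterwards if needed.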

Reproduction

NA

Expected behavior

NA

BenjaminBossan commented 3 days ago

I agree that the information is a bit sparse. Could you expand on what exactly you would like to see? What is the workaround that you mentioned? Do you mean quantization methods that don't support merging? And what type of performance do you have in mind?