huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft
Apache License 2.0
16.46k stars 1.62k forks source link

[bug fix] ensure `base_model` is correctly set in model card #2124

Closed valayDave closed 1 month ago

valayDave commented 1 month ago

TLDR:

When we train LoRA's on top of a model loaded from disk, SFTTrainer.push_to_hub fails because of the model card created by Trainer.create_model_card. The Trainer.create_model_card calls the peft library's create_or_update_model_card which overrides the base_model to the path of the model on disk even if the model card created by Trainer.create_model_card might have contained the value of the base_model.

Long Form Context :

The model card's parsing is an essential when pushing the model to hub. If the model card doesn't contain information that HF finds valid, then HF hub raises a ValueError. One of the fields in the model_card injected by the peft library is the base_model field. This field is set earlier using the model_config["_name_or_path"]. A problem occurs when the model_config["_name_or_path"] is not a model name on HuggingfaceHub but rather just a path on local file system. At this point the hf_hub's folder_upload method crashes. Now if the card already contained a base_modelset then uploading the model from disk is still possible. One way to set the base_model's value is to pass the SFTTrainer.push_to_hub/Trainer.push_to_hub kwargs which are passed down to the create_model_card function. These kwargs can contain a field called finetuned_from which allow the model card creation properly.

BenjaminBossan commented 1 month ago

Thanks for this PR. Your suggestion and explanation sound reasonable.

Before proceeding, do you have a small code example to reproduce the current error? I'm asking because I wonder if we can craft a unit test based on that. It's always good to have a unit test to accompany a bug fix so that the same bug cannot later be re-introduced by accident.

valayDave commented 1 month ago

Okay, I think this issue got resolved for me as I explained over here https://github.com/huggingface/transformers/issues/33922. Overall since TRL changed the create_model_card function, I was able to make this work without needing the change in PEFT.