Closed valayDave closed 1 month ago
Thanks for this PR. Your suggestion and explanation sound reasonable.
Before proceeding, do you have a small code example to reproduce the current error? I'm asking because I wonder if we can craft a unit test based on that. It's always good to have a unit test to accompany a bug fix so that the same bug cannot later be re-introduced by accident.
Okay, I think this issue got resolved for me as I explained over here https://github.com/huggingface/transformers/issues/33922. Overall since TRL changed the create_model_card
function, I was able to make this work without needing the change in PEFT.
TLDR:
When we train LoRA's on top of a model loaded from disk,
SFTTrainer.push_to_hub
fails because of the model card created byTrainer.create_model_card
. TheTrainer.create_model_card
calls the peft library'screate_or_update_model_card
which overrides thebase_model
to the path of the model on disk even if the model card created byTrainer.create_model_card
might have contained the value of thebase_model
.Long Form Context :
The model card's parsing is an essential when pushing the model to hub. If the model card doesn't contain information that HF finds valid, then HF hub raises a ValueError. One of the fields in the model_card injected by the
peft
library is thebase_model
field. This field is set earlier using themodel_config["_name_or_path"]
. A problem occurs when themodel_config["_name_or_path"]
is not a model name on HuggingfaceHub but rather just a path on local file system. At this point the hf_hub'sfolder_upload
method crashes. Now if the card already contained abase_model
set then uploading the model from disk is still possible. One way to set thebase_model
's value is to pass theSFTTrainer.push_to_hub
/Trainer.push_to_hub
kwargs
which are passed down to thecreate_model_card
function. Thesekwargs
can contain a field calledfinetuned_from
which allow the model card creation properly.