jjbuck opened 2 weeks ago
Same issue here. Even while saving the processor I get an error like `'Phi3VProcessor' object has no attribute 'chat_template'`.
@ChenRocks any ideas on this?
For visibility, a contributor to a forked version of the Phi 3 Vision cookbook suggested the following solution, stating "You need to remove the wte weight. It's okay because when the model is loaded from the checkpoint, it will automatically copy the weight from the embedding weight."
```python
# Workaround: drop the tied wte weight before saving; it is restored
# from the embedding weight when the checkpoint is loaded.
state_dict = model.state_dict()
state_dict = {k: v for k, v in state_dict.items() if "wte" not in k}
model.save_pretrained(args.save_model_path, state_dict=state_dict, safe_serialization=True)
processor.save_pretrained(args.save_model_path)
```
This does indeed seem to work. However, it doesn't exactly fit into a use case that relies on the `Trainer` abstraction from the `transformers` library. The `Trainer` class's `save_model()` method doesn't accommodate a state dictionary argument (see https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py#L3719-L3768).
I opened a feature request against the `transformers` library to see if there's any opportunity to resolve this on their end. I'm admittedly not familiar enough with the inner workings of `transformers` or of Phi 3/3.5 Vision, but it seems to me that one of three solutions is needed:

1. Support a `state_dict` argument in the `Trainer` class's `save_model()` method (https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py#L3719-L3768). This `state_dict` parameter should then be passed down to the call to the private `_save()` method (https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py#L3842), which does accept a `state_dict` argument.
2. Failing support for `state_dict` as an argument to `save_model()`, determine the appropriate heuristic such that we can successfully save Phi 3.5 Vision and other architecturally similar models.
3. Change the way `transformers` handles shared tensors...?

Can you folks here (@leestott @ChenRocks) comment on idea #2 above? Is there a general abstraction inherent to the architecture of Phi 3/3.5 Vision around which we could develop a heuristic (i.e., to avoid the naive conditional `if microsoft/Phi-3.5-vision-instruct: <do something with state_dict>`)?
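On idea #2, one possible model-agnostic heuristic would be to detect tied parameters directly rather than matching on a model name. The sketch below is only an illustration of that idea: real implementations (e.g. safetensors) compare the storage pointers of torch tensors, whereas here plain Python object identity stands in for shared storage, and the parameter names are merely illustrative.

```python
from collections import defaultdict

def find_tied_groups(state_dict):
    """Group parameter names that share one underlying buffer.

    Simplified sketch: production code compares tensor storage
    pointers; here, plain object identity stands in for that.
    """
    groups = defaultdict(list)
    for name, tensor in state_dict.items():
        groups[id(tensor)].append(name)
    return [names for names in groups.values() if len(names) > 1]

def drop_tied_duplicates(state_dict):
    """Keep only the first name in each tied group, mirroring the
    'remove the wte weight' workaround without hard-coding key names."""
    filtered = dict(state_dict)
    for names in find_tied_groups(state_dict):
        for duplicate in names[1:]:
            del filtered[duplicate]
    return filtered
```

With a heuristic like this, the save path could filter any tied duplicates regardless of which model architecture produced them.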
@jjbuck @vjagannath786 @leestott https://github.com/2U1/Phi3-Vision-Finetune/tree/main
You could check the code I made. The model can be saved by removing the wte weight in the Trainer class too. You could inherit the `Trainer` class from `transformers` and override the `_save` method the way I did it.

https://github.com/2U1/Phi3-Vision-Finetune/blob/406eafbbf8c6d84d2a3cc0878376db0a86c39af2/src/training/trainer.py#L205-L210 This is the exact line of the code.
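For readers who don't want to pull in the whole linked repo, the same workaround can be sketched as a small standalone helper. This is not the linked trainer's exact code, just a hedged sketch assuming the usual Hugging Face `save_pretrained` signatures; the same key filtering is what an overridden `Trainer._save()` would apply.

```python
def save_model_and_processor(model, processor, save_path):
    """Sketch of the thread's workaround as a reusable helper.

    Drops state-dict entries whose key contains 'wte' (the tied
    embedding copy); on reload, the weight is re-tied from the
    remaining embedding. Assumes HF-style save_pretrained APIs.
    """
    state_dict = {k: v for k, v in model.state_dict().items()
                  if "wte" not in k}
    model.save_pretrained(save_path, state_dict=state_dict,
                          safe_serialization=True)
    processor.save_pretrained(save_path)
```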
I’m trying to fine-tune Phi 3.5 Vision using transformers. However, I’m running into an issue trying to save the model during or after training. See below for a minimal reproducible example.
My example below seems to be essentially what's happening in the official "cookbook" example: https://github.com/microsoft/Phi-3CookBook/blob/main/code/04.Finetuning/vision_finetuning/finetune_hf_trainer_docvqa.py#L482-L485.
However, I also see from this other example (https://github.com/microsoft/Phi-3CookBook/blob/6566572c38d53f384801a09dabdd26ad4f7bf76a/code/04.Finetuning/Phi-3-vision-Trainingscript.py#L256) that `safe_serialization=False` is used. Is that strictly required? The example from `finetune_hf_trainer_docvqa.py` doesn't seem to use it, and it's not clear to me how that works successfully.

Does anyone have any pointers? This issue has been reported in a few other locations, but I haven't come across any solutions; see below.
The error suggests saving with `safe_serialization=False`, but I'm not sure what the implications of that are.
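For context on the implication: `safe_serialization=False` falls back to the legacy pickle-based `torch.save` path, which can record a shared buffer once and restore the tie on load, whereas the safetensors format refuses tensors that share memory. The snippet below is only a conceptual stand-in using plain `pickle` on Python lists, not the actual torch serialization internals.

```python
import pickle

# Stand-in for two parameters tied to one buffer (like the wte/embedding tie).
shared = [0.0] * 4
fake_state = {"embed_tokens.weight": shared, "wte.weight": shared}

# The pickle-based path (conceptually, safe_serialization=False) records the
# shared buffer once and restores the tie on load:
restored = pickle.loads(pickle.dumps(fake_state))
assert restored["embed_tokens.weight"] is restored["wte.weight"]
```

The trade-off is that pickle-based checkpoints are larger to validate and can execute arbitrary code on load, which is why safetensors is the default.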
Minimal Reproducible Example