VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Error while loading custom finetuned QLoRA model in 4-bit: size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]). #71
Hi Team,
I have successfully finetuned a QLoRA adapter on a custom dataset. When I try to load it in full precision, it loads and works well.
But this takes too much time and GPU memory at inference, so I wanted to load the model in 4-bit precision by passing the load_4bit parameter:
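Roughly, the loading call goes through builder.py's load_pretrained_model (a sketch; the checkpoint paths are placeholders for my local adapter, and I am assuming the LLaVA-style signature and return order):

```python
from videollama2.model.builder import load_pretrained_model

# Placeholder paths: my locally saved QLoRA adapter plus the base LLM.
model_path = "./checkpoints/videollama2-qlora-custom"  # hypothetical local path
model_base = "mistralai/Mistral-7B-Instruct-v0.2"

# Works when load_4bit is left False (full precision), fails when True.
tokenizer, model, processor, context_len = load_pretrained_model(
    model_path,
    model_base,
    model_name="videollama2-qlora-custom",  # name containing "lora" selects the
    load_4bit=True,                          # LoRA branch in LLaVA-style builders
)
```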
While running this I get the following error:

RuntimeError: Error(s) in loading state_dict for Videollama2MistralForCausalLM:
size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
size mismatch for model.mm_projector.readout.2.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
I have debugged and understood the reason: when we pass load_4bit=True, the VideoLLaMA2 model params are loaded as 4-bit params. The LLM weights get initialized from the base model (in this case mistralai/Mistral-7B-Instruct-v0.2) in 4-bit, but the mm_projector does not (I am guessing it is initialized with random values, but as the Params4bit class). Line: https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/builder.py#L76
model = Videollama2MistralForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, config=lora_cfg_pretrained, **kwargs)
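The [8388608, 1] shape in the error is consistent with bitsandbytes' 4-bit packing: Params4bit stores the quantized weight as a flat column tensor with two 4-bit values per byte, so a 4096 x 4096 matrix packs into 4096 * 4096 / 2 bytes. A quick check:

```python
# Two 4-bit values fit in one uint8 byte, and bitsandbytes keeps the
# packed data as a column tensor of shape (n, 1).
n_weights = 4096 * 4096   # elements in one readout Linear weight
packed = n_weights // 2   # bytes after 4-bit quantization
print(packed)             # 8388608 -> matches torch.Size([8388608, 1])
```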
Now, to initialize the weights for the mm_projector, we try to load them from the previously saved non_lora_trainables.bin, which was stored in full-precision format. Line: https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/builder.py#L101
model.load_state_dict(non_lora_trainables, strict=False)
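Note that strict=False only suppresses missing and unexpected keys; PyTorch still raises on shape mismatches, which is why this line fails once the projector has been initialized as packed 4-bit parameters. A minimal standalone repro of that behavior (not VideoLLaMA2-specific):

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 4, bias=False)
# Wrong-shaped checkpoint entry, analogous to fp16 [4096, 4096] weights
# meeting a packed 4-bit [8388608, 1] parameter in the live model.
state = {"weight": torch.zeros(8, 1)}

# strict=False ignores missing/unexpected keys but NOT size mismatches:
layer.load_state_dict(state, strict=False)
# RuntimeError: Error(s) in loading state_dict for Linear:
#   size mismatch for weight: copying a param with shape torch.Size([8, 1]) ...
```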
Debugging Outputs

[Screenshot: model params after loading with load_4bit=True]
[Screenshot: previously saved non_lora_trainables params]
Please advise on how to resolve this.