ApoorvFrontera opened this issue 3 months ago
According to the BitsAndBytesConfig documentation, all of the linear layers are replaced by FP4/NF4 layers when load_4bit is set. That is why the size-mismatch error is reported.
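For reference, here is a minimal sketch (my assumption: bitsandbytes is installed and a CUDA device is available) of where the [8388608, 1] shape in the error comes from: 4-bit quantization packs two values per uint8 byte, so a [4096, 4096] fp16 weight is stored as a flattened tensor of 4096 * 4096 / 2 = 8388608 bytes.

import torch
import bitsandbytes as bnb

# Sketch only: quantize a 4096x4096 fp16 weight to NF4 and inspect the packed storage.
w = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
w_4bit, quant_state = bnb.functional.quantize_4bit(w, quant_type="nf4")
print(w_4bit.shape, w_4bit.dtype)  # torch.Size([8388608, 1]) torch.uint8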
A temporary workaround is to initialize an unquantized model, load the projector weights into it, and save the full model weights. The saved checkpoint can then be loaded successfully with load_4bit=True:
from videollama2 import model_init

# Load the base model unquantized; the 16-bit mm_projector weights load without a shape mismatch.
model, processor, tokenizer = model_init('DAMO-NLP-SG/VideoLLaMA2-7B-Base')
# Clear the projector-tuning flag so the saved checkpoint is not treated as projector-only.
model.config.tune_mm_mlp_adapter = False
# Save the merged model and tokenizer as a single full checkpoint.
model.save_pretrained('VideoLLaMA2-7B-full')
tokenizer.save_pretrained('VideoLLaMA2-7B-full')
# Reload the merged checkpoint in 4-bit; no separate projector loading is needed now.
model, processor, tokenizer = model_init('VideoLLaMA2-7B-full', load_4bit=True)
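As a quick sanity check (not from the original code; the Linear4bit import and the module-name filter are my assumptions), you can confirm that the projector layers of the reloaded model are now bitsandbytes 4-bit modules, i.e. the 16-bit projector weights were quantized as part of the full checkpoint instead of being copied into packed 4-bit parameters:

from bitsandbytes.nn import Linear4bit

# List the projector's linear layers; after load_4bit=True they should be Linear4bit
# modules with packed uint8 weights of shape [out_features * in_features / 2, 1].
for name, module in model.named_modules():
    if "mm_projector" in name and isinstance(module, Linear4bit):
        print(name, tuple(module.weight.shape), module.weight.dtype)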
Hi VideoLLaMA Team,
I am facing issues while loading the base models in 4-bit precision. The following lines try to load the mm_projector_weights, which are stored in 16-bit precision, into a model that expects 4-bit weights, leading to errors.
Code used for loading the model for inference:
model_path = 'DAMO-NLP-SG/VideoLLaMA2-8x7B-Base'
model, processor, tokenizer = model_init(model_path, load_4bit=True)
Problematic part of the code (lines 171-172): https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/__init__.py#L171-L172
mm_projector_weights = load_mm_projector(model_path, token=token)
model.load_state_dict(mm_projector_weights, strict=False)
Error:
RuntimeError: Error(s) in loading state_dict for Videollama2MistralForCausalLM:
    size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
    size mismatch for model.mm_projector.readout.2.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
How can we use the 16-bit mm_projector_weights with 4-bit models?
Hello, I'm a PhD student from ZJU. I also use VideoLLaMA2 in my own research. We have created a WeChat group to discuss VideoLLaMA2 issues and help each other; could you join us? Please contact me: WeChat LiangMeng19357260600, phone +86 19357260600, e-mail liangmeng89@zju.edu.cn.