LLaVA-VL / LLaVA-NeXT


Size Mismatch Issue in `mm_projector.bin` for `llava-onevision-qwen2-0.5b-ov` Model #177


fancy12335 commented 3 weeks ago

I encountered the following error when running the `finetune_onevision.sh` script using the `mm_projector.bin` file provided at this link:

pretrain_mm_mlp_adapter:/home/LLaVA-NeXT-PROJECT/inputModel/llava-onevision-projectors/0.5b/mm_projector.bin
Traceback (most recent call last):
  File "/home/LLaVA-NeXT-PROJECT/LLaVA-NeXT/llava/train/train_mem.py", line 4, in <module>
    train()
  File "/home/LLaVA-NeXT-PROJECT/LLaVA-NeXT/llava/train/train.py", line 1549, in train
    model.get_model().initialize_vision_modules(model_args=model_args, fsdp=training_args.fsdp)
  File "/home/LLaVA-NeXT-PROJECT/LLaVA-NeXT/llava/model/llava_arch.py", line 115, in initialize_vision_modules
    incompatible_keys = self.mm_projector.load_state_dict(get_w(mm_projector_weights, "mm_projector"))
  File "/opt/conda/envs/llava_next/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Sequential:
        size mismatch for 0.weight: copying a param with shape torch.Size([896, 1152]) from checkpoint, the shape in current model is torch.Size([0]).
        size mismatch for 0.bias: copying a param with shape torch.Size([896]) from checkpoint, the shape in current model is torch.Size([0]).
        size mismatch for 2.weight: copying a param with shape torch.Size([896, 896]) from checkpoint, the shape in current model is torch.Size([0]).
        size mismatch for 2.bias: copying a param with shape torch.Size([896]) from checkpoint, the shape in current model is torch.Size([0]).
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.

I have only modified the following in the `finetune_onevision.sh` script: the model path (set to the local path of the downloaded `llava-onevision-qwen2-0.5b-ov` model), the dataset path, the `mm_projector.bin` path (replaced with its local path), and the output path for the fine-tuned model.

Could you please confirm if there is an issue with the provided mm_projector.bin file?

YerongLi commented 3 weeks ago

Where did you download the `mm_projector.bin` from?

Luodian commented 3 weeks ago

You should download them directly via `git lfs`.

The link is: https://huggingface.co/lmms-lab/llava-onevision-projectors/tree/main

Luodian commented 3 weeks ago

https://github.com/LLaVA-VL/LLaVA-NeXT/issues/145

fancy12335 commented 2 weeks ago

> #145

I did download the projection layer weights from that link, and they matched the model, yet I still encountered this error. However, I have identified and resolved the issue. While debugging, I noticed that at line 108 of LLaVA-NeXT/llava/model/llava_arch.py, `self.mm_projector` had a size of 0. I added the following line at line 109: `self.mm_projector = build_vision_projector(self.config, vision_cfg=vision_tower.config)`, and this allowed the script to run correctly. I hope this change is helpful to you as well. Thank you for your support.
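For reference, a minimal sketch of where that line fits inside `initialize_vision_modules`. The `build_vision_projector` call is taken verbatim from the comment above; the surrounding load logic is paraphrased from the traceback and is an approximation of the actual file, not a copy of it:

```python
# llava/model/llava_arch.py, inside initialize_vision_modules() (sketch)

# Rebuild the projector so its parameters are fully materialized; under
# DeepSpeed ZeRO-3 the previously created one reports shape [0], and
# load_state_dict fails with the size mismatch shown in the traceback.
self.mm_projector = build_vision_projector(self.config, vision_cfg=vision_tower.config)

if model_args.pretrain_mm_mlp_adapter is not None:
    mm_projector_weights = torch.load(model_args.pretrain_mm_mlp_adapter, map_location="cpu")

    def get_w(weights, keyword):
        # strip the sub-module prefix, e.g. "mm_projector.0.weight" -> "0.weight"
        return {k.split(keyword + ".")[1]: v for k, v in weights.items() if keyword in k}

    incompatible_keys = self.mm_projector.load_state_dict(get_w(mm_projector_weights, "mm_projector"))
```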

shiweijiezero commented 2 weeks ago

> I added the following line at line 109: `self.mm_projector = build_vision_projector(self.config, vision_cfg=vision_tower.config)`, and this allowed the script to run correctly.

Thanks, that works for me!

`build_vision_projector` is just a plain initializer that returns a PyTorch `nn.Sequential`. Fundamentally, this error occurs with ZeRO-3 rather than ZeRO-2: under ZeRO-3, parameters are partitioned across ranks, and the newly created `Sequential` is not correctly initialized with materialized weights, so its parameters show shape [0].
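A minimal, self-contained illustration of that ZeRO-3 behaviour. This is a sketch, not LLaVA code: it assumes a working DeepSpeed distributed environment, the projector shapes mirror the 0.5B checkpoint from the traceback, and the random `state` dict stands in for the real `mm_projector.bin`:

```python
import torch
import torch.nn as nn
import deepspeed

# Modules built under ZeRO-3's Init context have their parameters
# partitioned across ranks immediately; each rank then sees shape [0],
# exactly like the projector in the traceback above.
with deepspeed.zero.Init():
    projector = nn.Sequential(nn.Linear(1152, 896), nn.GELU(), nn.Linear(896, 896))

print(projector[0].weight.shape)  # torch.Size([0]) under ZeRO-3

# Calling load_state_dict on the partitioned module raises the size
# mismatch. One standard remedy (besides rebuilding the module outside
# the Init context, as above) is to gather the parameters, load, and
# let DeepSpeed re-partition them on exit:
state = {
    "0.weight": torch.randn(896, 1152), "0.bias": torch.randn(896),
    "2.weight": torch.randn(896, 896), "2.bias": torch.randn(896),
}
with deepspeed.zero.GatheredParameters(list(projector.parameters()), modifier_rank=0):
    projector.load_state_dict(state)
```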