luogen1996 / LLaVA-HR

LLaVA-HR: High-Resolution Large Language-Vision Assistant

Error Loading Parameters for Slow Vision Tower During Fine-Tuning on Custom Dataset #15

Open buckerF opened 3 months ago

buckerF commented 3 months ago

Hello, I hope this message finds you well.

I encountered an issue while fine-tuning the model on my custom dataset. Specifically, I removed the pretrain_mm_mlp_adapter parameter from the fine-tuning script. During fine-tuning, loading the pretrained weights of the slow vision tower (the ConvNeXt) fails: every parameter of the freshly built model reports shape torch.Size([0]), so the strict state-dict load raises a size mismatch. The error message is as follows:

```
Traceback (most recent call last):
  File "/abc/LLaVA-HR/llava_hr/train/train_mem.py", line 13, in <module>
    train()
  File "/abc/LLaVA-HR/llava_hr/train/train.py", line 1010, in train
    model = LlavaLlamaForCausalLM.from_pretrained(
  File "/abc/envs/llava-hr/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2700, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/abc/envs/llava-hr/lib/python3.10/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 385, in wrapper
    f(module, *args, **kwargs)
  File "/abc/LLaVA-HR/llava_hr/model/language_model/llava_llama.py", line 46, in __init__
    self.model = LlavaLlamaModel(config)
  File "/abc/envs/llava-hr/lib/python3.10/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 385, in wrapper
    ...
    self.vision_tower = convnext_large_mlp(self.vision_tower_name)
  File "/abc/LLaVA-HR/llava_hr/model/multimodal_encoder/convnext.py", line 1006, in convnext_large_mlp
    model = _create_convnext('convnext_large_mlp', pretrained=pretrained, **dict(model_args, **kwargs))
  File "/abc/LLaVA-HR/llava_hr/model/multimodal_encoder/convnext.py", line 494, in _create_convnext
    model = build_model_with_cfg(
  File "/abc/envs/llava-hr/lib/python3.10/site-packages/timm/models/_builder.py", line 397, in build_model_with_cfg
    load_pretrained(
  File "/abc/envs/llava-hr/lib/python3.10/site-packages/timm/models/_builder.py", line 237, in load_pretrained
    model.load_state_dict(state_dict, strict=strict)
  File "/abc/envs/llava-hr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ConvNeXt:
    size mismatch for stem.0.weight: copying a param with shape torch.Size([192, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stem.0.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stem.1.weight: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stem.1.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stages.0.blocks.0.weight: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stages.0.blocks.0.conv_dw.weight: copying a param with shape torch.Size([192, 1, 7, 7]) from checkpoint, the shape in current model is torch.Size([0]).
```
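If I am reading the trace correctly, the torch.Size([0]) shapes are what ZeRO-3 parameter partitioning produces: from_pretrained builds the model inside DeepSpeed's zero.Init context (hence the partition_parameters.py frames), the ConvNeXt is constructed there with empty, partitioned parameters, and timm's load_pretrained then runs a strict load_state_dict against them and fails. Below is a minimal sketch of the interaction as I understand it; the zero3 config path and the plain convnext_large model name are placeholders, not taken from this repo:

```python
# Hypothetical minimal repro of the suspected ZeRO-3 / timm interaction.
import deepspeed
import timm

# Under zero.Init, parameters are partitioned as modules are constructed,
# so every tensor created inside the context reports shape torch.Size([0]).
with deepspeed.zero.Init(config_dict_or_path="scripts/zero3.json"):  # placeholder path
    model = timm.create_model("convnext_large", pretrained=False)

print(model.stem[0].weight.shape)  # torch.Size([0]) while partitioned

# timm's load_pretrained is effectively a strict load_state_dict, so it must
# fail against the partitioned model, exactly like the traceback above:
full = timm.create_model("convnext_large", pretrained=False)  # stand-in checkpoint
model.load_state_dict(full.state_dict(), strict=True)  # RuntimeError: size mismatch ... torch.Size([0])
```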
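In case it helps, the workaround I am considering on my side is to gather the partitioned parameters before the strict load and let ZeRO re-partition them afterwards, via DeepSpeed's GatheredParameters context. This is only a sketch of the idea, not code from this repo; load_convnext_checkpoint_zero3 is a hypothetical helper name:

```python
import deepspeed
import torch.distributed as dist

def load_convnext_checkpoint_zero3(model, state_dict):
    """Hypothetical helper: strict-load a checkpoint into a ZeRO-3-partitioned model.

    GatheredParameters temporarily materializes the full parameters; with
    modifier_rank=0, the copy performed on rank 0 is broadcast to the other
    ranks when the context exits and the parameters are re-partitioned.
    """
    with deepspeed.zero.GatheredParameters(list(model.parameters()), modifier_rank=0):
        if not dist.is_initialized() or dist.get_rank() == 0:
            model.load_state_dict(state_dict, strict=True)
```

Would patching something like this around the ConvNeXt weight loading (or deferring the pretrained load until after zero.Init exits) be a reasonable approach, or is there a supported option I am missing?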

Could you please provide guidance on how to resolve this issue? Thank you sincerely for your time and consideration.

Warm regards,