CircleRadon / Osprey

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
Apache License 2.0

Support for Zero3 or Zero3 Offload? Error when loading model state_dict #38

Open Z-MU-Z opened 1 month ago

Z-MU-Z commented 1 month ago

Hello,

I encountered an error while loading a model with the following code in clip_encoder.py:

self.vision_tower.load_state_dict(torch.load(self.clip_model), strict=False)

The error message is as follows:

[rank0]: raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
[rank0]: RuntimeError: Error(s) in loading state_dict for CLIP:
[rank0]: size mismatch for visual.trunk.stem.0.weight: copying a param with shape torch.Size([192, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([0])

This happens only when I use scripts/zero3_offload.json or scripts/zero3.json.
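
For context, a shape of torch.Size([0]) in the current model is consistent with how ZeRO stage 3 works: parameters are partitioned across ranks, and each rank holds only an empty placeholder until the full tensor is gathered, so load_state_dict sees a size mismatch. Below is a minimal sketch of one possible workaround, assuming DeepSpeed's deepspeed.zero.GatheredParameters context manager. The helper name load_state_dict_zero3_aware is hypothetical, it is not part of Osprey, and it has not been tested against Osprey's training scripts.

```python
import torch
from torch import nn


def load_state_dict_zero3_aware(module: nn.Module, state_dict: dict, strict: bool = False):
    """Load `state_dict` into `module`, gathering ZeRO-3 partitioned params first.

    Hypothetical helper for illustration only; not part of Osprey or DeepSpeed.
    """
    # DeepSpeed tags parameters it has partitioned under ZeRO-3 with `ds_id`.
    partitioned = [p for p in module.parameters() if hasattr(p, "ds_id")]
    if not partitioned:
        # Not running under ZeRO-3: plain PyTorch loading works as usual.
        return module.load_state_dict(state_dict, strict=strict)

    import deepspeed  # imported lazily; only needed on the ZeRO-3 path

    # Temporarily reassemble the full parameters on every rank. On exit,
    # rank 0's values are broadcast and the parameters are partitioned again.
    with deepspeed.zero.GatheredParameters(partitioned, modifier_rank=0):
        return module.load_state_dict(state_dict, strict=strict)


# Usage without DeepSpeed: behaves like a normal load_state_dict call.
# The layer shape mirrors the visual.trunk.stem.0.weight tensor in the error.
stem = nn.Conv2d(3, 192, kernel_size=4, stride=4)
result = load_state_dict_zero3_aware(stem, {"weight": torch.zeros(192, 3, 4, 4)})
print(result.missing_keys)
```

Gathering every parameter of the module at once keeps the sketch short, but it materializes the full module on each rank for the duration of the load, which may be too much memory for large towers.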

LiWentomng commented 1 month ago

@Z-MU-Z Hello, our code does not currently support Zero3 for model training, and we have run into some unresolved issues with it ourselves. For now, I recommend using Zero2. Contributions from the community are welcome.
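
Until Zero3 is supported, a launcher could refuse a stage 3 config up front instead of failing later inside load_state_dict. The following is only a sketch under assumptions: it reads the standard zero_optimization.stage field of a DeepSpeed JSON config, and the function names zero_stage and check_supported are hypothetical and not part of Osprey.

```python
import json
from pathlib import Path


def zero_stage(config_path: str) -> int:
    """Return the ZeRO stage declared in a DeepSpeed JSON config (0 if absent)."""
    config = json.loads(Path(config_path).read_text())
    return int(config.get("zero_optimization", {}).get("stage", 0))


def check_supported(config_path: str) -> int:
    """Raise early if the config requests ZeRO stage 3; otherwise return the stage."""
    stage = zero_stage(config_path)
    if stage >= 3:
        raise ValueError(
            f"{config_path} requests ZeRO stage {stage}, which is not supported "
            "for training yet; use a stage 2 config instead."
        )
    return stage
```

Calling check_supported on the DeepSpeed config path before model construction would turn the size mismatch above into an immediate, explicit error.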