luogen1996 / LLaVA-HR

LLaVA-HR: High-Resolution Large Language-Vision Assistant
Apache License 2.0

Can we finetune LLaVA-HR with the scripts provided by the original LLaVA 1.5? #5

Open zengxingchen opened 6 months ago

zengxingchen commented 6 months ago

Great work! I wonder whether we can finetune your model with the scripts provided by the original LLaVA-1.5? Or is there any guidance for conducting fine-tuning based on LLaVA-HR? Thanks!

luogen1996 commented 6 months ago

You may need to check our scripts. Make sure that your scripts include newly added hyper-parameters like input_image_size.

However, I've never tried it, so I'm still not sure if it works.
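
For readers trying this, here is a minimal sketch of what the extra hyper-parameter might look like on the training-argument side, so that an off-the-shelf LLaVA-1.5 finetune launch can pass it through. Which dataclass it actually lives in and its default value are assumptions for illustration, not taken from the LLaVA-HR code base.

from dataclasses import dataclass, field

@dataclass
class ModelArguments:
    # Assumed field; LLaVA-HR may define it elsewhere or with a different default.
    input_image_size: int = field(
        default=1024,  # assumed value, check the LLaVA-HR scripts for the real one
        metadata={"help": "Input resolution for the high-resolution vision branch."},
    )

A LLaVA-1.5 launch script that never passes --input_image_size would silently fall back to whatever default the code defines, which may not match the released LLaVA-HR checkpoints.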

zengxingchen commented 5 months ago

> You may need to check our scripts. Make sure that your scripts include newly added hyper-parameters like input_image_size.
>
> However, I've never tried it, so I'm still not sure if it works.

Here are the reported errors:

    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ConvNeXt:
    size mismatch for stem.0.weight: copying a param with shape torch.Size([384, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stem.0.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stem.1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stem.1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([0]).
    size mismatch for stages.0.blocks.0.weight: cop
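
For context, the torch.Size([0]) shapes on the model side usually mean the ConvNeXt parameters were still empty placeholders at the moment the pretrained weights were copied in; one common way this happens is building the model under DeepSpeed ZeRO-3's zero.Init, which partitions parameter storage (that cause is an assumption here, not confirmed from the logs). A minimal sketch that reproduces the same error by emptying the storage by hand:

import torch
import torch.nn as nn

# A layer shaped like the ConvNeXt stem.0 from the error message.
stem = nn.Conv2d(3, 384, kernel_size=4, stride=4)

# Simulate parameters whose storage has been emptied (shape torch.Size([0])).
stem.weight.data = torch.empty(0)
stem.bias.data = torch.empty(0)

checkpoint = {"weight": torch.randn(384, 3, 4, 4), "bias": torch.randn(384)}
# Raises: RuntimeError: Error(s) in loading state_dict for Conv2d:
#   size mismatch for weight: ... the shape in current model is torch.Size([0]).
stem.load_state_dict(checkpoint)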

zengxingchen commented 5 months ago
def _create_convnext(variant, pretrained=False, **kwargs):
    if kwargs.get('pretrained_cfg', '') == 'fcmae':
        # NOTE fcmae pretrained weights have no classifier or final norm-layer (`head.norm`)
        # This is workaround loading with num_classes=0 w/o removing norm-layer.
        kwargs.setdefault('pretrained_strict', False)

    model = build_model_with_cfg(
        ConvNeXt, variant, pretrained,
        pretrained_filter_fn=checkpoint_filter_fn,
        feature_cfg=dict(out_indices=(0, 1, 2, 3), flatten_sequential=True),
        **kwargs)
    return model

Maybe the issue arises from this function.
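
The function above looks like stock timm code, so the mismatch is more likely about the state the model is in when build_model_with_cfg copies the pretrained weights than about this code path itself. A hedged workaround sketch, not taken from the repo: create the vision tower without pretrained weights, confirm its parameters actually have storage, and only then load the checkpoint. The model name and checkpoint path below are placeholders.

import timm
import torch

# Placeholder architecture name; use whichever ConvNeXt variant LLaVA-HR expects.
vision_tower = timm.create_model("convnext_large_mlp", pretrained=False, num_classes=0)

# If this fails, the parameters are still empty and loading will break again.
assert all(p.numel() > 0 for p in vision_tower.parameters()), "parameters are still empty"

state_dict = torch.load("convnext_checkpoint.pth", map_location="cpu")  # placeholder path
missing, unexpected = vision_tower.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)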