VCOD Error(s) in loading state_dict for videoPvtV2B5_ZoomNeXt

RandomStar14 commented 1 month ago

Hello, after pretraining with the command you provided and saving the pretrained weights, I tried the VCOD fine-tuning command you provided:

python main_for_video.py --config configs/vcod_finetune.py --info finetune --model-name videoPvtV2B5_ZoomNeXt --load-from PvtV2B5_ZoomNeXt_BS4_LR0.0001_E150_H384_W384_OPMadam_OPGMfinetune_SCstep_AMP_INFOpretrain/exp_0/pth/state_final.pth

When io.load_weight(model=model, load_path=cfg.load_from, strict=True) runs to load the pretrained weights, the following error occurs:

RuntimeError: Error(s) in loading state_dict for videoPvtV2B5_ZoomNeXt:
        size mismatch for hmu_5.fuse.0.temperal_proj_kv.weight: copying a param with shape torch.Size([2, 1]) from checkpoint, the shape in current model is torch.Size([10, 5]).
        size mismatch for hmu_5.fuse.0.temperal_proj.0.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_5.fuse.0.temperal_proj.2.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        ......

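Is my way of loading the weights wrong, or should the call be changed to io.load_weight(model=model, load_path=cfg.load_from, strict=False, skip_unmatched_shape=True) for non-strict weight loading? (With that change, training runs successfully.)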

Thank you for your excellent work!

lartpang commented 1 month ago

@RandomStar14

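Here is what happened: the code here was not updated properly. Pretraining actually uses single frames, but for video the model is adjusted to parameter shapes that correspond to multiple frames, so the parameters are somewhat incompatible.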
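
For readers hitting the same error, the idea behind shape-tolerant loading can be sketched roughly as below. This is only an illustration and not the repository's io.load_weight implementation: the helper split_by_shape, the stand-in fake objects, and the parameter name example.shared.weight are hypothetical, while the two hmu_5 names and their shapes are copied from the error log above. The sketch assumes the checkpoint is a flat mapping from parameter names to tensors.

```python
from types import SimpleNamespace


def split_by_shape(checkpoint_state, model_state):
    """Split checkpoint entries into loadable and skipped ones.

    Both arguments map parameter names to objects with a ``.shape``
    attribute, e.g. a state dict read with ``torch.load(path)`` and the
    result of ``model.state_dict()``.
    """
    compatible, skipped = {}, {}
    for name, value in checkpoint_state.items():
        if name not in model_state:
            skipped[name] = "not in model"
            continue
        src, dst = tuple(value.shape), tuple(model_state[name].shape)
        if src != dst:
            skipped[name] = f"{src} vs {dst}"
        else:
            compatible[name] = value
    return compatible, skipped


def fake(*shape):
    """Hypothetical stand-in for a tensor so the sketch runs without PyTorch."""
    return SimpleNamespace(shape=shape)


# Single-frame checkpoint vs. multi-frame video model (shapes from the log).
checkpoint = {
    "hmu_5.fuse.0.temperal_proj_kv.weight": fake(2, 1),
    "hmu_5.fuse.0.temperal_proj.0.weight": fake(1, 1, 3, 3),
    "example.shared.weight": fake(64, 64, 3, 3),  # hypothetical name
}
model = {
    "hmu_5.fuse.0.temperal_proj_kv.weight": fake(10, 5),
    "hmu_5.fuse.0.temperal_proj.0.weight": fake(5, 5, 3, 3),
    "example.shared.weight": fake(64, 64, 3, 3),  # hypothetical name
}

compatible, skipped = split_by_shape(checkpoint, model)
for name, reason in skipped.items():
    print(f"skipped {name}: {reason}")

# With real PyTorch objects, the filtered dict would then be loaded with:
#   model.load_state_dict(compatible, strict=False)
```

Any parameter skipped this way is not restored from the checkpoint, so in a setup like this one the frame-dependent layers would start fine-tuning from their freshly initialized values.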

RandomStar14 commented 1 month ago

OK, thank you.

lartpang / ZoomNeXt

VCOD Error(s) in loading state_dict for videoPvtV2B5_ZoomNeXt #6