lartpang / ZoomNeXt

ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection (TPAMI 2024)
https://arxiv.org/abs/2310.20208

VCOD Error(s) in loading state_dict for videoPvtV2B5_ZoomNeXt #6

Closed RandomStar14 closed 1 month ago

RandomStar14 commented 1 month ago

Hello. After pretraining with the command you provided and saving the pretrained weights, I tried the fine-tuning command you provide for VCOD:

python main_for_video.py --config configs/vcod_finetune.py --info finetune --model-name videoPvtV2B5_ZoomNeXt --load-from PvtV2B5_ZoomNeXt_BS4_LR0.0001_E150_H384_W384_OPMadam_OPGMfinetune_SCstep_AMP_INFOpretrain/exp_0/pth/state_final.pth

When the pretrained weights are loaded with io.load_weight(model=model, load_path=cfg.load_from, strict=True), the following error occurs:

RuntimeError: Error(s) in loading state_dict for videoPvtV2B5_ZoomNeXt:
        size mismatch for hmu_5.fuse.0.temperal_proj_kv.weight: copying a param with shape torch.Size([2, 1]) from checkpoint, the shape in current model is torch.Size([10, 5]).
        size mismatch for hmu_5.fuse.0.temperal_proj.0.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        size mismatch for hmu_5.fuse.0.temperal_proj.2.weight: copying a param with shape torch.Size([1, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([5, 5, 3, 3]).
        ......

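Is my way of loading the weights wrong, or should the call be changed to io.load_weight(model=model, load_path=cfg.load_from, strict=False, skip_unmatched_shape=True) so that the weights are loaded non-strictly? (With that change, training runs successfully.)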
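For reference, a non-strict load that skips mismatched shapes usually amounts to filtering the checkpoint before handing it to the model. The sketch below only illustrates that idea and is not the repository's io.load_weight. The function split_by_shape and the key some.shared.weight are hypothetical, and SimpleNamespace objects stand in for tensors because only the .shape attribute is read.

```python
from types import SimpleNamespace


def split_by_shape(model_state, ckpt_state):
    """Split checkpoint entries into loadable ones and skipped names.

    An entry is loadable only if the model has a parameter with the same
    name and the same shape. Everything else is reported as skipped.
    """
    loadable, skipped = {}, []
    for name, value in ckpt_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            loadable[name] = value
        else:
            skipped.append(name)
    return loadable, skipped


# Stand-ins for tensors: shapes for the first key are taken from the error log.
model_state = {
    "hmu_5.fuse.0.temperal_proj_kv.weight": SimpleNamespace(shape=(10, 5)),
    "some.shared.weight": SimpleNamespace(shape=(64, 64)),  # hypothetical key
}
ckpt_state = {
    "hmu_5.fuse.0.temperal_proj_kv.weight": SimpleNamespace(shape=(2, 1)),
    "some.shared.weight": SimpleNamespace(shape=(64, 64)),  # hypothetical key
}

loadable, skipped = split_by_shape(model_state, ckpt_state)
print("loaded :", sorted(loadable))
print("skipped:", skipped)
```

With real PyTorch objects, model_state would be model.state_dict(), ckpt_state would be the dictionary read from the checkpoint file, and the filtered result could be passed to model.load_state_dict(loadable, strict=False). The skipped parameters then keep their fresh initialization, so it is worth printing the skipped list to confirm that only the expected layers are affected.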

Thank you for your excellent work!

lartpang commented 1 month ago

@RandomStar14

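Here is what is going on: this part of the code was not updated properly. Pretraining actually uses a single frame, but for video the model is adjusted to parameter shapes that correspond to multiple frames, so the parameters are somewhat incompatible.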
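The shapes in the error log are consistent with this explanation if the temporal layers scale with the number of frames T. The sketch below assumes, purely from the logged shapes, that temperal_proj_kv.weight has shape (2T, T) and the temperal_proj convolution weights have shape (T, T, 3, 3). The helper temporal_shapes is hypothetical and is not taken from the repository.

```python
def temporal_shapes(num_frames):
    """Return weight shapes implied by the assumed (2T, T) / (T, T, 3, 3) pattern."""
    t = num_frames
    return {
        "temperal_proj_kv.weight": (2 * t, t),
        "temperal_proj.0.weight": (t, t, 3, 3),
        "temperal_proj.2.weight": (t, t, 3, 3),
    }


# T = 1 reproduces the checkpoint shapes, T = 5 the shapes of the video model.
pretrain = temporal_shapes(1)
video = temporal_shapes(5)
for name in pretrain:
    print(name, pretrain[name], "->", video[name])
```

Under this assumption the single-frame checkpoint corresponds to T = 1 and the video model to T = 5, which is why only the temporal layers report a size mismatch.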

RandomStar14 commented 1 month ago

OK, thank you.