WZMIAOMIAO / deep-learning-for-image-processing

deep learning for image processing including classification and object-detection etc.
GNU General Public License v3.0

Validation fails after training Faster RCNN with MobileNetv2 as the backbone #823

Open Zhangjq7585 opened 2 months ago

Zhangjq7585 commented 2 months ago

Hello, after training Faster RCNN with MobileNetv2 as the backbone, loading the trained mobile-model-24.pth during validation raises an error:

Traceback (most recent call last):
  File "validation.py", line 215, in <module>
    main(args)
  File "validation.py", line 140, in main
    model.load_state_dict(weights_dict,strict=False)
  File "/root/miniconda3/envs/pytorch1.12/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for FasterRCNN:
    size mismatch for rpn.head.conv.weight: copying a param with shape torch.Size([1280, 1280, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for rpn.head.conv.bias: copying a param with shape torch.Size([1280]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for rpn.head.cls_logits.weight: copying a param with shape torch.Size([15, 1280, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
    size mismatch for rpn.head.cls_logits.bias: copying a param with shape torch.Size([15]) from checkpoint, the shape in current model is torch.Size([3]).
    size mismatch for rpn.head.bbox_pred.weight: copying a param with shape torch.Size([60, 1280, 1, 1]) from checkpoint, the shape in current model is torch.Size([12, 256, 1, 1]).
    size mismatch for rpn.head.bbox_pred.bias: copying a param with shape torch.Size([60]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for roi_heads.box_head.fc6.weight: copying a param with shape torch.Size([1024, 62720]) from checkpoint, the shape in current model is torch.Size([1024, 12544]).

However, when I train Faster RCNN with res50fpn as the backbone, loading the trained res50fpn.pth during validation works without any error.
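As a reading aid (not part of the original report), the shapes in the error message above can be decoded with a little arithmetic. The sketch below only uses numbers copied from that message; the dictionary keys in the summary (`backbone_out_channels`, `anchors_per_location`, and so on) are names chosen here for illustration, and the interpretation assumes the usual Faster RCNN layout in which `cls_logits` has one output per anchor and `bbox_pred` has four.

```python
# Shapes copied from the error message: (checkpoint shape, current model shape).
MISMATCHES = {
    "rpn.head.conv.weight": ((1280, 1280, 3, 3), (256, 256, 3, 3)),
    "rpn.head.cls_logits.weight": ((15, 1280, 1, 1), (3, 256, 1, 1)),
    "rpn.head.bbox_pred.weight": ((60, 1280, 1, 1), (12, 256, 1, 1)),
    "roi_heads.box_head.fc6.weight": ((1024, 62720), (1024, 12544)),
}


def summarize(which):
    """Decode one side of the mismatch: 0 = checkpoint, 1 = model built in validation.py."""
    channels = MISMATCHES["rpn.head.conv.weight"][which][1]
    anchors = MISMATCHES["rpn.head.cls_logits.weight"][which][0]
    bbox_out = MISMATCHES["rpn.head.bbox_pred.weight"][which][0]
    fc6_in = MISMATCHES["roi_heads.box_head.fc6.weight"][which][1]
    return {
        "backbone_out_channels": channels,
        "anchors_per_location": anchors,
        "bbox_outputs_per_anchor": bbox_out // anchors,
        # fc6 input = channels * pooled height * pooled width
        "pooled_positions": fc6_in // channels,
    }


checkpoint = summarize(0)
current = summarize(1)
print("checkpoint:", checkpoint)
print("current   :", current)
```

Under these assumptions the checkpoint looks like a single-feature-map model with 1280 output channels and 15 anchors per location, while the model built for validation has 256 channels and 3 anchors per location, which is what an FPN-style backbone would typically give. Both sides pool to 49 positions (7 x 7). Note that `strict=False` only tolerates missing or unexpected keys; tensors whose shapes differ still raise this error, so the model created in validation.py would have to be built the same way as during training.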


Zhangjq7585 commented 1 month ago

Now a new problem has come up, as follows:

Using cuda device training.
Using 0 dataloader workers
Traceback (most recent call last):
  File "validation.py", line 218, in <module>
    main(args)
  File "validation.py", line 131, in main
    model = FasterRCNN(backbone=backbone, num_classes=parser_data.num_classes + 1)
  File "/tmp/pytorch_multiFHsig_FasterRCNN/network_files/faster_rcnn_framework.py", line 267, in __init__
    raise ValueError(
ValueError: backbone should contain an attribute out_channelsspecifying the number of output channels  (assumed to be thesame for all the levels
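
The message above says the backbone object passed to `FasterRCNN` has no `out_channels` attribute. A minimal sketch of that situation follows, using plain Python so it runs without PyTorch. `StandInBackbone` and `require_out_channels` are hypothetical stand-ins written for this note, and the check is assumed to be a simple `hasattr` test, as in torchvision's `FasterRCNN`; the value 1280 is the channel count suggested by the checkpoint shapes in the first traceback.

```python
class StandInBackbone:
    """Hypothetical stand-in for a feature extractor (e.g. a MobileNetV2 feature stack)."""

    def __call__(self, x):
        return x


def require_out_channels(backbone):
    # Assumed to mirror the check behind the ValueError: the attribute must exist.
    if not hasattr(backbone, "out_channels"):
        raise ValueError("backbone should contain an attribute out_channels")
    return backbone.out_channels


backbone = StandInBackbone()

# Without the attribute, the check fails the same way as in the traceback.
try:
    require_out_channels(backbone)
    failed_before = False
except ValueError:
    failed_before = True

# Setting the attribute on the backbone object before building the detector
# satisfies the check. 1280 is an assumption taken from the checkpoint shapes.
backbone.out_channels = 1280
channels_after = require_out_channels(backbone)

print(failed_before, channels_after)
```

In the real script the same idea would apply to whatever object is passed as `backbone=` at line 131 of validation.py: it needs an `out_channels` attribute set before `FasterRCNN(...)` is called.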