facebookresearch / maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.
MIT License

size mismatch #360

Open CF2220160244 opened 5 years ago

CF2220160244 commented 5 years ago

I ran: python tools/train_net.py --config-file "/home/chenfei/github/maskrcnn benchmark/configs/e2e_mask_rcnn_R_50_FPN_1x.yaml"

Can you help me with this error?

... 2019-01-21 14:21:04,199 maskrcnn_benchmark.utils.checkpoint INFO: Loading checkpoint from ./model_0005000.pth ...

Traceback (most recent call last):
  File "tools/train_net.py", line 171, in <module>
    main()
  File "tools/train_net.py", line 164, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 53, in train
    extra_checkpoint_data = checkpointer.load(cfg.MODEL.WEIGHT)
  File "/home/chenfei/github/maskrcnn-benchmark/maskrcnn_benchmark/utils/checkpoint.py", line 62, in load
    self._load_model(checkpoint)
  File "/home/chenfei/github/maskrcnn-benchmark/maskrcnn_benchmark/utils/checkpoint.py", line 97, in _load_model
    load_state_dict(self.model, checkpoint.pop("model"))
  File "/home/chenfei/github/maskrcnn-benchmark/maskrcnn_benchmark/utils/model_serialization.py", line 80, in load_state_dict
    model.load_state_dict(model_state_dict)
  File "/home/chenfei/anaconda3/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 759, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GeneralizedRCNN:
    size mismatch for rpn.anchor_generator.cell_anchors.0: copying a param with shape torch.Size([15, 4]) from checkpoint, the shape in current model is torch.Size([3, 4]).
    size mismatch for rpn.head.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for rpn.head.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for rpn.head.cls_logits.weight: copying a param with shape torch.Size([15, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
    size mismatch for rpn.head.cls_logits.bias: copying a param with shape torch.Size([15]) from checkpoint, the shape in current model is torch.Size([3]).
    size mismatch for rpn.head.bbox_pred.weight: copying a param with shape torch.Size([60, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([12, 256, 1, 1]).
    size mismatch for rpn.head.bbox_pred.bias: copying a param with shape torch.Size([60]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for roi_heads.box.predictor.cls_score.weight: copying a param with shape torch.Size([81, 2048]) from checkpoint, the shape in current model is torch.Size([81, 1024]).
    size mismatch for roi_heads.box.predictor.bbox_pred.weight: copying a param with shape torch.Size([324, 2048]) from checkpoint, the shape in current model is torch.Size([324, 1024]).

fmassa commented 5 years ago

Looks like you are loading a checkpoint from a different model config?
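One way to check this is to compare the parameter shapes stored in the checkpoint against those of the freshly built model. The following is only a minimal sketch: the helper name `find_shape_mismatches` is hypothetical, and the example shapes are copied from the error message above. With real tensors, the two dicts could be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for every parameter
    that exists in both mappings but with a different shape."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Skip keys missing from the model; only report true shape conflicts.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# A few shapes taken from the error message in this issue.
checkpoint_shapes = {
    "rpn.head.conv.weight": (1024, 1024, 3, 3),
    "rpn.head.cls_logits.bias": (15,),
    "roi_heads.box.predictor.cls_score.weight": (81, 2048),
}
model_shapes = {
    "rpn.head.conv.weight": (256, 256, 3, 3),
    "rpn.head.cls_logits.bias": (3,),
    "roi_heads.box.predictor.cls_score.weight": (81, 1024),
}

for name, (ckpt, current) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs current model {current}")
```

If nearly every RPN and box-head parameter differs, as it does here, the checkpoint was most likely saved from a different architecture than the one the config file describes, rather than from the same model with a different number of classes.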

fmassa commented 5 years ago

This might happen if your OUTPUT_DIR is exactly the same as the one from a previous run, and there is a last_checkpoint file.
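To see whether this is the case, one can look for a `last_checkpoint` file in OUTPUT_DIR before starting training. The sketch below assumes that the file is a small text file containing the path of the most recently saved checkpoint, and that the checkpointer prefers that path over `MODEL.WEIGHT` when the file exists; both points are assumptions worth verifying against `maskrcnn_benchmark/utils/checkpoint.py`. The helper name `read_last_checkpoint` is hypothetical.

```python
from pathlib import Path


def read_last_checkpoint(output_dir):
    """Return the path recorded in OUTPUT_DIR/last_checkpoint,
    or None if no such file exists."""
    marker = Path(output_dir) / "last_checkpoint"
    if not marker.is_file():
        return None
    # Assumption: the file holds a single checkpoint path as plain text.
    return marker.read_text().strip()


# Example: inspect the current directory, which is where the log above
# loaded ./model_0005000.pth from.
resumed_from = read_last_checkpoint(".")
if resumed_from is None:
    print("No last_checkpoint file found; training would start from MODEL.WEIGHT.")
else:
    print(f"A previous run left a checkpoint pointer: {resumed_from}")
```

If a stale pointer shows up, pointing OUTPUT_DIR at a fresh directory, or removing the old `last_checkpoint` file and its checkpoints, should stop the old weights from being loaded into the new model.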

msn321 commented 4 years ago

@CF2220160244 Hello, how did you solve this problem?

chenpeng68 commented 3 years ago

@fmassa
> This might happen if your OUTPUT_DIR is exactly the same as the one from a previous run, and there is a last_checkpoint file.

I don't understand. Can you please clarify?