open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

RuntimeError: CUDA error: unknown error #461

Closed coolKeen closed 3 years ago

coolKeen commented 5 years ago

Hi, everyone. When I run the command "CUDA_VISIBLE_DEVICES=2,3 python tools/train.py configs/dcn/cascade_rcnn_dconv_c3-c5_r50_fpn_1x.py --gpus 2", I get the following error:

(openmmlab) wangqing@ius:~/heheda/openmmlab/mmdetection$ CUDA_VISIBLE_DEVICES=2,3 python tools/train.py configs/dcn/cascade_rcnn_dconv_c3-c5_r50_fpn_1x.py --gpus 2
2019-04-03 11:49:37,256 - INFO - Distributed training: False
2019-04-03 11:49:37,748 - INFO - load model from: /home/wangqing/heheda/openmmlab/mmdetection/pretrained_models/cascade_rcnn_dconv_c3-c5_r50_fpn_1x_20190125-dfa53166.pth
2019-04-03 11:49:37,881 - WARNING - unexpected key in source state_dict: backbone.conv1.weight, backbone.bn1.weight, backbone.bn1.bias, backbone.bn1.running_mean, backbone.bn1.running_var, backbone.bn1.num_batches_tracked, backbone.layer1.0.conv1.weight, backbone.layer1.0.bn1.weight, backbone.layer1.0.bn1.bias, backbone.layer1.0.bn1.running_mean, backbone.layer1.0.bn1.running_var, backbone.layer1.0.bn1.num_batches_tracked, backbone.layer1.0.conv2.weight, backbone.layer1.0.bn2.weight, backbone.layer1.0.bn2.bias, backbone.layer1.0.bn2.running_mean, backbone.layer1.0.bn2.running_var, backbone.layer1.0.bn2.num_batches_tracked, backbone.layer1.0.conv3.weight, backbone.layer1.0.bn3.weight, backbone.layer1.0.bn3.bias, backbone.layer1.0.bn3.running_mean, backbone.layer1.0.bn3.running_var, backbone.layer1.0.bn3.num_batches_tracked, backbone.layer1.0.downsample.0.weight, backbone.layer1.0.downsample.1.weight, backbone.layer1.0.downsample.1.bias, backbone.layer1.0.downsample.1.running_mean, backbone.layer1.0.downsample.1.running_var, backbone.layer1.0.downsample.1.num_batches_tracked, backbone.layer1.1.conv1.weight, backbone.layer1.1.bn1.weight, backbone.layer1.1.bn1.bias, backbone.layer1.1.bn1.running_mean, backbone.layer1.1.bn1.running_var, backbone.layer1.1.bn1.num_batches_tracked, backbone.layer1.1.conv2.weight, backbone.layer1.1.bn2.weight, backbone.layer1.1.bn2.bias, backbone.layer1.1.bn2.running_mean, backbone.layer1.1.bn2.running_var, backbone.layer1.1.bn2.num_batches_tracked, backbone.layer1.1.conv3.weight, backbone.layer1.1.bn3.weight, backbone.layer1.1.bn3.bias, 
backbone.layer1.1.bn3.running_mean, backbone.layer1.1.bn3.running_var, backbone.layer1.1.bn3.num_batches_tracked, backbone.layer1.2.conv1.weight, backbone.layer1.2.bn1.weight, backbone.layer1.2.bn1.bias, backbone.layer1.2.bn1.running_mean, backbone.layer1.2.bn1.running_var, backbone.layer1.2.bn1.num_batches_tracked, backbone.layer1.2.conv2.weight, backbone.layer1.2.bn2.weight, backbone.layer1.2.bn2.bias, backbone.layer1.2.bn2.running_mean, backbone.layer1.2.bn2.running_var, backbone.layer1.2.bn2.num_batches_tracked, backbone.layer1.2.conv3.weight, backbone.layer1.2.bn3.weight, backbone.layer1.2.bn3.bias, backbone.layer1.2.bn3.running_mean, backbone.layer1.2.bn3.running_var, backbone.layer1.2.bn3.num_batches_tracked, backbone.layer2.0.conv1.weight, backbone.layer2.0.bn1.weight, backbone.layer2.0.bn1.bias, backbone.layer2.0.bn1.running_mean, backbone.layer2.0.bn1.running_var, backbone.layer2.0.bn1.num_batches_tracked, backbone.layer2.0.conv2_offset.weight, backbone.layer2.0.conv2_offset.bias, backbone.layer2.0.conv2.weight, backbone.layer2.0.bn2.weight, backbone.layer2.0.bn2.bias, backbone.layer2.0.bn2.running_mean, backbone.layer2.0.bn2.running_var, backbone.layer2.0.bn2.num_batches_tracked, backbone.layer2.0.conv3.weight, backbone.layer2.0.bn3.weight, backbone.layer2.0.bn3.bias, backbone.layer2.0.bn3.running_mean, backbone.layer2.0.bn3.running_var, backbone.layer2.0.bn3.num_batches_tracked, backbone.layer2.0.downsample.0.weight, backbone.layer2.0.downsample.1.weight, backbone.layer2.0.downsample.1.bias, backbone.layer2.0.downsample.1.running_mean, backbone.layer2.0.downsample.1.running_var, backbone.layer2.0.downsample.1.num_batches_tracked, backbone.layer2.1.conv1.weight, backbone.layer2.1.bn1.weight, backbone.layer2.1.bn1.bias, backbone.layer2.1.bn1.running_mean, backbone.layer2.1.bn1.running_var, backbone.layer2.1.bn1.num_batches_tracked, backbone.layer2.1.conv2_offset.weight, backbone.layer2.1.conv2_offset.bias, backbone.layer2.1.conv2.weight, 
backbone.layer2.1.bn2.weight, backbone.layer2.1.bn2.bias, backbone.layer2.1.bn2.running_mean, backbone.layer2.1.bn2.running_var, backbone.layer2.1.bn2.num_batches_tracked, backbone.layer2.1.conv3.weight, backbone.layer2.1.bn3.weight, backbone.layer2.1.bn3.bias, backbone.layer2.1.bn3.running_mean, backbone.layer2.1.bn3.running_var, backbone.layer2.1.bn3.num_batches_tracked, backbone.layer2.2.conv1.weight, backbone.layer2.2.bn1.weight, backbone.layer2.2.bn1.bias, backbone.layer2.2.bn1.running_mean, backbone.layer2.2.bn1.running_var, backbone.layer2.2.bn1.num_batches_tracked, backbone.layer2.2.conv2_offset.weight, backbone.layer2.2.conv2_offset.bias, backbone.layer2.2.conv2.weight, backbone.layer2.2.bn2.weight, backbone.layer2.2.bn2.bias, backbone.layer2.2.bn2.running_mean, backbone.layer2.2.bn2.running_var, backbone.layer2.2.bn2.num_batches_tracked, backbone.layer2.2.conv3.weight, backbone.layer2.2.bn3.weight, backbone.layer2.2.bn3.bias, backbone.layer2.2.bn3.running_mean, backbone.layer2.2.bn3.running_var, backbone.layer2.2.bn3.num_batches_tracked, backbone.layer2.3.conv1.weight, backbone.layer2.3.bn1.weight, backbone.layer2.3.bn1.bias, backbone.layer2.3.bn1.running_mean, backbone.layer2.3.bn1.running_var, backbone.layer2.3.bn1.num_batches_tracked, backbone.layer2.3.conv2_offset.weight, backbone.layer2.3.conv2_offset.bias, backbone.layer2.3.conv2.weight, backbone.layer2.3.bn2.weight, backbone.layer2.3.bn2.bias, backbone.layer2.3.bn2.running_mean, backbone.layer2.3.bn2.running_var, backbone.layer2.3.bn2.num_batches_tracked, backbone.layer2.3.conv3.weight, backbone.layer2.3.bn3.weight, backbone.layer2.3.bn3.bias, backbone.layer2.3.bn3.running_mean, backbone.layer2.3.bn3.running_var, backbone.layer2.3.bn3.num_batches_tracked, backbone.layer3.0.conv1.weight, backbone.layer3.0.bn1.weight, backbone.layer3.0.bn1.bias, backbone.layer3.0.bn1.running_mean, backbone.layer3.0.bn1.running_var, backbone.layer3.0.bn1.num_batches_tracked, backbone.layer3.0.conv2_offset.weight, 
backbone.layer3.0.conv2_offset.bias, backbone.layer3.0.conv2.weight, backbone.layer3.0.bn2.weight, backbone.layer3.0.bn2.bias, backbone.layer3.0.bn2.running_mean, backbone.layer3.0.bn2.running_var, backbone.layer3.0.bn2.num_batches_tracked, backbone.layer3.0.conv3.weight, backbone.layer3.0.bn3.weight, backbone.layer3.0.bn3.bias, backbone.layer3.0.bn3.running_mean, backbone.layer3.0.bn3.running_var, backbone.layer3.0.bn3.num_batches_tracked, backbone.layer3.0.downsample.0.weight, backbone.layer3.0.downsample.1.weight, backbone.layer3.0.downsample.1.bias, backbone.layer3.0.downsample.1.running_mean, backbone.layer3.0.downsample.1.running_var, backbone.layer3.0.downsample.1.num_batches_tracked, backbone.layer3.1.conv1.weight, backbone.layer3.1.bn1.weight, backbone.layer3.1.bn1.bias, backbone.layer3.1.bn1.running_mean, backbone.layer3.1.bn1.running_var, backbone.layer3.1.bn1.num_batches_tracked, backbone.layer3.1.conv2_offset.weight, backbone.layer3.1.conv2_offset.bias, backbone.layer3.1.conv2.weight, backbone.layer3.1.bn2.weight, backbone.layer3.1.bn2.bias, backbone.layer3.1.bn2.running_mean, backbone.layer3.1.bn2.running_var, backbone.layer3.1.bn2.num_batches_tracked, backbone.layer3.1.conv3.weight, backbone.layer3.1.bn3.weight, backbone.layer3.1.bn3.bias, backbone.layer3.1.bn3.running_mean, backbone.layer3.1.bn3.running_var, backbone.layer3.1.bn3.num_batches_tracked, backbone.layer3.2.conv1.weight, backbone.layer3.2.bn1.weight, backbone.layer3.2.bn1.bias, backbone.layer3.2.bn1.running_mean, backbone.layer3.2.bn1.running_var, backbone.layer3.2.bn1.num_batches_tracked, backbone.layer3.2.conv2_offset.weight, backbone.layer3.2.conv2_offset.bias, backbone.layer3.2.conv2.weight, backbone.layer3.2.bn2.weight, backbone.layer3.2.bn2.bias, backbone.layer3.2.bn2.running_mean, backbone.layer3.2.bn2.running_var, backbone.layer3.2.bn2.num_batches_tracked, backbone.layer3.2.conv3.weight, backbone.layer3.2.bn3.weight, backbone.layer3.2.bn3.bias, backbone.layer3.2.bn3.running_mean, 
backbone.layer3.2.bn3.running_var, backbone.layer3.2.bn3.num_batches_tracked, backbone.layer3.3.conv1.weight, backbone.layer3.3.bn1.weight, backbone.layer3.3.bn1.bias, backbone.layer3.3.bn1.running_mean, backbone.layer3.3.bn1.running_var, backbone.layer3.3.bn1.num_batches_tracked, backbone.layer3.3.conv2_offset.weight, backbone.layer3.3.conv2_offset.bias, backbone.layer3.3.conv2.weight, backbone.layer3.3.bn2.weight, backbone.layer3.3.bn2.bias, backbone.layer3.3.bn2.running_mean, backbone.layer3.3.bn2.running_var, backbone.layer3.3.bn2.num_batches_tracked, backbone.layer3.3.conv3.weight, backbone.layer3.3.bn3.weight, backbone.layer3.3.bn3.bias, backbone.layer3.3.bn3.running_mean, backbone.layer3.3.bn3.running_var, backbone.layer3.3.bn3.num_batches_tracked, backbone.layer3.4.conv1.weight, backbone.layer3.4.bn1.weight, backbone.layer3.4.bn1.bias, backbone.layer3.4.bn1.running_mean, backbone.layer3.4.bn1.running_var, backbone.layer3.4.bn1.num_batches_tracked, backbone.layer3.4.conv2_offset.weight, backbone.layer3.4.conv2_offset.bias, backbone.layer3.4.conv2.weight, backbone.layer3.4.bn2.weight, backbone.layer3.4.bn2.bias, backbone.layer3.4.bn2.running_mean, backbone.layer3.4.bn2.running_var, backbone.layer3.4.bn2.num_batches_tracked, backbone.layer3.4.conv3.weight, backbone.layer3.4.bn3.weight, backbone.layer3.4.bn3.bias, backbone.layer3.4.bn3.running_mean, backbone.layer3.4.bn3.running_var, backbone.layer3.4.bn3.num_batches_tracked, backbone.layer3.5.conv1.weight, backbone.layer3.5.bn1.weight, backbone.layer3.5.bn1.bias, backbone.layer3.5.bn1.running_mean, backbone.layer3.5.bn1.running_var, backbone.layer3.5.bn1.num_batches_tracked, backbone.layer3.5.conv2_offset.weight, backbone.layer3.5.conv2_offset.bias, backbone.layer3.5.conv2.weight, backbone.layer3.5.bn2.weight, backbone.layer3.5.bn2.bias, backbone.layer3.5.bn2.running_mean, backbone.layer3.5.bn2.running_var, backbone.layer3.5.bn2.num_batches_tracked, backbone.layer3.5.conv3.weight, backbone.layer3.5.bn3.weight, 
backbone.layer3.5.bn3.bias, backbone.layer3.5.bn3.running_mean, backbone.layer3.5.bn3.running_var, backbone.layer3.5.bn3.num_batches_tracked, backbone.layer4.0.conv1.weight, backbone.layer4.0.bn1.weight, backbone.layer4.0.bn1.bias, backbone.layer4.0.bn1.running_mean, backbone.layer4.0.bn1.running_var, backbone.layer4.0.bn1.num_batches_tracked, backbone.layer4.0.conv2_offset.weight, backbone.layer4.0.conv2_offset.bias, backbone.layer4.0.conv2.weight, backbone.layer4.0.bn2.weight, backbone.layer4.0.bn2.bias, backbone.layer4.0.bn2.running_mean, backbone.layer4.0.bn2.running_var, backbone.layer4.0.bn2.num_batches_tracked, backbone.layer4.0.conv3.weight, backbone.layer4.0.bn3.weight, backbone.layer4.0.bn3.bias, backbone.layer4.0.bn3.running_mean, backbone.layer4.0.bn3.running_var, backbone.layer4.0.bn3.num_batches_tracked, backbone.layer4.0.downsample.0.weight, backbone.layer4.0.downsample.1.weight, backbone.layer4.0.downsample.1.bias, backbone.layer4.0.downsample.1.running_mean, backbone.layer4.0.downsample.1.running_var, backbone.layer4.0.downsample.1.num_batches_tracked, backbone.layer4.1.conv1.weight, backbone.layer4.1.bn1.weight, backbone.layer4.1.bn1.bias, backbone.layer4.1.bn1.running_mean, backbone.layer4.1.bn1.running_var, backbone.layer4.1.bn1.num_batches_tracked, backbone.layer4.1.conv2_offset.weight, backbone.layer4.1.conv2_offset.bias, backbone.layer4.1.conv2.weight, backbone.layer4.1.bn2.weight, backbone.layer4.1.bn2.bias, backbone.layer4.1.bn2.running_mean, backbone.layer4.1.bn2.running_var, backbone.layer4.1.bn2.num_batches_tracked, backbone.layer4.1.conv3.weight, backbone.layer4.1.bn3.weight, backbone.layer4.1.bn3.bias, backbone.layer4.1.bn3.running_mean, backbone.layer4.1.bn3.running_var, backbone.layer4.1.bn3.num_batches_tracked, backbone.layer4.2.conv1.weight, backbone.layer4.2.bn1.weight, backbone.layer4.2.bn1.bias, backbone.layer4.2.bn1.running_mean, backbone.layer4.2.bn1.running_var, backbone.layer4.2.bn1.num_batches_tracked, 
backbone.layer4.2.conv2_offset.weight, backbone.layer4.2.conv2_offset.bias, backbone.layer4.2.conv2.weight, backbone.layer4.2.bn2.weight, backbone.layer4.2.bn2.bias, backbone.layer4.2.bn2.running_mean, backbone.layer4.2.bn2.running_var, backbone.layer4.2.bn2.num_batches_tracked, backbone.layer4.2.conv3.weight, backbone.layer4.2.bn3.weight, backbone.layer4.2.bn3.bias, backbone.layer4.2.bn3.running_mean, backbone.layer4.2.bn3.running_var, backbone.layer4.2.bn3.num_batches_tracked, neck.lateral_convs.0.conv.weight, neck.lateral_convs.0.conv.bias, neck.lateral_convs.1.conv.weight, neck.lateral_convs.1.conv.bias, neck.lateral_convs.2.conv.weight, neck.lateral_convs.2.conv.bias, neck.lateral_convs.3.conv.weight, neck.lateral_convs.3.conv.bias, neck.fpn_convs.0.conv.weight, neck.fpn_convs.0.conv.bias, neck.fpn_convs.1.conv.weight, neck.fpn_convs.1.conv.bias, neck.fpn_convs.2.conv.weight, neck.fpn_convs.2.conv.bias, neck.fpn_convs.3.conv.weight, neck.fpn_convs.3.conv.bias, rpn_head.rpn_conv.weight, rpn_head.rpn_conv.bias, rpn_head.rpn_cls.weight, rpn_head.rpn_cls.bias, rpn_head.rpn_reg.weight, rpn_head.rpn_reg.bias, bbox_head.0.fc_cls.weight, bbox_head.0.fc_cls.bias, bbox_head.0.fc_reg.weight, bbox_head.0.fc_reg.bias, bbox_head.0.shared_fcs.0.weight, bbox_head.0.shared_fcs.0.bias, bbox_head.0.shared_fcs.1.weight, bbox_head.0.shared_fcs.1.bias, bbox_head.1.fc_cls.weight, bbox_head.1.fc_cls.bias, bbox_head.1.fc_reg.weight, bbox_head.1.fc_reg.bias, bbox_head.1.shared_fcs.0.weight, bbox_head.1.shared_fcs.0.bias, bbox_head.1.shared_fcs.1.weight, bbox_head.1.shared_fcs.1.bias, bbox_head.2.fc_cls.weight, bbox_head.2.fc_cls.bias, bbox_head.2.fc_reg.weight, bbox_head.2.fc_reg.bias, bbox_head.2.shared_fcs.0.weight, bbox_head.2.shared_fcs.0.bias, bbox_head.2.shared_fcs.1.weight, bbox_head.2.shared_fcs.1.bias

missing keys in source state_dict: layer2.2.bn1.running_mean, layer3.0.downsample.0.weight, layer1.2.bn2.running_var, layer4.2.bn1.num_batches_tracked, layer3.0.bn2.weight, layer2.2.conv3.weight, layer3.1.conv3.weight, layer3.4.conv2.weight, layer2.3.bn2.running_mean, layer3.2.conv2_offset.bias, layer4.1.bn3.weight, layer3.3.bn1.running_mean, layer2.3.conv2_offset.weight, layer3.4.bn2.running_mean, layer4.1.bn3.running_var, layer2.1.bn2.num_batches_tracked, layer4.2.bn3.weight, layer1.2.conv3.weight, layer2.2.bn1.weight, layer2.2.bn2.running_var, layer2.3.bn3.weight, layer4.1.bn2.bias, layer2.0.bn1.num_batches_tracked, layer4.0.bn2.weight, layer4.0.bn3.num_batches_tracked, layer3.5.bn1.running_mean, layer4.0.bn1.weight, layer3.0.bn1.bias, layer1.0.bn2.running_mean, layer3.0.conv3.weight, layer1.2.bn2.num_batches_tracked, layer3.3.bn1.bias, layer3.4.bn3.running_mean, layer3.3.bn1.num_batches_tracked, layer2.3.conv2.weight, layer3.4.bn1.running_var, layer3.2.conv1.weight, layer3.5.conv2_offset.weight, layer2.2.bn3.running_var, layer3.5.bn3.running_mean, layer1.2.bn2.running_mean, layer3.5.bn1.bias, layer3.2.bn2.weight, layer2.2.bn1.num_batches_tracked, layer2.0.conv2.weight, layer4.1.conv2.weight, layer1.2.bn2.bias, layer3.2.bn3.num_batches_tracked, layer3.3.bn3.running_var, layer4.2.bn1.running_mean, layer1.2.bn2.weight, bn1.running_mean, layer1.0.bn2.num_batches_tracked, layer4.0.bn2.bias, layer4.2.bn2.num_batches_tracked, layer3.1.bn3.running_var, layer3.1.bn2.running_var, layer3.4.bn3.num_batches_tracked, layer3.2.bn1.running_var, layer2.1.conv3.weight, layer2.0.bn2.num_batches_tracked, layer4.0.bn1.running_mean, layer1.1.bn1.num_batches_tracked, layer3.0.bn1.weight, layer2.3.conv2_offset.bias, layer2.3.bn2.bias, layer3.0.bn1.running_var, layer2.0.bn3.num_batches_tracked, layer1.0.bn2.bias, layer4.2.bn2.running_var, layer1.1.bn3.num_batches_tracked, layer3.1.bn2.bias, layer1.2.bn3.weight, layer4.1.bn3.bias, layer3.1.bn3.running_mean, layer3.5.conv3.weight, 
layer3.0.conv1.weight, layer1.2.bn3.bias, layer1.0.downsample.1.bias, layer1.0.conv1.weight, layer1.1.bn2.num_batches_tracked, layer3.5.bn2.bias, layer3.5.bn2.num_batches_tracked, layer3.0.downsample.1.running_var, layer1.2.bn1.weight, layer4.0.bn3.weight, layer1.1.bn3.running_mean, layer3.4.conv2_offset.weight, layer1.1.bn1.running_var, layer2.3.conv3.weight, layer3.2.bn2.running_var, layer1.2.bn3.running_mean, layer4.1.conv3.weight, layer2.0.conv1.weight, layer3.5.bn1.num_batches_tracked, layer3.4.bn1.weight, layer3.0.bn3.running_var, layer1.0.bn3.running_var, layer3.2.bn3.running_mean, layer1.2.bn1.running_var, layer1.0.bn3.weight, layer3.0.conv2_offset.bias, layer2.0.conv3.weight, layer2.1.bn3.bias, layer3.0.downsample.1.running_mean, layer4.2.bn3.running_var, layer4.1.bn1.running_mean, layer3.5.bn2.running_mean, layer1.1.conv2.weight, layer3.0.conv2.weight, layer4.1.bn1.running_var, layer3.4.bn2.bias, layer3.4.bn2.num_batches_tracked, layer2.0.downsample.1.running_var, layer3.5.bn2.running_var, layer4.1.bn2.running_mean, layer2.3.bn2.num_batches_tracked, layer3.1.bn3.weight, layer3.0.bn2.num_batches_tracked, layer3.3.bn3.running_mean, layer3.2.bn3.running_var, layer4.0.conv2_offset.bias, layer2.2.conv1.weight, layer2.2.bn1.running_var, layer2.2.bn3.running_mean, layer3.4.bn3.bias, layer2.3.bn2.running_var, layer4.0.bn3.bias, layer2.1.bn3.num_batches_tracked, layer3.3.conv3.weight, layer3.4.bn1.num_batches_tracked, layer3.3.bn3.weight, layer2.2.bn3.num_batches_tracked, layer4.1.bn1.bias, layer3.2.bn2.running_mean, layer3.3.conv2.weight, layer2.0.bn2.bias, layer3.1.conv2_offset.weight, conv1.weight, layer2.2.bn2.running_mean, layer3.1.bn2.weight, layer4.1.conv1.weight, layer2.3.bn1.weight, layer1.1.bn1.bias, layer1.0.bn1.bias, layer2.0.bn2.running_mean, layer2.1.bn2.running_mean, layer3.0.bn3.num_batches_tracked, layer1.0.conv2.weight, layer3.5.conv2.weight, layer3.3.conv2_offset.weight, layer4.1.bn2.num_batches_tracked, layer3.2.bn3.bias, layer1.2.bn1.bias, 
layer2.3.bn1.bias, layer3.2.bn1.weight, layer1.1.bn2.bias, layer2.3.conv1.weight, layer3.4.bn2.weight, layer1.0.bn2.running_var, layer2.0.downsample.0.weight, layer3.0.downsample.1.weight, layer4.0.conv2_offset.weight, layer4.1.conv2_offset.bias, layer2.3.bn1.num_batches_tracked, layer4.2.conv2_offset.bias, layer4.2.bn1.running_var, layer2.0.bn1.weight, layer1.2.bn3.num_batches_tracked, layer1.1.bn2.running_mean, layer1.0.downsample.1.running_var, layer3.2.bn1.running_mean, layer2.0.bn2.running_var, layer3.3.conv2_offset.bias, layer3.2.bn3.weight, layer4.0.bn1.num_batches_tracked, layer3.0.downsample.1.num_batches_tracked, bn1.bias, layer3.1.conv2.weight, layer2.3.bn3.running_var, layer2.2.bn3.weight, layer2.0.downsample.1.weight, layer2.2.conv2.weight, layer4.1.bn1.weight, layer1.0.bn3.num_batches_tracked, layer3.0.bn1.running_mean, layer3.1.bn1.num_batches_tracked, layer4.0.downsample.1.running_var, layer3.1.bn1.running_mean, layer1.1.bn3.running_var, layer4.0.downsample.0.weight, layer4.2.bn1.weight, layer4.2.conv1.weight, layer2.1.bn1.num_batches_tracked, layer3.2.bn1.bias, layer3.0.bn2.bias, layer1.1.bn3.bias, layer1.0.bn3.bias, layer4.1.conv2_offset.weight, layer1.0.bn1.running_var, layer1.1.bn3.weight, layer4.0.bn3.running_mean, layer3.5.bn3.bias, layer3.2.bn2.bias, layer3.2.conv2.weight, layer2.3.bn2.weight, layer3.0.bn2.running_var, layer3.5.conv2_offset.bias, bn1.num_batches_tracked, layer3.0.bn3.weight, layer2.2.conv2_offset.bias, layer4.0.bn1.running_var, layer2.0.bn3.running_var, layer3.3.bn2.running_mean, layer4.0.bn2.running_mean, layer4.0.downsample.1.weight, layer4.2.bn1.bias, layer1.0.bn1.num_batches_tracked, layer3.4.bn3.weight, layer2.2.bn2.num_batches_tracked, layer2.1.conv2_offset.bias, layer1.0.bn2.weight, layer1.2.conv1.weight, layer2.2.bn3.bias, layer3.2.bn1.num_batches_tracked, layer1.0.downsample.0.weight, layer4.0.downsample.1.bias, layer1.0.downsample.1.running_mean, layer3.4.conv2_offset.bias, layer2.0.bn1.bias, 
layer3.1.conv2_offset.bias, layer2.0.bn1.running_var, layer4.1.bn3.running_mean, layer3.3.bn2.bias, layer4.2.bn3.running_mean, layer4.0.conv1.weight, layer3.3.bn1.running_var, layer4.1.bn1.num_batches_tracked, layer4.2.bn2.weight, layer1.0.downsample.1.num_batches_tracked, layer3.2.conv3.weight, layer1.0.downsample.1.weight, layer2.1.bn3.running_mean, layer3.5.bn3.num_batches_tracked, layer1.1.bn2.running_var, layer3.5.bn3.weight, layer2.3.bn3.bias, layer2.3.bn1.running_mean, layer2.0.bn2.weight, layer4.0.downsample.1.running_mean, bn1.weight, layer4.2.bn3.bias, layer1.0.bn1.running_mean, layer4.2.conv2_offset.weight, layer3.3.bn2.num_batches_tracked, layer1.0.bn1.weight, layer3.1.bn1.running_var, layer2.1.bn1.weight, layer3.4.conv1.weight, layer1.1.bn2.weight, layer4.1.bn3.num_batches_tracked, layer3.0.bn1.num_batches_tracked, layer2.0.conv2_offset.bias, bn1.running_var, layer4.0.bn2.num_batches_tracked, layer4.2.conv3.weight, layer4.2.bn2.running_mean, layer3.0.bn2.running_mean, layer2.1.bn2.bias, layer3.5.conv1.weight, layer2.1.conv2.weight, layer3.4.bn1.running_mean, layer1.1.conv3.weight, layer3.3.conv1.weight, layer2.0.bn1.running_mean, layer2.1.bn3.weight, layer2.2.conv2_offset.weight, layer3.1.bn1.weight, layer2.1.bn1.bias, layer2.1.bn2.running_var, layer4.2.bn3.num_batches_tracked, layer2.2.bn2.bias, layer3.1.bn2.num_batches_tracked, layer4.0.bn2.running_var, layer2.1.bn2.weight, layer2.0.downsample.1.running_mean, layer2.2.bn2.weight, layer3.4.bn3.running_var, layer3.5.bn1.running_var, layer3.1.bn3.bias, layer4.0.bn3.running_var, layer2.1.bn1.running_var, layer3.2.bn2.num_batches_tracked, layer2.1.bn3.running_var, layer4.1.bn2.running_var, layer4.0.conv3.weight, layer3.4.bn2.running_var, layer3.0.downsample.1.bias, layer4.0.bn1.bias, layer2.3.bn3.running_mean, layer4.0.downsample.1.num_batches_tracked, layer2.0.downsample.1.num_batches_tracked, layer3.5.bn2.weight, layer2.0.bn3.running_mean, layer1.2.bn3.running_var, layer3.1.conv1.weight, 
layer1.1.bn1.weight, layer3.0.conv2_offset.weight, layer4.1.bn2.weight, layer1.0.conv3.weight, layer3.3.bn2.running_var, layer3.3.bn3.bias, layer1.2.bn1.num_batches_tracked, layer1.2.conv2.weight, layer3.3.bn2.weight, layer4.0.conv2.weight, layer2.1.bn1.running_mean, layer3.2.conv2_offset.weight, layer3.3.bn1.weight, layer2.1.conv1.weight, layer1.2.bn1.running_mean, layer2.0.bn3.bias, layer4.2.bn2.bias, layer1.0.bn3.running_mean, layer4.2.conv2.weight, layer2.3.bn1.running_var, layer2.2.bn1.bias, layer2.0.bn3.weight, layer3.1.bn1.bias, layer3.5.bn1.weight, layer3.0.bn3.bias, layer3.1.bn2.running_mean, layer3.1.bn3.num_batches_tracked, layer2.3.bn3.num_batches_tracked, layer3.3.bn3.num_batches_tracked, layer3.4.conv3.weight, layer2.0.downsample.1.bias, layer3.4.bn1.bias, layer3.0.bn3.running_mean, layer1.1.conv1.weight, layer3.5.bn3.running_var, layer1.1.bn1.running_mean, layer2.1.conv2_offset.weight, layer2.0.conv2_offset.weight

loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
Traceback (most recent call last):
  File "tools/train.py", line 90, in <module>
    main()
  File "tools/train.py", line 86, in main
    logger=logger)
  File "/home/wangqing/heheda/openmmlab/mmdetection/mmdet/apis/train.py", line 59, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/home/wangqing/heheda/openmmlab/mmdetection/mmdet/apis/train.py", line 110, in _non_dist_train
    model = MMDataParallel(model, device_ids=range(cfg.gpus)).cuda()
  File "/home/wangqing/miniconda3/envs/openmmlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 260, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/wangqing/miniconda3/envs/openmmlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/home/wangqing/miniconda3/envs/openmmlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/home/wangqing/miniconda3/envs/openmmlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/home/wangqing/miniconda3/envs/openmmlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply
    param.data = fn(param.data)
  File "/home/wangqing/miniconda3/envs/openmmlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 260, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: unknown error

ubuntu: 16.04
gpu: NVIDIA GeForce GTX 1080 Ti x4
cuda: 9.0
cudnn: 7.4
pytorch: 1.0
gcc: 5.4

Does anyone know how to solve this problem? Thanks in advance!
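When reading the command and the traceback together, it may help to spell out how `CUDA_VISIBLE_DEVICES` renumbers devices: the process sees only the listed GPUs, indexed 0, 1, ... in the listed order. With `CUDA_VISIBLE_DEVICES=2,3` and `--gpus 2`, `device_ids=range(cfg.gpus)` would therefore refer to the GPUs listed as 2 and 3. The sketch below is plain Python; `logical_to_physical` is a hypothetical helper name (not part of mmdetection or PyTorch), it assumes comma-separated integer indices only, and it does not diagnose the `unknown error` itself.

```python
def logical_to_physical(visible_devices, num_gpus):
    """Map in-process device ids to the ids listed in CUDA_VISIBLE_DEVICES.

    Assumes `visible_devices` is a comma-separated list of integer indices
    (the variable also accepts GPU UUIDs, which this sketch ignores).
    """
    listed = [int(tok) for tok in visible_devices.split(",") if tok.strip()]
    if num_gpus > len(listed):
        raise ValueError(
            "requested %d GPUs but only %d are visible" % (num_gpus, len(listed)))
    # Logical id i inside the process is the i-th entry of the list.
    return {logical: listed[logical] for logical in range(num_gpus)}


mapping = logical_to_physical("2,3", num_gpus=2)
print(mapping)  # {0: 2, 1: 3}
```

If the number passed to `--gpus` were larger than the number of listed devices, the helper raises, which mirrors the case where `range(cfg.gpus)` would name a device the process cannot see.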

hellock commented 5 years ago

Did you modify the backbone? There are lots of unexpected keys, so the model and the checkpoint do not match.
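The mismatch can be read directly off the two lists in the log: every unexpected key carries a top-level prefix such as `backbone.`, `neck.` or `rpn_head.`, while the missing keys are bare backbone names (`conv1.weight`, `bn1.weight`, ...). That pattern is what one would expect if a whole-detector checkpoint were being loaded into the backbone module alone, though the thread does not confirm the cause. A minimal sketch of the comparison, in plain Python with hypothetical helper names and a handful of keys copied from the log:

```python
def diff_keys(checkpoint_keys, module_keys):
    """Return (unexpected, missing) in the sense used by the warning in the log."""
    ckpt, own = set(checkpoint_keys), set(module_keys)
    unexpected = sorted(ckpt - own)  # in the checkpoint, unknown to the module
    missing = sorted(own - ckpt)     # wanted by the module, absent from the checkpoint
    return unexpected, missing


def strip_prefix(keys, prefix):
    """Keep only the keys under `prefix` and drop the prefix from them."""
    return [k[len(prefix):] for k in keys if k.startswith(prefix)]


# A few keys taken from the log (not the full lists).
checkpoint_keys = [
    "backbone.conv1.weight",
    "backbone.bn1.weight",
    "neck.lateral_convs.0.conv.weight",
    "rpn_head.rpn_conv.weight",
]
backbone_keys = ["conv1.weight", "bn1.weight"]

# As loaded: nothing lines up, so every key lands in one of the two lists.
print(diff_keys(checkpoint_keys, backbone_keys))

# After dropping the "backbone." prefix, the sample keys match.
print(diff_keys(strip_prefix(checkpoint_keys, "backbone."), backbone_keys))
```

With the prefix stripped, both lists come back empty for this sample, which is consistent with the checkpoint being a full detector rather than a backbone-only file.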

coolKeen commented 5 years ago

@hellock Thanks for responding! I only changed the pretrained path and num_classes in the config because I want to train the model on another dataset. The config (cascade_rcnn_dconv_c3-c5_r50_fpn_1x.py) is shown below:

model = dict(
    type='CascadeRCNN',
    num_stages=3,
    pretrained='/home/wangqing/heheda/openmmlab/mmdetection/pretrained_models/cascade_rcnn_dconv_c3-c5_r50_fpn_1x_20190125-dfa53166.pth',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch',
        dcn=dict(
            modulated=False,
            deformable_groups=1,
            fallback_on_stride=False),
        stage_with_dcn=(False, True, True, True)),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_scales=[8],
        anchor_ratios=[0.5, 1.0, 2.0],
        anchor_strides=[4, 8, 16, 32, 64],
        target_means=[.0, .0, .0, .0],
        target_stds=[1.0, 1.0, 1.0, 1.0],
        use_sigmoid_cls=True),
    bbox_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    bbox_head=[
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=6,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.1, 0.1, 0.2, 0.2],
            reg_class_agnostic=True),
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=6,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.05, 0.05, 0.1, 0.1],
            reg_class_agnostic=True),
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=6,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.033, 0.033, 0.067, 0.067],
            reg_class_agnostic=True)
    ])

train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=0,
        pos_weight=-1,
        smoothl1_beta=1 / 9.0,
        debug=False),
    rcnn=[
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.6,
                neg_iou_thr=0.6,
                min_pos_iou=0.6,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.7,
                min_pos_iou=0.7,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)
    ],
    stage_loss_weights=[1, 0.5, 0.25])
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05, nms=dict(type='nms', iou_thr=0.5), max_per_img=100),
    keep_all_stages=False)

dataset_type = 'CocoDataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        img_scale=(1333, 800),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0.5,
        with_mask=False,
        with_crowd=True,
        with_label=True),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        img_scale=(1333, 800),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=False,
        with_crowd=True,
        with_label=True),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        img_scale=(1333, 800),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=False,
        with_label=False,
        test_mode=True))

optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))

lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
checkpoint_config = dict(interval=1)

log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])

total_epochs = 12
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/cascade_rcnn_dconv_c3-c5_r50_fpn_1x'
load_from = None
resume_from = None
workflow = [('train', 1)]

I changed the pretrained path to the absolute path of the pretrained model downloaded from the model zoo.
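One thing that may be worth trying, offered as an assumption rather than a confirmed fix: in mmdetection configs of this period, `pretrained` is the path used to initialise the backbone, while `load_from` (set to `None` in the config above) is the field for loading a complete detector checkpoint before training. The fragment below sketches that split. The `modelzoo://resnet50` value is assumed to be the stock setting for this config and should be checked against the repository. This addresses only the key-mismatch warning and does not by itself explain the `CUDA error: unknown error`.

```python
# Sketch of the two checkpoint-related fields (an assumption, not a confirmed fix).

# Backbone-only ImageNet weights; assumed to be the stock value for this config.
model = dict(
    type='CascadeRCNN',
    pretrained='modelzoo://resnet50',
    # ... remaining model settings unchanged ...
)

# Full detector checkpoint downloaded from the model zoo (path from the post above).
load_from = ('/home/wangqing/heheda/openmmlab/mmdetection/pretrained_models/'
             'cascade_rcnn_dconv_c3-c5_r50_fpn_1x_20190125-dfa53166.pth')
resume_from = None
```

Because `num_classes` was changed, the classification layers stored in the checkpoint will not have the same shape as the new heads, so those entries may need to be dropped from the checkpoint before it can be loaded this way.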

Firyuza commented 4 years ago

Hi! I also haven't made any modifications to the backbone part of the model, but I got the same error message:

2019-11-26 17:22:17,397 - WARNING - unexpected key in source state_dict: backbone.conv1.weight, backbone.bn1.weight, backbone.bn1.bias, backbone.bn1.running_mean, backbone.bn1.running_var, backbone.bn1.num_batches_tracked, backbone.layer1.0.conv1.weight, backbone.layer1.0.conv2.weight, backbone.layer1.0.bn1.weight, backbone.layer1.0.bn1.bias, backbone.layer1.0.bn1.running_mean, backbone.layer1.0.bn1.running_var, backbone.layer1.0.bn1.num_batches_tracked, backbone.layer1.0.bn2.weight, backbone.layer1.0.bn2.bias, backbone.layer1.0.bn2.running_mean, backbone.layer1.0.bn2.running_var, backbone.layer1.0.bn2.num_batches_tracked, backbone.layer1.0.conv3.weight, backbone.layer1.0.bn3.weight, backbone.layer1.0.bn3.bias, backbone.layer1.0.bn3.running_mean, backbone.layer1.0.bn3.running_var, backbone.layer1.0.bn3.num_batches_tracked, backbone.layer1.0.downsample.0.weight, backbone.layer1.0.downsample.1.weight, backbone.layer1.0.downsample.1.bias, backbone.layer1.0.downsample.1.running_mean, backbone.layer1.0.downsample.1.running_var, backbone.layer1.0.downsample.1.num_batches_tracked, backbone.layer1.1.conv1.weight, backbone.layer1.1.conv2.weight, backbone.layer1.1.bn1.weight, backbone.layer1.1.bn1.bias, backbone.layer1.1.bn1.running_mean, backbone.layer1.1.bn1.running_var, backbone.layer1.1.bn1.num_batches_tracked, backbone.layer1.1.bn2.weight, backbone.layer1.1.bn2.bias, backbone.layer1.1.bn2.running_mean, backbone.layer1.1.bn2.running_var, backbone.layer1.1.bn2.num_batches_tracked, backbone.layer1.1.conv3.weight, backbone.layer1.1.bn3.weight, backbone.layer1.1.bn3.bias, backbone.layer1.1.bn3.running_mean, backbone.layer1.1.bn3.running_var, backbone.layer1.1.bn3.num_batches_tracked, backbone.layer1.2.conv1.weight, backbone.layer1.2.conv2.weight, backbone.layer1.2.bn1.weight, backbone.layer1.2.bn1.bias, backbone.layer1.2.bn1.running_mean, backbone.layer1.2.bn1.running_var, backbone.layer1.2.bn1.num_batches_tracked, backbone.layer1.2.bn2.weight, backbone.layer1.2.bn2.bias, 
backbone.layer1.2.bn2.running_mean, backbone.layer1.2.bn2.running_var, backbone.layer1.2.bn2.num_batches_tracked, backbone.layer1.2.conv3.weight, backbone.layer1.2.bn3.weight, backbone.layer1.2.bn3.bias, backbone.layer1.2.bn3.running_mean, backbone.layer1.2.bn3.running_var, backbone.layer1.2.bn3.num_batches_tracked, backbone.layer2.0.conv1.weight, backbone.layer2.0.conv2.weight, backbone.layer2.0.bn1.weight, backbone.layer2.0.bn1.bias, backbone.layer2.0.bn1.running_mean, backbone.layer2.0.bn1.running_var, backbone.layer2.0.bn1.num_batches_tracked, backbone.layer2.0.bn2.weight, backbone.layer2.0.bn2.bias, backbone.layer2.0.bn2.running_mean, backbone.layer2.0.bn2.running_var, backbone.layer2.0.bn2.num_batches_tracked, backbone.layer2.0.conv3.weight, backbone.layer2.0.bn3.weight, backbone.layer2.0.bn3.bias, backbone.layer2.0.bn3.running_mean, backbone.layer2.0.bn3.running_var, backbone.layer2.0.bn3.num_batches_tracked, backbone.layer2.0.downsample.0.weight, backbone.layer2.0.downsample.1.weight, backbone.layer2.0.downsample.1.bias, backbone.layer2.0.downsample.1.running_mean, backbone.layer2.0.downsample.1.running_var, backbone.layer2.0.downsample.1.num_batches_tracked, backbone.layer2.1.conv1.weight, backbone.layer2.1.conv2.weight, backbone.layer2.1.bn1.weight, backbone.layer2.1.bn1.bias, backbone.layer2.1.bn1.running_mean, backbone.layer2.1.bn1.running_var, backbone.layer2.1.bn1.num_batches_tracked, backbone.layer2.1.bn2.weight, backbone.layer2.1.bn2.bias, backbone.layer2.1.bn2.running_mean, backbone.layer2.1.bn2.running_var, backbone.layer2.1.bn2.num_batches_tracked, backbone.layer2.1.conv3.weight, backbone.layer2.1.bn3.weight, backbone.layer2.1.bn3.bias, backbone.layer2.1.bn3.running_mean, backbone.layer2.1.bn3.running_var, backbone.layer2.1.bn3.num_batches_tracked, backbone.layer2.2.conv1.weight, backbone.layer2.2.conv2.weight, backbone.layer2.2.bn1.weight, backbone.layer2.2.bn1.bias, backbone.layer2.2.bn1.running_mean, backbone.layer2.2.bn1.running_var, 
backbone.layer2.2.bn1.num_batches_tracked, backbone.layer2.2.bn2.weight, backbone.layer2.2.bn2.bias, backbone.layer2.2.bn2.running_mean, backbone.layer2.2.bn2.running_var, backbone.layer2.2.bn2.num_batches_tracked, backbone.layer2.2.conv3.weight, backbone.layer2.2.bn3.weight, backbone.layer2.2.bn3.bias, backbone.layer2.2.bn3.running_mean, backbone.layer2.2.bn3.running_var, backbone.layer2.2.bn3.num_batches_tracked, backbone.layer2.3.conv1.weight, backbone.layer2.3.conv2.weight, backbone.layer2.3.bn1.weight, backbone.layer2.3.bn1.bias, backbone.layer2.3.bn1.running_mean, backbone.layer2.3.bn1.running_var, backbone.layer2.3.bn1.num_batches_tracked, backbone.layer2.3.bn2.weight, backbone.layer2.3.bn2.bias, backbone.layer2.3.bn2.running_mean, backbone.layer2.3.bn2.running_var, backbone.layer2.3.bn2.num_batches_tracked, backbone.layer2.3.conv3.weight, backbone.layer2.3.bn3.weight, backbone.layer2.3.bn3.bias, backbone.layer2.3.bn3.running_mean, backbone.layer2.3.bn3.running_var, backbone.layer2.3.bn3.num_batches_tracked, backbone.layer3.0.conv1.weight, backbone.layer3.0.conv2.weight, backbone.layer3.0.bn1.weight, backbone.layer3.0.bn1.bias, backbone.layer3.0.bn1.running_mean, backbone.layer3.0.bn1.running_var, backbone.layer3.0.bn1.num_batches_tracked, backbone.layer3.0.bn2.weight, backbone.layer3.0.bn2.bias, backbone.layer3.0.bn2.running_mean, backbone.layer3.0.bn2.running_var, backbone.layer3.0.bn2.num_batches_tracked, backbone.layer3.0.conv3.weight, backbone.layer3.0.bn3.weight, backbone.layer3.0.bn3.bias, backbone.layer3.0.bn3.running_mean, backbone.layer3.0.bn3.running_var, backbone.layer3.0.bn3.num_batches_tracked, backbone.layer3.0.downsample.0.weight, backbone.layer3.0.downsample.1.weight, backbone.layer3.0.downsample.1.bias, backbone.layer3.0.downsample.1.running_mean, backbone.layer3.0.downsample.1.running_var, backbone.layer3.0.downsample.1.num_batches_tracked, backbone.layer3.1.conv1.weight, backbone.layer3.1.conv2.weight, backbone.layer3.1.bn1.weight, 
backbone.layer3.1.bn1.bias, backbone.layer3.1.bn1.running_mean, backbone.layer3.1.bn1.running_var, backbone.layer3.1.bn1.num_batches_tracked, backbone.layer3.1.bn2.weight, backbone.layer3.1.bn2.bias, backbone.layer3.1.bn2.running_mean, backbone.layer3.1.bn2.running_var, backbone.layer3.1.bn2.num_batches_tracked, backbone.layer3.1.conv3.weight, backbone.layer3.1.bn3.weight, backbone.layer3.1.bn3.bias, backbone.layer3.1.bn3.running_mean, backbone.layer3.1.bn3.running_var, backbone.layer3.1.bn3.num_batches_tracked, backbone.layer3.2.conv1.weight, backbone.layer3.2.conv2.weight, backbone.layer3.2.bn1.weight, backbone.layer3.2.bn1.bias, backbone.layer3.2.bn1.running_mean, backbone.layer3.2.bn1.running_var, backbone.layer3.2.bn1.num_batches_tracked, backbone.layer3.2.bn2.weight, backbone.layer3.2.bn2.bias, backbone.layer3.2.bn2.running_mean, backbone.layer3.2.bn2.running_var, backbone.layer3.2.bn2.num_batches_tracked, backbone.layer3.2.conv3.weight, backbone.layer3.2.bn3.weight, backbone.layer3.2.bn3.bias, backbone.layer3.2.bn3.running_mean, backbone.layer3.2.bn3.running_var, backbone.layer3.2.bn3.num_batches_tracked, backbone.layer3.3.conv1.weight, backbone.layer3.3.conv2.weight, backbone.layer3.3.bn1.weight, backbone.layer3.3.bn1.bias, backbone.layer3.3.bn1.running_mean, backbone.layer3.3.bn1.running_var, backbone.layer3.3.bn1.num_batches_tracked, backbone.layer3.3.bn2.weight, backbone.layer3.3.bn2.bias, backbone.layer3.3.bn2.running_mean, backbone.layer3.3.bn2.running_var, backbone.layer3.3.bn2.num_batches_tracked, backbone.layer3.3.conv3.weight, backbone.layer3.3.bn3.weight, backbone.layer3.3.bn3.bias, backbone.layer3.3.bn3.running_mean, backbone.layer3.3.bn3.running_var, backbone.layer3.3.bn3.num_batches_tracked, backbone.layer3.4.conv1.weight, backbone.layer3.4.conv2.weight, backbone.layer3.4.bn1.weight, backbone.layer3.4.bn1.bias, backbone.layer3.4.bn1.running_mean, backbone.layer3.4.bn1.running_var, backbone.layer3.4.bn1.num_batches_tracked, 
backbone.layer3.4.bn2.weight, backbone.layer3.4.bn2.bias, backbone.layer3.4.bn2.running_mean, backbone.layer3.4.bn2.running_var, backbone.layer3.4.bn2.num_batches_tracked, backbone.layer3.4.conv3.weight, backbone.layer3.4.bn3.weight, backbone.layer3.4.bn3.bias, backbone.layer3.4.bn3.running_mean, backbone.layer3.4.bn3.running_var, backbone.layer3.4.bn3.num_batches_tracked, backbone.layer3.5.conv1.weight, backbone.layer3.5.conv2.weight, backbone.layer3.5.bn1.weight, backbone.layer3.5.bn1.bias, backbone.layer3.5.bn1.running_mean, backbone.layer3.5.bn1.running_var, backbone.layer3.5.bn1.num_batches_tracked, backbone.layer3.5.bn2.weight, backbone.layer3.5.bn2.bias, backbone.layer3.5.bn2.running_mean, backbone.layer3.5.bn2.running_var, backbone.layer3.5.bn2.num_batches_tracked, backbone.layer3.5.conv3.weight, backbone.layer3.5.bn3.weight, backbone.layer3.5.bn3.bias, backbone.layer3.5.bn3.running_mean, backbone.layer3.5.bn3.running_var, backbone.layer3.5.bn3.num_batches_tracked, backbone.layer4.0.conv1.weight, backbone.layer4.0.conv2.weight, backbone.layer4.0.bn1.weight, backbone.layer4.0.bn1.bias, backbone.layer4.0.bn1.running_mean, backbone.layer4.0.bn1.running_var, backbone.layer4.0.bn1.num_batches_tracked, backbone.layer4.0.bn2.weight, backbone.layer4.0.bn2.bias, backbone.layer4.0.bn2.running_mean, backbone.layer4.0.bn2.running_var, backbone.layer4.0.bn2.num_batches_tracked, backbone.layer4.0.conv3.weight, backbone.layer4.0.bn3.weight, backbone.layer4.0.bn3.bias, backbone.layer4.0.bn3.running_mean, backbone.layer4.0.bn3.running_var, backbone.layer4.0.bn3.num_batches_tracked, backbone.layer4.0.downsample.0.weight, backbone.layer4.0.downsample.1.weight, backbone.layer4.0.downsample.1.bias, backbone.layer4.0.downsample.1.running_mean, backbone.layer4.0.downsample.1.running_var, backbone.layer4.0.downsample.1.num_batches_tracked, backbone.layer4.1.conv1.weight, backbone.layer4.1.conv2.weight, backbone.layer4.1.bn1.weight, backbone.layer4.1.bn1.bias, 
backbone.layer4.1.bn1.running_mean, backbone.layer4.1.bn1.running_var, backbone.layer4.1.bn1.num_batches_tracked, backbone.layer4.1.bn2.weight, backbone.layer4.1.bn2.bias, backbone.layer4.1.bn2.running_mean, backbone.layer4.1.bn2.running_var, backbone.layer4.1.bn2.num_batches_tracked, backbone.layer4.1.conv3.weight, backbone.layer4.1.bn3.weight, backbone.layer4.1.bn3.bias, backbone.layer4.1.bn3.running_mean, backbone.layer4.1.bn3.running_var, backbone.layer4.1.bn3.num_batches_tracked, backbone.layer4.2.conv1.weight, backbone.layer4.2.conv2.weight, backbone.layer4.2.bn1.weight, backbone.layer4.2.bn1.bias, backbone.layer4.2.bn1.running_mean, backbone.layer4.2.bn1.running_var, backbone.layer4.2.bn1.num_batches_tracked, backbone.layer4.2.bn2.weight, backbone.layer4.2.bn2.bias, backbone.layer4.2.bn2.running_mean, backbone.layer4.2.bn2.running_var, backbone.layer4.2.bn2.num_batches_tracked, backbone.layer4.2.conv3.weight, backbone.layer4.2.bn3.weight, backbone.layer4.2.bn3.bias, backbone.layer4.2.bn3.running_mean, backbone.layer4.2.bn3.running_var, backbone.layer4.2.bn3.num_batches_tracked, neck.lateral_convs.0.conv.weight, neck.lateral_convs.0.conv.bias, neck.lateral_convs.1.conv.weight, neck.lateral_convs.1.conv.bias, neck.lateral_convs.2.conv.weight, neck.lateral_convs.2.conv.bias, neck.lateral_convs.3.conv.weight, neck.lateral_convs.3.conv.bias, neck.fpn_convs.0.conv.weight, neck.fpn_convs.0.conv.bias, neck.fpn_convs.1.conv.weight, neck.fpn_convs.1.conv.bias, neck.fpn_convs.2.conv.weight, neck.fpn_convs.2.conv.bias, neck.fpn_convs.3.conv.weight, neck.fpn_convs.3.conv.bias, rpn_head.rpn_conv.weight, rpn_head.rpn_conv.bias, rpn_head.rpn_cls.weight, rpn_head.rpn_cls.bias, rpn_head.rpn_reg.weight, rpn_head.rpn_reg.bias, bbox_head.fc_cls.weight, bbox_head.fc_cls.bias, bbox_head.fc_reg.weight, bbox_head.fc_reg.bias, bbox_head.shared_fcs.0.weight, bbox_head.shared_fcs.0.bias, bbox_head.shared_fcs.1.weight, bbox_head.shared_fcs.1.bias, mask_head.convs.0.conv.weight, 
mask_head.convs.0.conv.bias, mask_head.convs.1.conv.weight, mask_head.convs.1.conv.bias, mask_head.convs.2.conv.weight, mask_head.convs.2.conv.bias, mask_head.convs.3.conv.weight, mask_head.convs.3.conv.bias, mask_head.upsample.weight, mask_head.upsample.bias, mask_head.conv_logits.weight, mask_head.conv_logits.bias

missing keys in source state_dict: layer1.2.bn3.running_var, layer1.2.bn3.weight, layer3.2.bn2.weight, layer3.1.conv1.weight, layer1.1.bn2.running_mean, layer3.0.downsample.1.bias, layer4.0.bn1.bias, layer2.2.bn1.weight, layer2.0.downsample.1.bias, layer2.0.bn2.running_mean, layer3.0.bn2.weight, layer1.2.conv2.weight, layer3.0.bn3.weight, layer2.0.bn3.weight, layer3.2.bn2.num_batches_tracked, layer3.5.bn3.running_mean, layer3.2.bn1.running_var, layer2.1.bn2.bias, layer4.0.downsample.1.weight, layer4.0.downsample.0.weight, layer3.0.downsample.1.weight, layer2.0.bn1.num_batches_tracked, layer4.0.bn3.bias, layer4.1.bn2.bias, layer2.0.bn2.bias, layer3.4.bn1.bias, layer3.0.bn1.bias, layer3.0.downsample.1.num_batches_tracked, layer1.0.bn2.running_mean, layer3.3.bn3.running_mean, layer4.0.bn3.weight, layer2.1.conv1.weight, layer3.1.bn1.num_batches_tracked, layer4.2.bn2.running_mean, layer2.0.downsample.0.weight, layer3.5.bn3.running_var, layer1.1.bn3.bias, layer2.0.bn3.num_batches_tracked, layer3.0.downsample.0.weight, layer2.3.bn2.num_batches_tracked, layer4.0.bn3.num_batches_tracked, layer3.3.bn3.running_var, layer2.1.conv2.weight, layer2.2.bn2.running_var, layer4.1.bn2.running_var, layer3.1.conv2.weight, layer3.5.bn3.bias, layer3.5.bn2.running_var, layer2.0.downsample.1.num_batches_tracked, layer1.0.downsample.1.bias, layer3.5.conv3.weight, layer1.2.conv1.weight, layer3.0.bn1.num_batches_tracked, layer2.3.bn1.num_batches_tracked, layer1.1.bn2.bias, layer4.1.bn3.weight, layer1.0.bn2.running_var, layer4.0.bn2.weight, layer2.1.bn2.weight, layer3.4.bn2.running_var, layer1.0.bn3.running_mean, layer1.0.conv1.weight, layer3.5.bn1.weight, layer1.0.bn3.running_var, layer2.0.downsample.1.running_mean, layer3.4.conv1.weight, layer2.0.downsample.1.running_var, layer3.0.bn2.num_batches_tracked, bn1.running_mean, layer1.0.conv2.weight, layer4.0.conv1.weight, layer4.0.bn2.num_batches_tracked, layer2.1.bn3.num_batches_tracked, layer1.1.conv2.weight, layer3.5.bn3.weight, 
layer3.5.bn1.running_var, layer1.2.bn1.weight, layer4.0.downsample.1.running_var, layer1.0.bn1.running_var, layer4.1.bn1.running_var, layer2.3.bn1.bias, layer4.1.bn3.bias, layer4.0.bn2.running_mean, layer4.2.bn2.num_batches_tracked, layer4.2.bn3.running_mean, layer3.1.bn2.running_mean, layer3.2.bn2.running_mean, layer3.1.bn2.bias, layer3.5.conv1.weight, layer3.5.bn2.bias, layer1.2.bn1.running_mean, layer2.3.conv1.weight, layer3.1.bn2.num_batches_tracked, layer2.2.bn3.weight, layer2.0.bn1.running_var, layer3.4.bn1.num_batches_tracked, layer3.2.bn2.running_var, layer2.2.conv2.weight, layer3.0.bn2.bias, layer3.2.bn3.num_batches_tracked, layer2.1.bn2.running_var, layer4.1.bn2.running_mean, layer2.3.bn3.running_var, layer2.1.conv3.weight, layer3.1.bn3.bias, layer3.2.conv1.weight, layer3.0.bn1.running_var, layer1.2.bn2.weight, layer1.0.downsample.1.running_mean, layer2.1.bn1.num_batches_tracked, layer3.4.conv3.weight, layer1.2.bn2.running_mean, layer3.4.conv2.weight, layer3.3.conv1.weight, layer3.2.bn1.weight, layer4.0.bn1.running_var, layer2.1.bn2.num_batches_tracked, layer3.3.bn2.running_mean, layer2.0.bn3.running_var, layer4.1.bn3.num_batches_tracked, layer2.0.conv2.weight, layer3.0.bn2.running_var, layer2.2.bn1.running_mean, layer3.2.conv3.weight, layer3.5.bn2.weight, layer2.1.bn3.running_var, layer3.0.bn3.running_var, layer2.0.bn3.bias, layer4.0.downsample.1.num_batches_tracked, layer3.0.conv2.weight, layer1.0.bn2.num_batches_tracked, layer3.5.bn1.bias, layer4.1.bn1.bias, layer4.2.bn2.weight, layer3.0.downsample.1.running_var, layer2.0.conv1.weight, layer2.1.bn3.weight, layer3.1.bn1.running_mean, layer3.5.bn2.running_mean, layer3.3.bn3.weight, layer3.3.conv2.weight, layer3.5.bn3.num_batches_tracked, bn1.weight, layer1.2.bn3.running_mean, layer1.1.conv3.weight, bn1.bias, layer3.0.downsample.1.running_mean, layer1.0.bn3.weight, layer2.3.bn1.running_var, layer1.2.bn1.bias, layer2.2.bn2.running_mean, layer3.3.bn1.running_var, layer1.1.bn3.weight, 
layer1.1.bn1.num_batches_tracked, layer3.0.bn2.running_mean, layer2.0.bn2.running_var, layer2.0.bn1.running_mean, layer3.1.bn3.running_var, layer3.3.bn2.weight, layer2.0.bn2.num_batches_tracked, layer3.2.bn3.bias, layer2.1.bn1.bias, layer1.0.bn1.weight, layer3.4.bn3.num_batches_tracked, layer4.1.bn1.weight, layer4.2.bn3.bias, layer1.1.bn1.weight, layer3.4.bn1.running_mean, layer3.3.bn1.num_batches_tracked, layer2.3.bn2.weight, layer2.3.bn3.running_mean, layer3.1.bn3.num_batches_tracked, layer4.0.bn1.num_batches_tracked, layer2.2.bn1.num_batches_tracked, layer4.0.downsample.1.running_mean, layer2.0.conv3.weight, layer3.4.bn2.running_mean, layer3.4.bn3.bias, layer2.1.bn1.running_mean, layer3.3.bn2.num_batches_tracked, layer2.1.bn3.bias, layer3.0.bn3.bias, layer3.4.bn2.weight, layer3.1.bn3.running_mean, layer4.2.bn1.bias, layer3.4.bn2.num_batches_tracked, layer4.0.bn1.weight, layer3.2.bn3.weight, layer1.1.bn3.running_mean, layer2.1.bn1.running_var, layer3.3.bn2.bias, layer2.2.bn3.num_batches_tracked, layer4.2.bn1.num_batches_tracked, layer4.0.bn3.running_var, layer3.0.bn1.weight, layer3.5.bn1.num_batches_tracked, layer2.1.bn1.weight, layer4.1.conv3.weight, layer3.4.bn3.running_mean, layer4.1.bn3.running_mean, layer2.1.bn3.running_mean, layer3.2.conv2.weight, layer3.0.bn3.num_batches_tracked, layer3.4.bn1.weight, layer2.2.bn2.bias, layer2.2.bn2.num_batches_tracked, bn1.running_var, layer3.3.bn3.bias, layer1.1.conv1.weight, layer1.0.downsample.1.num_batches_tracked, layer4.0.conv3.weight, layer1.0.bn1.num_batches_tracked, layer3.1.bn2.running_var, layer1.1.bn2.running_var, layer2.2.bn1.bias, layer2.3.conv3.weight, layer2.0.bn1.weight, layer3.3.bn1.running_mean, layer1.2.bn1.running_var, layer4.2.bn2.running_var, layer1.0.bn2.weight, layer3.4.bn3.running_var, layer4.1.bn1.num_batches_tracked, layer3.5.bn2.num_batches_tracked, layer3.3.bn1.weight, layer1.1.bn1.bias, layer2.2.conv3.weight, layer3.0.conv3.weight, layer3.2.bn1.num_batches_tracked, conv1.weight, 
layer2.0.downsample.1.weight, layer2.2.bn3.running_mean, layer2.2.conv1.weight, layer2.2.bn2.weight, layer3.3.conv3.weight, layer4.2.bn3.running_var, layer3.1.bn1.bias, layer3.5.conv2.weight, layer3.1.bn2.weight, layer1.2.bn3.bias, layer1.1.bn3.running_var, layer3.0.bn3.running_mean, layer4.0.bn2.running_var, layer4.0.downsample.1.bias, layer1.2.conv3.weight, layer4.2.bn1.running_mean, layer1.0.downsample.1.running_var, layer4.2.bn3.num_batches_tracked, layer3.4.bn2.bias, layer1.0.bn2.bias, layer2.0.bn3.running_mean, layer2.3.bn2.running_var, layer3.4.bn3.weight, layer4.0.bn2.bias, layer3.3.bn3.num_batches_tracked, layer4.2.conv2.weight, layer2.2.bn1.running_var, layer4.2.conv3.weight, layer2.3.bn3.num_batches_tracked, layer1.1.bn2.weight, layer4.0.bn1.running_mean, layer1.2.bn1.num_batches_tracked, layer4.1.bn1.running_mean, layer3.1.bn1.running_var, layer4.2.bn1.weight, layer1.1.bn1.running_var, layer3.3.bn2.running_var, layer4.0.bn3.running_mean, layer1.1.bn1.running_mean, layer4.0.conv2.weight, layer1.2.bn3.num_batches_tracked, layer3.2.bn1.bias, layer3.1.bn1.weight, layer1.0.downsample.0.weight, layer3.2.bn2.bias, layer2.2.bn3.running_var, layer3.1.conv3.weight, layer1.2.bn2.running_var, layer2.3.bn1.running_mean, layer2.0.bn2.weight, layer2.3.bn1.weight, layer3.2.bn3.running_var, layer3.0.bn1.running_mean, bn1.num_batches_tracked, layer1.0.conv3.weight, layer2.1.bn2.running_mean, layer4.2.bn1.running_var, layer4.2.bn3.weight, layer2.3.bn3.bias, layer2.3.bn3.weight, layer1.0.bn1.running_mean, layer2.3.bn2.running_mean, layer4.1.bn2.num_batches_tracked, layer1.0.bn3.num_batches_tracked, layer1.1.bn3.num_batches_tracked, layer4.2.bn2.bias, layer4.1.bn2.weight, layer4.1.conv1.weight, layer4.1.bn3.running_var, layer3.2.bn1.running_mean, layer1.0.bn1.bias, layer1.0.bn3.bias, layer2.0.bn1.bias, layer3.2.bn3.running_mean, layer3.4.bn1.running_var, layer4.1.conv2.weight, layer3.3.bn1.bias, layer2.3.conv2.weight, layer1.1.bn2.num_batches_tracked, layer1.2.bn2.bias, 
layer1.2.bn2.num_batches_tracked, layer3.5.bn1.running_mean, layer4.2.conv1.weight, layer3.0.conv1.weight, layer2.2.bn3.bias, layer2.3.bn2.bias, layer1.0.downsample.1.weight, layer3.1.bn3.weight

Do you know how to restore the weights in a custom way? Thanks!

ZwwWayne commented 4 years ago

This might be because the checkpoint weights are CUDA tensors that get loaded onto one device (e.g., GPU 0) while your model is still on the CPU or on a different device (e.g., GPU 1). You could check this first.

Firyuza commented 4 years ago

Hi @ZwwWayne! I checked and got True from print(next(model.parameters()).is_cuda), so I suppose the model is already on a CUDA device...

Thanks!

ZwwWayne commented 4 years ago

So the model is on a CUDA device. What about the checkpoint and the device its tensors are on? That also matters and needs checking. Checkpoints are usually saved with CUDA tensors on device 0, which may fail to load into a model on CUDA device 1.
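One way to run the check described above is to load the checkpoint with all tensors remapped to the CPU and then count which devices the entries sit on. This is only a minimal sketch: the helper names `load_checkpoint_on_cpu` and `summarize_devices` and the checkpoint path are made up for illustration, and it assumes the weights may be nested under a "state_dict" key.

```python
from collections import Counter


def load_checkpoint_on_cpu(path):
    """Load a checkpoint with every tensor remapped to the CPU."""
    # torch is imported lazily so summarize_devices() can be used without it.
    import torch

    return torch.load(path, map_location="cpu")


def summarize_devices(state_dict):
    """Count entries per device; values without a .device attribute count as 'n/a'."""
    return Counter(str(getattr(value, "device", "n/a")) for value in state_dict.values())


# Usage (hypothetical path; the "state_dict" nesting is an assumption):
# checkpoint = load_checkpoint_on_cpu("path/to/checkpoint.pth")
# state_dict = checkpoint.get("state_dict", checkpoint)
# print(summarize_devices(state_dict))
```

Remapping to the CPU at load time avoids depending on the GPU index the checkpoint was saved from. The model can then copy the weights onto whichever device it lives on when the state dict is loaded.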

Firyuza commented 4 years ago

Thank you for the feedback!

I found out that my backbone's parameter names do not have 'backbone' as a prefix. I created my own TwoStage class in which I use a backbone and Mask R-CNN. Do you know how I could remove this prefix?

Thanks!
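For the prefix question, one possible approach is to rename the checkpoint keys before loading them. In the log above, the checkpoint keys start with "backbone." while the missing keys have no prefix. The sketch below keeps only the keys under a given prefix and strips it. The name `strip_prefix` is hypothetical, not an mmdetection function, and this is a workaround under those assumptions rather than the project's recommended method.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="backbone."):
    """Return a new dict holding only keys that start with `prefix`, with it removed."""
    stripped = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            # "backbone.layer1.0.conv1.weight" -> "layer1.0.conv1.weight"
            stripped[key[len(prefix):]] = value
    return stripped


# Usage (assumes `checkpoint` was already loaded and `backbone_module` is the
# module whose parameter names have no prefix; both names are placeholders):
# weights = strip_prefix(checkpoint.get("state_dict", checkpoint))
# backbone_module.load_state_dict(weights)
```

Keys from other parts of the checkpoint (neck, rpn_head, bbox_head, mask_head) are dropped here, so they would need to be loaded separately if the custom model also uses them.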