HorizonRobotics / Sparse4D


When will the configs for other backbones on the nuScenes dataset be released? #22

Closed zhaoyangwei123 closed 7 months ago

zhaoyangwei123 commented 7 months ago

Hi linxuewu, thanks for sharing this great work. Could you please share the nuScenes config files based on R101 and VoV99 with me (1578174273@qq.com)? Thanks again.

linxuewu commented 7 months ago

https://github.com/HorizonRobotics/Sparse4D/issues/18

zhaoyangwei123 commented 7 months ago

Hi, for R101, if I set batch_size=2 and num_gpus=8, the total batch size drops from the 48 used for R50 to 16, right? https://github.com/HorizonRobotics/Sparse4D/issues/12 My configuration is: total_batch_size = 16, batch_size = 2, num_gpus = 8, num_iters_per_epoch = int(28130 // (num_gpus * batch_size)), lr = 3e-4, epoch = 80, backbone_lr_mult = 0.1. Is this the same configuration as what you described?
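For reference, the numbers above follow from the usual data-parallel arithmetic; a minimal sketch, where 28130 is the sample count already used in the config's num_iters_per_epoch expression:

# Sketch of the batch-size arithmetic discussed above.
num_gpus = 8
batch_size = 2  # samples per GPU
total_batch_size = num_gpus * batch_size  # 16, versus 8 * 6 = 48 for the R50 config
num_iters_per_epoch = int(28130 // (num_gpus * batch_size))  # 28130 // 16 = 1758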

linxuewu commented 7 months ago

> Hi, for R101, if I set batch_size=2 and num_gpus=8, the total batch size drops from the 48 used for R50 to 16, right? https://github.com/HorizonRobotics/Sparse4D/issues/12 My configuration is: total_batch_size = 16, batch_size = 2, num_gpus = 8, num_iters_per_epoch = int(28130 // (num_gpus * batch_size)), lr = 3e-4, epoch = 80, backbone_lr_mult = 0.1. Is this the same configuration as what you described?

yes

zhaoyangwei123 commented 7 months ago

Hi, I followed the batch_size=2 configuration, but in the very first logged iteration of the first epoch grad_norm is nan; in later iterations grad_norm is a normal number. However, the total loss only decreases for a while at the beginning and then stops decreasing, and the first evaluation gives an mAP of only about 6. Could this be related to grad_norm being nan at the start?

lr: 6.000e-06, eta: 3 days, 15:55:40, time: 1.126, data_time: 0.073, memory: 11019, loss_cls_0: 1.9477, loss_box_0: 0.6963, loss_cns_0: 0.1432, loss_yns_0: 0.0461, loss_cls_1: 1.9088, loss_box_1: 0.8084, loss_cns_1: 0.1057, loss_yns_1: 0.0448, loss_cls_2: 1.9613, loss_box_2: 0.4847, loss_cns_2: 0.0652, loss_yns_2: 0.0294, loss_cls_3: 1.8312, loss_box_3: 1.4169, loss_cns_3: 0.2174, loss_yns_3: 0.0750, loss_cls_4: 1.7034, loss_box_4: 1.7884, loss_cns_4: 0.2312, loss_yns_4: 0.0905, loss_cls_5: 1.8227, loss_box_5: 1.2277, loss_cns_5: 0.1802, loss_yns_5: 0.0637, loss_cls_dn_0: 0.9695, loss_box_dn_0: 1.2779, loss_cls_dn_1: 0.8814, loss_box_dn_1: 2.6075, loss_cls_dn_2: 0.9295, loss_box_dn_2: 2.8197, loss_cls_dn_3: 0.8671, loss_box_dn_3: 2.7295, loss_cls_dn_4: 0.8111, loss_box_dn_4: 2.7771, loss_cls_dn_5: 0.8038, loss_box_dn_5: 2.8359, loss_dense_depth: 2.4449, loss: 41.6449, grad_norm: nan

linxuewu commented 7 months ago

First of all, why does your lr show as 6e-6? The parameters are probably not set correctly.

linxuewu commented 7 months ago

optimizer = dict(
    type='AdamW',
    lr=0.0003,
    weight_decay=0.001,
    paramwise_cfg=dict(custom_keys=dict(img_backbone=dict(lr_mult=0.1))))
optimizer_config = dict(grad_clip=dict(max_norm=25, norm_type=2))
lr_config = dict(
    policy='CosineAnnealing',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    min_lr_ratio=0.001)
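With this schedule the printed lr values are reproducible; a minimal sketch of mmcv's linear warmup, assuming the logged lr is the img_backbone parameter group (i.e. scaled by lr_mult=0.1):

def warmup_lr(base_lr, cur_iter, warmup_iters=500, warmup_ratio=1.0 / 3):
    # mmcv LrUpdaterHook with warmup='linear': ramp from
    # base_lr * warmup_ratio up to base_lr over warmup_iters.
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)

# Base lr=3e-4 (this config) with backbone lr_mult=0.1 at iter 51:
print(warmup_lr(3e-4, 51) * 0.1)    # ~1.2e-05, the value in the reference log below
# Base lr=1.5e-4 (the halved lr used later in the thread) at iter 51:
print(warmup_lr(1.5e-4, 51) * 0.1)  # ~6.0e-06, the puzzling "6e-6" above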

zhaoyangwei123 commented 7 months ago

> First of all, why does your lr show as 6e-6? The parameters are probably not set correctly.

This is the learning rate at the very start of training; usually the learning rate ramps up from a small value at the beginning and only reaches the normal learning rate after the first iterations.

zhaoyangwei123 commented 7 months ago

Could this be because the batch size is 2? With batch size 2, each GPU loads 2 images at a time, but for multi-view shouldn't it load 6 images at a time?

linxuewu commented 7 months ago

I never saw this learning rate in my warmup phase either. And you got nan during the warmup phase?

linxuewu commented 7 months ago

batch size=2 means two samples at a time, 2*6=12 images.

linxuewu commented 7 months ago

Here is part of my log for reference.

2023-09-18 14:53:10,052 - mmdet - INFO - Iter [51/140640] lr: 1.200e-05, eta: 1 day, 18:22:29, time: 1.085, data_time: 0.086, memory: 10654, loss_cls_0: 1.1889, loss_reg_0: 2.0397, loss_cns_0: 0.5934, loss_yns_0: 0.1641, loss_cls_1: 1.2932, loss_reg_1: 2.2867, loss_cns_1: 0.5564, loss_yns_1: 0.1598, loss_cls_2: 1.3167, loss_reg_2: 2.2618, loss_cns_2: 0.5532, loss_yns_2: 0.1596, loss_cls_3: 1.3282, loss_reg_3: 2.3412, loss_cns_3: 0.5568, loss_yns_3: 0.1647, loss_cls_4: 1.3150, loss_reg_4: 2.3478, loss_cns_4: 0.5683, loss_yns_4: 0.1675, loss_cls_5: 1.2993, loss_reg_5: 2.4744, loss_cns_5: 0.5677, loss_yns_5: 0.1711, dn_loss_cls_0: 0.5192, dn_loss_reg_0: 0.9616, dn_loss_cls_1: 0.4930, dn_loss_reg_1: 0.9932, dn_loss_cls_2: 0.5325, dn_loss_reg_2: 1.0116, dn_loss_cls_3: 0.5333, dn_loss_reg_3: 1.0440, dn_loss_cls_4: 0.5382, dn_loss_reg_4: 1.1027, dn_loss_cls_5: 0.5218, dn_loss_reg_5: 1.1429, loss_dense_depth: 1.1440, loss: 36.4135, grad_norm: 106.6352
2023-09-18 14:54:03,280 - mmdet - INFO - Iter [102/140640] lr: 1.404e-05, eta: 1 day, 17:32:40, time: 1.043, data_time: 0.049, memory: 10654, loss_cls_0: 0.9958, loss_reg_0: 1.8812, loss_cns_0: 0.6274, loss_yns_0: 0.1639, loss_cls_1: 1.0903, loss_reg_1: 2.0695, loss_cns_1: 0.6258, loss_yns_1: 0.1683, loss_cls_2: 1.1200, loss_reg_2: 2.0651, loss_cns_2: 0.6328, loss_yns_2: 0.1687, loss_cls_3: 1.1384, loss_reg_3: 2.0799, loss_cns_3: 0.6334, loss_yns_3: 0.1691, loss_cls_4: 1.1536, loss_reg_4: 2.1001, loss_cns_4: 0.6332, loss_yns_4: 0.1697, loss_cls_5: 1.1574, loss_reg_5: 2.1279, loss_cns_5: 0.6330, loss_yns_5: 0.1705, dn_loss_cls_0: 0.4105, dn_loss_reg_0: 0.8618, dn_loss_cls_1: 0.2731, dn_loss_reg_1: 0.8383, dn_loss_cls_2: 0.3069, dn_loss_reg_2: 0.8367, dn_loss_cls_3: 0.3276, dn_loss_reg_3: 0.8456, dn_loss_cls_4: 0.3404, dn_loss_reg_4: 0.8600, dn_loss_cls_5: 0.3729, dn_loss_reg_5: 0.8767, loss_dense_depth: 0.8981, loss: 31.8234, grad_norm: 82.3027
2023-09-18 14:54:56,463 - mmdet - INFO - Iter [153/140640] lr: 1.608e-05, eta: 1 day, 17:14:55, time: 1.043, data_time: 0.051, memory: 10654, loss_cls_0: 0.9368, loss_reg_0: 1.8415, loss_cns_0: 0.6241, loss_yns_0: 0.1534, loss_cls_1: 0.9952, loss_reg_1: 1.9406, loss_cns_1: 0.6440, loss_yns_1: 0.1585, loss_cls_2: 1.0219, loss_reg_2: 1.9298, loss_cns_2: 0.6499, loss_yns_2: 0.1582, loss_cls_3: 1.0184, loss_reg_3: 1.9285, loss_cns_3: 0.6501, loss_yns_3: 0.1586, loss_cls_4: 1.0313, loss_reg_4: 1.9362, loss_cns_4: 0.6514, loss_yns_4: 0.1583, loss_cls_5: 1.0438, loss_reg_5: 1.9580, loss_cns_5: 0.6506, loss_yns_5: 0.1615, dn_loss_cls_0: 0.3671, dn_loss_reg_0: 0.8250, dn_loss_cls_1: 0.1960, dn_loss_reg_1: 0.7984, dn_loss_cls_2: 0.1993, dn_loss_reg_2: 0.7929, dn_loss_cls_3: 0.2057, dn_loss_reg_3: 0.7946, dn_loss_cls_4: 0.2213, dn_loss_reg_4: 0.8024, dn_loss_cls_5: 0.2341, dn_loss_reg_5: 0.8162, loss_dense_depth: 0.8746, loss: 29.5282, grad_norm: 60.5532
2023-09-18 14:55:49,435 - mmdet - INFO - Iter [204/140640] lr: 1.812e-05, eta: 1 day, 17:03:36, time: 1.039, data_time: 0.053, memory: 10654, loss_cls_0: 0.9051, loss_reg_0: 1.7846, loss_cns_0: 0.6196, loss_yns_0: 0.1571, loss_cls_1: 0.9777, loss_reg_1: 1.7888, loss_cns_1: 0.6522, loss_yns_1: 0.1589, loss_cls_2: 1.0015, loss_reg_2: 1.7746, loss_cns_2: 0.6571, loss_yns_2: 0.1586, loss_cls_3: 0.9964, loss_reg_3: 1.7665, loss_cns_3: 0.6569, loss_yns_3: 0.1583, loss_cls_4: 0.9945, loss_reg_4: 1.7672, loss_cns_4: 0.6551, loss_yns_4: 0.1584, loss_cls_5: 1.0032, loss_reg_5: 1.7894, loss_cns_5: 0.6562, loss_yns_5: 0.1601, dn_loss_cls_0: 0.2930, dn_loss_reg_0: 0.8067, dn_loss_cls_1: 0.1529, dn_loss_reg_1: 0.7698, dn_loss_cls_2: 0.1582, dn_loss_reg_2: 0.7572, dn_loss_cls_3: 0.1587, dn_loss_reg_3: 0.7529, dn_loss_cls_4: 0.1647, dn_loss_reg_4: 0.7572, dn_loss_cls_5: 0.1770, dn_loss_reg_5: 0.7664, loss_dense_depth: 0.8156, loss: 27.9285, grad_norm: 51.3955
2023-09-18 14:56:41,997 - mmdet - INFO - Iter [255/140640] lr: 2.016e-05, eta: 1 day, 16:52:26, time: 1.031, data_time: 0.053, memory: 10654, loss_cls_0: 0.8360, loss_reg_0: 1.7014, loss_cns_0: 0.6216, loss_yns_0: 0.1544, loss_cls_1: 0.9175, loss_reg_1: 1.5747, loss_cns_1: 0.6571, loss_yns_1: 0.1542, loss_cls_2: 0.9422, loss_reg_2: 1.5706, loss_cns_2: 0.6625, loss_yns_2: 0.1549, loss_cls_3: 0.9220, loss_reg_3: 1.5549, loss_cns_3: 0.6585, loss_yns_3: 0.1525, loss_cls_4: 0.9262, loss_reg_4: 1.5594, loss_cns_4: 0.6603, loss_yns_4: 0.1543, loss_cls_5: 0.9325, loss_reg_5: 1.5734, loss_cns_5: 0.6622, loss_yns_5: 0.1545, dn_loss_cls_0: 0.2405, dn_loss_reg_0: 0.7914, dn_loss_cls_1: 0.1274, dn_loss_reg_1: 0.7455, dn_loss_cls_2: 0.1280, dn_loss_reg_2: 0.7332, dn_loss_cls_3: 0.1235, dn_loss_reg_3: 0.7275, dn_loss_cls_4: 0.1282, dn_loss_reg_4: 0.7303, dn_loss_cls_5: 0.1365, dn_loss_reg_5: 0.7373, loss_dense_depth: 0.7442, loss: 25.9514, grad_norm: 57.2754
2023-09-18 14:57:34,161 - mmdet - INFO - Iter [306/140640] lr: 2.220e-05, eta: 1 day, 16:41:39, time: 1.023, data_time: 0.048, memory: 10654, loss_cls_0: 0.7829, loss_reg_0: 1.7134, loss_cns_0: 0.6256, loss_yns_0: 0.1556, loss_cls_1: 0.8649, loss_reg_1: 1.5555, loss_cns_1: 0.6629, loss_yns_1: 0.1563, loss_cls_2: 0.8743, loss_reg_2: 1.5383, loss_cns_2: 0.6649, loss_yns_2: 0.1560, loss_cls_3: 0.8602, loss_reg_3: 1.5324, loss_cns_3: 0.6642, loss_yns_3: 0.1568, loss_cls_4: 0.8629, loss_reg_4: 1.5349, loss_cns_4: 0.6650, loss_yns_4: 0.1552, loss_cls_5: 0.8704, loss_reg_5: 1.5383, loss_cns_5: 0.6659, loss_yns_5: 0.1564, dn_loss_cls_0: 0.2215, dn_loss_reg_0: 0.7818, dn_loss_cls_1: 0.1038, dn_loss_reg_1: 0.7296, dn_loss_cls_2: 0.1001, dn_loss_reg_2: 0.7138, dn_loss_cls_3: 0.0984, dn_loss_reg_3: 0.7070, dn_loss_cls_4: 0.1013, dn_loss_reg_4: 0.7087, dn_loss_cls_5: 0.1046, dn_loss_reg_5: 0.7131, loss_dense_depth: 0.7664, loss: 25.2632, grad_norm: 50.4640
2023-09-18 14:58:26,400 - mmdet - INFO - Iter [357/140640] lr: 2.424e-05, eta: 1 day, 16:34:11, time: 1.024, data_time: 0.048, memory: 10654, loss_cls_0: 0.7691, loss_reg_0: 1.6541, loss_cns_0: 0.6322, loss_yns_0: 0.1454, loss_cls_1: 0.8514, loss_reg_1: 1.5097, loss_cns_1: 0.6659, loss_yns_1: 0.1452, loss_cls_2: 0.8538, loss_reg_2: 1.4882, loss_cns_2: 0.6669, loss_yns_2: 0.1446, loss_cls_3: 0.8482, loss_reg_3: 1.4785, loss_cns_3: 0.6664, loss_yns_3: 0.1427, loss_cls_4: 0.8456, loss_reg_4: 1.4766, loss_cns_4: 0.6654, loss_yns_4: 0.1414, loss_cls_5: 0.8534, loss_reg_5: 1.4812, loss_cns_5: 0.6656, loss_yns_5: 0.1421, dn_loss_cls_0: 0.2253, dn_loss_reg_0: 0.7679, dn_loss_cls_1: 0.1033, dn_loss_reg_1: 0.7142, dn_loss_cls_2: 0.0996, dn_loss_reg_2: 0.6998, dn_loss_cls_3: 0.0970, dn_loss_reg_3: 0.6933, dn_loss_cls_4: 0.0976, dn_loss_reg_4: 0.6948, dn_loss_cls_5: 0.1014, dn_loss_reg_5: 0.6993, loss_dense_depth: 0.7110, loss: 24.6379, grad_norm: 47.3615

zhaoyangwei123 commented 7 months ago

> I never saw this learning rate in my warmup phase either. And you got nan during the warmup phase?

The nan only appears in the very first logged iteration; from the second one on it goes back to normal.

zhaoyangwei123 commented 7 months ago

> batch size=2 means two samples at a time, 2*6=12 images.

From the printed config, the earlier R50 run with 8 GPUs had a total batch size of 48, gpus=8, batch size=6 (and even with 4 GPUs, total batch size=24, so batch size would still be 6). Also, the printed config shows samples_per_gpu=2; in mmdetection that should be the per-GPU batch size, right?

linxuewu commented 7 months ago

Each sample consists of images from 6 camera views. Two samples per GPU, 12 images in total.
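So samples_per_gpu counts multi-view samples, not individual images; the arithmetic, spelled out:

samples_per_gpu = 2
num_cams = 6  # surround-view cameras per nuScenes sample
images_per_gpu = samples_per_gpu * num_cams  # 12 images per GPU per step
# The R50 reference run: 6 samples/GPU * 8 GPUs = 48 samples total,
# i.e. 48 * 6 = 288 images per optimizer step.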

zhaoyangwei123 commented 7 months ago

I see, I probably misunderstood at first. Below is a run on four A100s: I set batch size to 2 and lowered the learning rate from 3e-4 to 1.5e-4, but across several runs, also adjusting the learning rate up and down, the grad_norm of the first logged iteration is still nan. Does my config look right to you?

2024-03-06 09:27:50,231 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3: NVIDIA A100-PCIE-40GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.6, V11.6.55
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.13.0+cu116
PyTorch compiling details: PyTorch built with:

TorchVision: 0.14.0+cu116
OpenCV: 4.9.0
MMCV: 1.7.1
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: 11.6
MMDetection: 2.28.2+ea0add8

2024-03-06 09:27:51,385 - mmdet - INFO - Distributed training: True 2024-03-06 09:27:52,441 - mmdet - INFO - Config: plugin = True plugin_dir = 'projects/mmdet3d_plugin/' dist_params = dict(backend='nccl') log_level = 'INFO' work_dir = './work_dirs/sparse4dv3_temporal_r101_1x8_bs6_512x1408' total_batch_size = 8 num_gpus = 4 batch_size = 2 num_iters_per_epoch = 3516 num_epochs = 80 checkpoint_epoch_interval = 5 checkpoint_config = dict(interval=17580) log_config = dict( interval=51, hooks=[ dict(type='TextLoggerHook', by_epoch=False), dict(type='TensorboardLoggerHook') ]) load_from = None resume_from = None workflow = [('train', 1)] fp16 = dict(loss_scale=32.0) input_shape = (1408, 512) tracking_test = True tracking_threshold = 0.2 class_names = [ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ] num_classes = 10 embed_dims = 256 num_groups = 8 num_decoder = 6 num_single_frame_decoder = 1 use_deformable_func = True strides = [4, 8, 16, 32] num_levels = 4 num_depth_layers = 3 drop_out = 0.1 temporal = True decouple_attn = True with_quality_estimation = True model = dict( type='Sparse4D', use_grid_mask=True, use_deformable_func=True, img_backbone=dict( type='ResNet', depth=101, num_stages=4, frozen_stages=-1, norm_eval=False, style='pytorch', with_cp=True, out_indices=(0, 1, 2, 3), norm_cfg=dict(type='BN', requires_grad=True), pretrained= 'ckpt/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth' ), img_neck=dict( type='FPN', num_outs=4, start_level=0, out_channels=256, add_extra_convs='on_output', relu_before_extra_convs=True, in_channels=[256, 512, 1024, 2048]), depth_branch=dict( type='DenseDepthNet', embed_dims=256, num_depth_layers=3, loss_weight=0.2), head=dict( type='Sparse4DHead', cls_threshold_to_reg=0.05, decouple_attn=True, instance_bank=dict( type='InstanceBank', num_anchor=900, embed_dims=256, anchor='_nuscenes_kmeans900.npy', anchor_handler=dict(type='SparseBox3DKeyPointsGenerator'), num_temp_instances=600, confidence_decay=0.6, feat_grad=False), anchor_encoder=dict( type='SparseBox3DEncoder', vel_dims=3, embed_dims=[128, 32, 32, 64], mode='cat', output_fc=False, in_loops=1, out_loops=4), num_single_frame_decoder=1, operation_order=[ 'deformable', 'ffn', 'norm', 'refine', 'temp_gnn', 'gnn', 'norm', 'deformable', 'ffn', 'norm', 'refine', 'temp_gnn', 'gnn', 'norm', 'deformable', 'ffn', 'norm', 'refine', 'temp_gnn', 'gnn', 'norm', 'deformable', 'ffn', 'norm', 'refine', 'temp_gnn', 'gnn', 'norm', 'deformable', 'ffn', 'norm', 'refine', 'temp_gnn', 'gnn', 'norm', 'deformable', 'ffn', 'norm', 'refine' ], temp_graph_model=dict( type='MultiheadAttention', embed_dims=512, num_heads=8, batch_first=True, dropout=0.1), graph_model=dict( type='MultiheadAttention', embed_dims=512, num_heads=8, batch_first=True, dropout=0.1), norm_layer=dict(type='LN', normalized_shape=256), ffn=dict( type='AsymmetricFFN', in_channels=512, pre_norm=dict(type='LN'), embed_dims=256, feedforward_channels=1024, num_fcs=2, ffn_drop=0.1, act_cfg=dict(type='ReLU', inplace=True)), deformable_model=dict( type='DeformableFeatureAggregation', embed_dims=256, num_groups=8, num_levels=4, num_cams=6, attn_drop=0.15, use_deformable_func=True, use_camera_embed=True, residual_mode='cat', kps_generator=dict( type='SparseBox3DKeyPointsGenerator', num_learnable_pts=6, fix_scale=[[0, 0, 0], [0.45, 0, 0], [-0.45, 0, 0], [0, 0.45, 0], [0, -0.45, 0], [0, 0, 0.45], [0, 0, -0.45]])), refine_layer=dict( type='SparseBox3DRefinementModule', 
embed_dims=256, num_cls=10, refine_yaw=True, with_quality_estimation=True), sampler=dict( type='SparseBox3DTarget', num_dn_groups=5, num_temp_dn_groups=3, dn_noise_scale=[2.0, 2.0, 2.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], max_dn_gt=32, add_neg_dn=True, cls_weight=2.0, box_weight=0.25, reg_weights=[2.0, 2.0, 2.0, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0], cls_wise_reg_weights=dict( {9: [2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]})), loss_cls=dict( type='FocalLoss', use_sigmoid=True, gamma=2.0, alpha=0.25, loss_weight=2.0), loss_reg=dict( type='SparseBox3DLoss', loss_box=dict(type='L1Loss', loss_weight=0.25), loss_centerness=dict(type='CrossEntropyLoss', use_sigmoid=True), loss_yawness=dict(type='GaussianFocalLoss'), cls_allow_reverse=[5]), decoder=dict(type='SparseBox3DDecoder'), reg_weights=[2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])) dataset_type = 'NuScenes3DDetTrackDataset' data_root = 'data/nuscenes/' anno_root = 'data/nuscenes_anno_pkls/' file_client_args = dict(backend='disk') img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='LoadPointsFromFile', coord_type='LIDAR', load_dim=5, use_dim=5, file_client_args=dict(backend='disk')), dict(type='ResizeCropFlipImage'), dict(type='MultiScaleDepthMapGenerator', downsample=[4, 8, 16]), dict(type='BBoxRotation'), dict(type='PhotoMetricDistortionMultiViewImage'), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict( type='CircleObjectRangeFilter', class_dist_thred=[55, 55, 55, 55, 55, 55, 55, 55, 55, 55]), dict( type='InstanceNameFilter', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict(type='NuScenesSparse4DAdaptor'), dict( type='Collect', keys=[ 'img', 'timestamp', 'projection_mat', 'image_wh', 'gt_depth', 'focal', 'gt_bboxes_3d', 'gt_labels_3d' ], meta_keys=['T_global', 'T_global_inv', 'timestamp', 'instance_id']) ] test_pipeline = [ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict(type='ResizeCropFlipImage'), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='NuScenesSparse4DAdaptor'), dict( type='Collect', keys=['img', 'timestamp', 'projection_mat', 'image_wh'], meta_keys=['T_global', 'T_global_inv', 'timestamp']) ] input_modality = dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=False) data_basic_config = dict( type='NuScenes3DDetTrackDataset', data_root='data/nuscenes/', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=False), version='v1.0-trainval') data_aug_conf = dict( resize_lim=(0.4, 0.47), final_dim=(512, 1408), bot_pct_lim=(0.0, 0.0), rot_lim=(-5.4, 5.4), H=900, W=1600, rand_flip=True, rot3d_range=[-0.3925, 0.3925]) data = dict( samples_per_gpu=2, workers_per_gpu=2, train=dict( type='NuScenes3DDetTrackDataset', data_root='data/nuscenes/', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=False), version='v1.0-trainval', 
ann_file='data/nuscenes_anno_pkls/nuscenes_infos_train.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='LoadPointsFromFile', coord_type='LIDAR', load_dim=5, use_dim=5, file_client_args=dict(backend='disk')), dict(type='ResizeCropFlipImage'), dict(type='MultiScaleDepthMapGenerator', downsample=[4, 8, 16]), dict(type='BBoxRotation'), dict(type='PhotoMetricDistortionMultiViewImage'), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict( type='CircleObjectRangeFilter', class_dist_thred=[55, 55, 55, 55, 55, 55, 55, 55, 55, 55]), dict( type='InstanceNameFilter', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ]), dict(type='NuScenesSparse4DAdaptor'), dict( type='Collect', keys=[ 'img', 'timestamp', 'projection_mat', 'image_wh', 'gt_depth', 'focal', 'gt_bboxes_3d', 'gt_labels_3d' ], meta_keys=[ 'T_global', 'T_global_inv', 'timestamp', 'instance_id' ]) ], test_mode=False, data_aug_conf=dict( resize_lim=(0.4, 0.47), final_dim=(512, 1408), bot_pct_lim=(0.0, 0.0), rot_lim=(-5.4, 5.4), H=900, W=1600, rand_flip=True, rot3d_range=[-0.3925, 0.3925]), with_seq_flag=True, sequences_split_num=2, keep_consistent_seq_aug=True), val=dict( type='NuScenes3DDetTrackDataset', data_root='data/nuscenes/', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=False), version='v1.0-trainval', ann_file='data/nuscenes_anno_pkls/nuscenes_infos_val.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict(type='ResizeCropFlipImage'), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='NuScenesSparse4DAdaptor'), dict( type='Collect', keys=['img', 'timestamp', 'projection_mat', 'image_wh'], meta_keys=['T_global', 'T_global_inv', 'timestamp']) ], data_aug_conf=dict( resize_lim=(0.4, 0.47), final_dim=(512, 1408), bot_pct_lim=(0.0, 0.0), rot_lim=(-5.4, 5.4), H=900, W=1600, rand_flip=True, rot3d_range=[-0.3925, 0.3925]), test_mode=True, tracking=True, tracking_threshold=0.2), test=dict( type='NuScenes3DDetTrackDataset', data_root='data/nuscenes/', classes=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], modality=dict( use_lidar=False, use_camera=True, use_radar=False, use_map=False, use_external=False), version='v1.0-trainval', ann_file='data/nuscenes_anno_pkls/nuscenes_infos_val.pkl', pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict(type='ResizeCropFlipImage'), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='NuScenesSparse4DAdaptor'), dict( type='Collect', keys=['img', 'timestamp', 'projection_mat', 'image_wh'], meta_keys=['T_global', 'T_global_inv', 'timestamp']) ], data_aug_conf=dict( resize_lim=(0.4, 0.47), final_dim=(512, 1408), bot_pct_lim=(0.0, 0.0), rot_lim=(-5.4, 5.4), H=900, W=1600, rand_flip=True, rot3d_range=[-0.3925, 0.3925]), test_mode=True, tracking=True, tracking_threshold=0.2)) optimizer = dict( type='AdamW', lr=0.00015, weight_decay=0.001, paramwise_cfg=dict(custom_keys=dict(img_backbone=dict(lr_mult=0.1)))) optimizer_config = dict(grad_clip=dict(max_norm=25, norm_type=2)) 
lr_config = dict( policy='CosineAnnealing', warmup='linear', warmup_iters=500, warmup_ratio=0.3333333333333333, min_lr_ratio=0.001) runner = dict(type='IterBasedRunner', max_iters=281280) vis_pipeline = [ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict( type='DefaultFormatBundle3D', class_names=[ 'car', 'truck', 'construction_vehicle', 'bus', 'trailer', 'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone' ], with_label=False), dict(type='Collect3D', keys=['img'], meta_keys=['timestamp', 'lidar2img']) ] evaluation = dict( interval=17580, pipeline=[ dict(type='LoadMultiViewImageFromFiles', to_float32=True), dict(type='ResizeCropFlipImage'), dict( type='NormalizeMultiviewImage', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='NuScenesSparse4DAdaptor'), dict( type='Collect', keys=['img', 'timestamp', 'projection_mat', 'image_wh'], meta_keys=['T_global', 'T_global_inv', 'timestamp']) ]) gpu_ids = range(0, 4)
2024-03-06 09:28:14,589 - mmdet - INFO - workflow: [('train', 1)], max: 281280 iters
2024-03-06 09:28:14,592 - mmdet - INFO - Checkpoints will be saved to /root/wzy/Sparse4D/work_dirs/sparse4dv3_temporal_r101_1x8_bs6_512x1408 by HardDiskBackend.
2024-03-06 09:29:12,199 - mmdet - INFO - Iter [51/281280] lr: 6.000e-06, eta: 3 days, 15:55:40, time: 1.126, data_time: 0.073, memory: 11019, loss_cls_0: 1.9477, loss_box_0: 0.6963, loss_cns_0: 0.1432, loss_yns_0: 0.0461, loss_cls_1: 1.9088, loss_box_1: 0.8084, loss_cns_1: 0.1057, loss_yns_1: 0.0448, loss_cls_2: 1.9613, loss_box_2: 0.4847, loss_cns_2: 0.0652, loss_yns_2: 0.0294, loss_cls_3: 1.8312, loss_box_3: 1.4169, loss_cns_3: 0.2174, loss_yns_3: 0.0750, loss_cls_4: 1.7034, loss_box_4: 1.7884, loss_cns_4: 0.2312, loss_yns_4: 0.0905, loss_cls_5: 1.8227, loss_box_5: 1.2277, loss_cns_5: 0.1802, loss_yns_5: 0.0637, loss_cls_dn_0: 0.9695, loss_box_dn_0: 1.2779, loss_cls_dn_1: 0.8814, loss_box_dn_1: 2.6075, loss_cls_dn_2: 0.9295, loss_box_dn_2: 2.8197, loss_cls_dn_3: 0.8671, loss_box_dn_3: 2.7295, loss_cls_dn_4: 0.8111, loss_box_dn_4: 2.7771, loss_cls_dn_5: 0.8038, loss_box_dn_5: 2.8359, loss_dense_depth: 2.4449, loss: 41.6449, grad_norm: nan
2024-03-06 09:30:01,307 - mmdet - INFO - Iter [102/281280] lr: 7.020e-06, eta: 3 days, 9:33:33, time: 0.963, data_time: 0.037, memory: 11019, loss_cls_0: 1.2824, loss_box_0: 2.3833, loss_cns_0: 0.5637, loss_yns_0: 0.1705, loss_cls_1: 1.3126, loss_box_1: 2.8148, loss_cns_1: 0.5138, loss_yns_1: 0.1828, loss_cls_2: 1.3180, loss_box_2: 2.8459, loss_cns_2: 0.5212, loss_yns_2: 0.1890, loss_cls_3: 1.2727, loss_box_3: 2.8399, loss_cns_3: 0.5274, loss_yns_3: 0.1833, loss_cls_4: 1.2703, loss_box_4: 2.8295, loss_cns_4: 0.5207, loss_yns_4: 0.1862, loss_cls_5: 1.2690, loss_box_5: 2.8465, loss_cns_5: 0.5147, loss_yns_5: 0.1781, loss_cls_dn_0: 0.5169, loss_box_dn_0: 1.1732, loss_cls_dn_1: 0.4894, loss_box_dn_1: 1.5638, loss_cls_dn_2: 0.4950, loss_box_dn_2: 1.5234, loss_cls_dn_3: 0.4552, loss_box_dn_3: 1.4847, loss_cls_dn_4: 0.4575, loss_box_dn_4: 1.4932, loss_cls_dn_5: 0.4459, loss_box_dn_5: 1.5125, loss_dense_depth: 2.3001, loss: 42.4474, grad_norm: nan
2024-03-06 09:30:50,512 - mmdet - INFO - Iter [153/281280] lr: 8.040e-06, eta: 3 days, 7:28:29, time: 0.965, data_time: 0.036, memory: 11019, loss_cls_0: 1.2446, loss_box_0: 2.1150, loss_cns_0: 0.6217, loss_yns_0: 0.1737, loss_cls_1: 1.2422, loss_box_1: 2.5659, loss_cns_1: 0.5519, loss_yns_1: 0.1723, loss_cls_2: 1.2456, loss_box_2: 2.5861, loss_cns_2: 0.5484, loss_yns_2: 0.1756, loss_cls_3: 1.2417, loss_box_3: 2.6061, loss_cns_3: 0.5441, loss_yns_3: 0.1732, loss_cls_4: 1.2503, loss_box_4: 2.6293, loss_cns_4: 0.5376, loss_yns_4: 0.1727, loss_cls_5: 1.2442, loss_box_5: 2.6548, loss_cns_5: 0.5314, loss_yns_5: 0.1698, loss_cls_dn_0: 0.4821, loss_box_dn_0: 0.9996, loss_cls_dn_1: 0.4204, loss_box_dn_1: 1.1746, loss_cls_dn_2: 0.4258, loss_box_dn_2: 1.1817, loss_cls_dn_3: 0.4223, loss_box_dn_3: 1.2068, loss_cls_dn_4: 0.4279, loss_box_dn_4: 1.2307, loss_cls_dn_5: 0.4250, loss_box_dn_5: 1.2557, loss_dense_depth: 2.3028, loss: 38.9538, grad_norm: 123.5304
2024-03-06 09:31:39,267 - mmdet - INFO - Iter [204/281280] lr: 9.060e-06, eta: 3 days, 6:15:08, time: 0.956, data_time: 0.034, memory: 11019, loss_cls_0: 1.2294, loss_box_0: 2.0897, loss_cns_0: 0.6170, loss_yns_0: 0.1730, loss_cls_1: 1.2328, loss_box_1: 2.5399, loss_cns_1: 0.5596, loss_yns_1: 0.1752, loss_cls_2: 1.2390, loss_box_2: 2.5562, loss_cns_2: 0.5598, loss_yns_2: 0.1783, loss_cls_3: 1.2399, loss_box_3: 2.5670, loss_cns_3: 0.5577, loss_yns_3: 0.1791, loss_cls_4: 1.2423, loss_box_4: 2.5898, loss_cns_4: 0.5548, loss_yns_4: 0.1763, loss_cls_5: 1.2318, loss_box_5: 2.6148, loss_cns_5: 0.5512, loss_yns_5: 0.1732, loss_cls_dn_0: 0.4774, loss_box_dn_0: 0.9729, loss_cls_dn_1: 0.4056, loss_box_dn_1: 1.0719, loss_cls_dn_2: 0.4201, loss_box_dn_2: 1.0757, loss_cls_dn_3: 0.4202, loss_box_dn_3: 1.0935, loss_cls_dn_4: 0.4271, loss_box_dn_4: 1.1190, loss_cls_dn_5: 0.4245, loss_box_dn_5: 1.1483, loss_dense_depth: 2.2174, loss: 38.1015, grad_norm: 78.5033
2024-03-06 09:32:29,139 - mmdet - INFO - Iter [255/281280] lr: 1.008e-05, eta: 3 days, 5:51:29, time: 0.978, data_time: 0.035, memory: 11019, loss_cls_0: 1.2444, loss_box_0: 2.1038, loss_cns_0: 0.6133, loss_yns_0: 0.1698, loss_cls_1: 1.2374, loss_box_1: 2.6724, loss_cns_1: 0.5413, loss_yns_1: 0.1711, loss_cls_2: 1.2487, loss_box_2: 2.6667, loss_cns_2: 0.5445, loss_yns_2: 0.1676, loss_cls_3: 1.2530, loss_box_3: 2.6739, loss_cns_3: 0.5437, loss_yns_3: 0.1700, loss_cls_4: 1.2460, loss_box_4: 2.6840, loss_cns_4: 0.5417, loss_yns_4: 0.1687, loss_cls_5: 1.2473, loss_box_5: 2.7009, loss_cns_5: 0.5397, loss_yns_5: 0.1679, loss_cls_dn_0: 0.4699, loss_box_dn_0: 0.9581, loss_cls_dn_1: 0.3549, loss_box_dn_1: 0.9856, loss_cls_dn_2: 0.3823, loss_box_dn_2: 0.9797, loss_cls_dn_3: 0.3658, loss_box_dn_3: 0.9887, loss_cls_dn_4: 0.3883, loss_box_dn_4: 0.9997, loss_cls_dn_5: 0.3929, loss_box_dn_5: 1.0187, loss_dense_depth: 2.1278, loss: 37.7303, grad_norm: 64.8972
2024-03-06 09:33:34,374 - mmdet - INFO - Iter [306/281280] lr: 1.110e-05, eta: 3 days, 9:30:27, time: 1.279, data_time: 0.053, memory: 11019, loss_cls_0: 1.2185, loss_box_0: 2.1523, loss_cns_0: 0.6030, loss_yns_0: 0.1698, loss_cls_1: 1.2063, loss_box_1: 2.7946, loss_cns_1: 0.5498, loss_yns_1: 0.1694, loss_cls_2: 1.2201, loss_box_2: 2.8015, loss_cns_2: 0.5498, loss_yns_2: 0.1697, loss_cls_3: 1.2310, loss_box_3: 2.8041, loss_cns_3: 0.5502, loss_yns_3: 0.1705, loss_cls_4: 1.2300, loss_box_4: 2.8167, loss_cns_4: 0.5494, loss_yns_4: 0.1698, loss_cls_5: 1.2328, loss_box_5: 2.8285, loss_cns_5: 0.5470, loss_yns_5: 0.1699, loss_cls_dn_0: 0.3938, loss_box_dn_0: 0.9330, loss_cls_dn_1: 0.2560, loss_box_dn_1: 0.9466, loss_cls_dn_2: 0.2624, loss_box_dn_2: 0.9411, loss_cls_dn_3: 0.2499, loss_box_dn_3: 0.9452, loss_cls_dn_4: 0.2751, loss_box_dn_4: 0.9532, loss_cls_dn_5: 0.2607, loss_box_dn_5: 0.9652, loss_dense_depth: 2.0162, loss: 37.3032, grad_norm: 69.9023

linxuewu commented 7 months ago

> pretrained='ckpt/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth'

Did this pretrained checkpoint load successfully?
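One quick way to check is to compare the checkpoint's key prefixes against the model's module names; a minimal diagnostic sketch (path taken from the config above):

import torch

ckpt = torch.load(
    'ckpt/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth',
    map_location='cpu')
state = ckpt.get('state_dict', ckpt)
# Sparse4D registers the backbone as 'img_backbone', so matching keys must
# start with 'img_backbone.' (or, for a backbone-only init, have no prefix).
print(sum(k.startswith('img_backbone.') for k in state))  # 0 for this checkpoint
print(sum(k.startswith('backbone.') for k in state))      # the nuImages detector keys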

linxuewu commented 7 months ago

cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth carries a prefix on its keys, so it probably can't be loaded this way.
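A minimal, hypothetical conversion along those lines: rename the detector prefixes offline, then point the config at the converted file (the output filename below is made up):

import torch

src = 'ckpt/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth'
dst = 'ckpt/nuim_r101_img_backbone.pth'  # hypothetical output path

state = torch.load(src, map_location='cpu')['state_dict']
converted = {}
for k, v in state.items():
    # 'backbone.conv1.weight' -> 'img_backbone.conv1.weight', etc., so that a
    # whole-model load (load_from) matches Sparse4D's module names; the RPN/ROI
    # head weights, which have no counterpart here, are simply not copied.
    if k.startswith('backbone.'):
        converted['img_backbone.' + k[len('backbone.'):]] = v
    elif k.startswith('neck.'):
        converted['img_neck.' + k[len('neck.'):]] = v
torch.save({'state_dict': converted}, dst)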

zhaoyangwei123 commented 7 months ago

Thanks a lot! It seems it indeed had not loaded successfully before. After switching to load_from, the grad_norm of the first iterations is fine now. I'll wait for the run to finish and see whether the reported performance can be reproduced.

zhaoyangwei123 commented 7 months ago

Hi, after loading the weights with load_from yesterday, the first training iterations no longer showed grad_norm=nan, but at evaluation time the mAP was still only a few points (as below). I went back through the log file: although the weights did load, the following appeared during loading:

2024-03-06 18:54:43,125 - mmdet - INFO - load checkpoint from local path: ckpt/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth
2024-03-06 18:54:43,538 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.conv1.weight, backbone.bn1.weight, backbone.bn1.bias, backbone.bn1.running_mean, backbone.bn1.running_var, backbone.bn1.num_batches_tracked, backbone.layer1.0.conv1.weight, backbone.layer1.0.bn1.weight, backbone.layer1.0.bn1.bias, backbone.layer1.0.bn1.running_mean, backbone.layer1.0.bn1.running_var, backbone.layer1.0.bn1.num_batches_tracked, backbone.layer1.0.conv2.weight, backbone.layer1.0.bn2.weight, backbone.layer1.0.bn2.bias, backbone.layer1.0.bn2.running_mean, backbone.layer1.0.bn2.running_var, backbone.layer1.0.bn2.num_batches_tracked, backbone.layer1.0.conv3.weight, backbone.layer1.0.bn3.weight, backbone.layer1.0.bn3.bias, backbone.layer1.0.bn3.running_mean, backbone.layer1.0.bn3.running_var, backbone.layer1.0.bn3.num_batches_tracked, backbone.layer1.0.downsample.0.weight, backbone.layer1.0.downsample.1.weight, backbone.layer1.0.downsample.1.bias, backbone.layer1.0.downsample.1.running_mean, backbone.layer1.0.downsample.1.running_var, backbone.layer1.0.downsample.1.num_batches_tracked, backbone.layer1.1.conv1.weight, backbone.layer1.1.bn1.weight, backbone.layer1.1.bn1.bias, backbone.layer1.1.bn1.running_mean, backbone.layer1.1.bn1.running_var, backbone.layer1.1.bn1.num_batches_tracked, backbone.layer1.1.conv2.weight, backbone.layer1.1.bn2.weight, backbone.layer1.1.bn2.bias, backbone.layer1.1.bn2.running_mean, backbone.layer1.1.bn2.running_var, backbone.layer1.1.bn2.num_batches_tracked, backbone.layer1.1.conv3.weight, backbone.layer1.1.bn3.weight, backbone.layer1.1.bn3.bias, backbone.layer1.1.bn3.running_mean, backbone.layer1.1.bn3.running_var, backbone.layer1.1.bn3.num_batches_tracked, backbone.layer1.2.conv1.weight, backbone.layer1.2.bn1.weight, backbone.layer1.2.bn1.bias, backbone.layer1.2.bn1.running_mean, backbone.layer1.2.bn1.running_var, backbone.layer1.2.bn1.num_batches_tracked, backbone.layer1.2.conv2.weight, backbone.layer1.2.bn2.weight, backbone.layer1.2.bn2.bias, backbone.layer1.2.bn2.running_mean, backbone.layer1.2.bn2.running_var, backbone.layer1.2.bn2.num_batches_tracked, backbone.layer1.2.conv3.weight, backbone.layer1.2.bn3.weight, backbone.layer1.2.bn3.bias, backbone.layer1.2.bn3.running_mean, backbone.layer1.2.bn3.running_var, backbone.layer1.2.bn3.num_batches_tracked, backbone.layer2.0.conv1.weight, backbone.layer2.0.bn1.weight, backbone.layer2.0.bn1.bias, backbone.layer2.0.bn1.running_mean, backbone.layer2.0.bn1.running_var, backbone.layer2.0.bn1.num_batches_tracked, backbone.layer2.0.conv2.weight, backbone.layer2.0.bn2.weight, backbone.layer2.0.bn2.bias, backbone.layer2.0.bn2.running_mean, backbone.layer2.0.bn2.running_var, backbone.layer2.0.bn2.num_batches_tracked, backbone.layer2.0.conv3.weight, backbone.layer2.0.bn3.weight, backbone.layer2.0.bn3.bias, backbone.layer2.0.bn3.running_mean, backbone.layer2.0.bn3.running_var, backbone.layer2.0.bn3.num_batches_tracked, backbone.layer2.0.downsample.0.weight, backbone.layer2.0.downsample.1.weight, backbone.layer2.0.downsample.1.bias, backbone.layer2.0.downsample.1.running_mean, backbone.layer2.0.downsample.1.running_var, backbone.layer2.0.downsample.1.num_batches_tracked, backbone.layer2.1.conv1.weight, backbone.layer2.1.bn1.weight, backbone.layer2.1.bn1.bias, backbone.layer2.1.bn1.running_mean, backbone.layer2.1.bn1.running_var, backbone.layer2.1.bn1.num_batches_tracked, backbone.layer2.1.conv2.weight, backbone.layer2.1.bn2.weight, backbone.layer2.1.bn2.bias, backbone.layer2.1.bn2.running_mean, backbone.layer2.1.bn2.running_var, backbone.layer2.1.bn2.num_batches_tracked, 
backbone.layer2.1.conv3.weight, backbone.layer2.1.bn3.weight, backbone.layer2.1.bn3.bias, backbone.layer2.1.bn3.running_mean, backbone.layer2.1.bn3.running_var, backbone.layer2.1.bn3.num_batches_tracked, backbone.layer2.2.conv1.weight, backbone.layer2.2.bn1.weight, backbone.layer2.2.bn1.bias, backbone.layer2.2.bn1.running_mean, backbone.layer2.2.bn1.running_var, backbone.layer2.2.bn1.num_batches_tracked, backbone.layer2.2.conv2.weight, backbone.layer2.2.bn2.weight, backbone.layer2.2.bn2.bias, backbone.layer2.2.bn2.running_mean, backbone.layer2.2.bn2.running_var, backbone.layer2.2.bn2.num_batches_tracked, backbone.layer2.2.conv3.weight, backbone.layer2.2.bn3.weight, backbone.layer2.2.bn3.bias, backbone.layer2.2.bn3.running_mean, backbone.layer2.2.bn3.running_var, backbone.layer2.2.bn3.num_batches_tracked, backbone.layer2.3.conv1.weight, backbone.layer2.3.bn1.weight, backbone.layer2.3.bn1.bias, backbone.layer2.3.bn1.running_mean, backbone.layer2.3.bn1.running_var, backbone.layer2.3.bn1.num_batches_tracked, backbone.layer2.3.conv2.weight, backbone.layer2.3.bn2.weight, backbone.layer2.3.bn2.bias, backbone.layer2.3.bn2.running_mean, backbone.layer2.3.bn2.running_var, backbone.layer2.3.bn2.num_batches_tracked, backbone.layer2.3.conv3.weight, backbone.layer2.3.bn3.weight, backbone.layer2.3.bn3.bias, backbone.layer2.3.bn3.running_mean, backbone.layer2.3.bn3.running_var, backbone.layer2.3.bn3.num_batches_tracked, backbone.layer3.0.conv1.weight, backbone.layer3.0.bn1.weight, backbone.layer3.0.bn1.bias, backbone.layer3.0.bn1.running_mean, backbone.layer3.0.bn1.running_var, backbone.layer3.0.bn1.num_batches_tracked, backbone.layer3.0.conv2.weight, backbone.layer3.0.bn2.weight, backbone.layer3.0.bn2.bias, backbone.layer3.0.bn2.running_mean, backbone.layer3.0.bn2.running_var, backbone.layer3.0.bn2.num_batches_tracked, backbone.layer3.0.conv3.weight, backbone.layer3.0.bn3.weight, backbone.layer3.0.bn3.bias, backbone.layer3.0.bn3.running_mean, backbone.layer3.0.bn3.running_var, backbone.layer3.0.bn3.num_batches_tracked, backbone.layer3.0.downsample.0.weight, backbone.layer3.0.downsample.1.weight, backbone.layer3.0.downsample.1.bias, backbone.layer3.0.downsample.1.running_mean, backbone.layer3.0.downsample.1.running_var, backbone.layer3.0.downsample.1.num_batches_tracked, backbone.layer3.1.conv1.weight, backbone.layer3.1.bn1.weight, backbone.layer3.1.bn1.bias, backbone.layer3.1.bn1.running_mean, backbone.layer3.1.bn1.running_var, backbone.layer3.1.bn1.num_batches_tracked, backbone.layer3.1.conv2.weight, backbone.layer3.1.bn2.weight, backbone.layer3.1.bn2.bias, backbone.layer3.1.bn2.running_mean, backbone.layer3.1.bn2.running_var, backbone.layer3.1.bn2.num_batches_tracked, backbone.layer3.1.conv3.weight, backbone.layer3.1.bn3.weight, backbone.layer3.1.bn3.bias, backbone.layer3.1.bn3.running_mean, backbone.layer3.1.bn3.running_var, backbone.layer3.1.bn3.num_batches_tracked, backbone.layer3.2.conv1.weight, backbone.layer3.2.bn1.weight, backbone.layer3.2.bn1.bias, backbone.layer3.2.bn1.running_mean, backbone.layer3.2.bn1.running_var, backbone.layer3.2.bn1.num_batches_tracked, backbone.layer3.2.conv2.weight, backbone.layer3.2.bn2.weight, backbone.layer3.2.bn2.bias, backbone.layer3.2.bn2.running_mean, backbone.layer3.2.bn2.running_var, backbone.layer3.2.bn2.num_batches_tracked, backbone.layer3.2.conv3.weight, backbone.layer3.2.bn3.weight, backbone.layer3.2.bn3.bias, backbone.layer3.2.bn3.running_mean, backbone.layer3.2.bn3.running_var, backbone.layer3.2.bn3.num_batches_tracked, backbone.layer3.3.conv1.weight, 
backbone.layer3.3.bn1.weight, backbone.layer3.3.bn1.bias, backbone.layer3.3.bn1.running_mean, backbone.layer3.3.bn1.running_var, backbone.layer3.3.bn1.num_batches_tracked, backbone.layer3.3.conv2.weight, backbone.layer3.3.bn2.weight, backbone.layer3.3.bn2.bias, backbone.layer3.3.bn2.running_mean, backbone.layer3.3.bn2.running_var, backbone.layer3.3.bn2.num_batches_tracked, backbone.layer3.3.conv3.weight, backbone.layer3.3.bn3.weight, backbone.layer3.3.bn3.bias, backbone.layer3.3.bn3.running_mean, backbone.layer3.3.bn3.running_var, backbone.layer3.3.bn3.num_batches_tracked, backbone.layer3.4.conv1.weight, backbone.layer3.4.bn1.weight, backbone.layer3.4.bn1.bias, backbone.layer3.4.bn1.running_mean, backbone.layer3.4.bn1.running_var, backbone.layer3.4.bn1.num_batches_tracked, backbone.layer3.4.conv2.weight, backbone.layer3.4.bn2.weight, backbone.layer3.4.bn2.bias, backbone.layer3.4.bn2.running_mean, backbone.layer3.4.bn2.running_var, backbone.layer3.4.bn2.num_batches_tracked, backbone.layer3.4.conv3.weight, backbone.layer3.4.bn3.weight, backbone.layer3.4.bn3.bias, backbone.layer3.4.bn3.running_mean, backbone.layer3.4.bn3.running_var, backbone.layer3.4.bn3.num_batches_tracked, backbone.layer3.5.conv1.weight, backbone.layer3.5.bn1.weight, backbone.layer3.5.bn1.bias, backbone.layer3.5.bn1.running_mean, backbone.layer3.5.bn1.running_var, backbone.layer3.5.bn1.num_batches_tracked, backbone.layer3.5.conv2.weight, backbone.layer3.5.bn2.weight, backbone.layer3.5.bn2.bias, backbone.layer3.5.bn2.running_mean, backbone.layer3.5.bn2.running_var, backbone.layer3.5.bn2.num_batches_tracked, backbone.layer3.5.conv3.weight, backbone.layer3.5.bn3.weight, backbone.layer3.5.bn3.bias, backbone.layer3.5.bn3.running_mean, backbone.layer3.5.bn3.running_var, backbone.layer3.5.bn3.num_batches_tracked, backbone.layer3.6.conv1.weight, backbone.layer3.6.bn1.weight, backbone.layer3.6.bn1.bias, backbone.layer3.6.bn1.running_mean, backbone.layer3.6.bn1.running_var, backbone.layer3.6.bn1.num_batches_tracked, backbone.layer3.6.conv2.weight, backbone.layer3.6.bn2.weight, backbone.layer3.6.bn2.bias, backbone.layer3.6.bn2.running_mean, backbone.layer3.6.bn2.running_var, backbone.layer3.6.bn2.num_batches_tracked, backbone.layer3.6.conv3.weight, backbone.layer3.6.bn3.weight, backbone.layer3.6.bn3.bias, backbone.layer3.6.bn3.running_mean, backbone.layer3.6.bn3.running_var, backbone.layer3.6.bn3.num_batches_tracked, backbone.layer3.7.conv1.weight, backbone.layer3.7.bn1.weight, backbone.layer3.7.bn1.bias, backbone.layer3.7.bn1.running_mean, backbone.layer3.7.bn1.running_var, backbone.layer3.7.bn1.num_batches_tracked, backbone.layer3.7.conv2.weight, backbone.layer3.7.bn2.weight, backbone.layer3.7.bn2.bias, backbone.layer3.7.bn2.running_mean, backbone.layer3.7.bn2.running_var, backbone.layer3.7.bn2.num_batches_tracked, backbone.layer3.7.conv3.weight, backbone.layer3.7.bn3.weight, backbone.layer3.7.bn3.bias, backbone.layer3.7.bn3.running_mean, backbone.layer3.7.bn3.running_var, backbone.layer3.7.bn3.num_batches_tracked, backbone.layer3.8.conv1.weight, backbone.layer3.8.bn1.weight, backbone.layer3.8.bn1.bias, backbone.layer3.8.bn1.running_mean, backbone.layer3.8.bn1.running_var, backbone.layer3.8.bn1.num_batches_tracked, backbone.layer3.8.conv2.weight, backbone.layer3.8.bn2.weight, backbone.layer3.8.bn2.bias, backbone.layer3.8.bn2.running_mean, backbone.layer3.8.bn2.running_var, backbone.layer3.8.bn2.num_batches_tracked, backbone.layer3.8.conv3.weight, backbone.layer3.8.bn3.weight, backbone.layer3.8.bn3.bias, 
backbone.layer3.8.bn3.running_mean, backbone.layer3.8.bn3.running_var, backbone.layer3.8.bn3.num_batches_tracked, backbone.layer3.9.conv1.weight, backbone.layer3.9.bn1.weight, backbone.layer3.9.bn1.bias, backbone.layer3.9.bn1.running_mean, backbone.layer3.9.bn1.running_var, backbone.layer3.9.bn1.num_batches_tracked, backbone.layer3.9.conv2.weight, backbone.layer3.9.bn2.weight, backbone.layer3.9.bn2.bias, backbone.layer3.9.bn2.running_mean, backbone.layer3.9.bn2.running_var, backbone.layer3.9.bn2.num_batches_tracked, backbone.layer3.9.conv3.weight, backbone.layer3.9.bn3.weight, backbone.layer3.9.bn3.bias, backbone.layer3.9.bn3.running_mean, backbone.layer3.9.bn3.running_var, backbone.layer3.9.bn3.num_batches_tracked, backbone.layer3.10.conv1.weight, backbone.layer3.10.bn1.weight, backbone.layer3.10.bn1.bias, backbone.layer3.10.bn1.running_mean, backbone.layer3.10.bn1.running_var, backbone.layer3.10.bn1.num_batches_tracked, backbone.layer3.10.conv2.weight, backbone.layer3.10.bn2.weight, backbone.layer3.10.bn2.bias, backbone.layer3.10.bn2.running_mean, backbone.layer3.10.bn2.running_var, backbone.layer3.10.bn2.num_batches_tracked, backbone.layer3.10.conv3.weight, backbone.layer3.10.bn3.weight, backbone.layer3.10.bn3.bias, backbone.layer3.10.bn3.running_mean, backbone.layer3.10.bn3.running_var, backbone.layer3.10.bn3.num_batches_tracked, backbone.layer3.11.conv1.weight, backbone.layer3.11.bn1.weight, backbone.layer3.11.bn1.bias, backbone.layer3.11.bn1.running_mean, backbone.layer3.11.bn1.running_var, backbone.layer3.11.bn1.num_batches_tracked, backbone.layer3.11.conv2.weight, backbone.layer3.11.bn2.weight, backbone.layer3.11.bn2.bias, backbone.layer3.11.bn2.running_mean, backbone.layer3.11.bn2.running_var, backbone.layer3.11.bn2.num_batches_tracked, backbone.layer3.11.conv3.weight, backbone.layer3.11.bn3.weight, backbone.layer3.11.bn3.bias, backbone.layer3.11.bn3.running_mean, backbone.layer3.11.bn3.running_var, backbone.layer3.11.bn3.num_batches_tracked, backbone.layer3.12.conv1.weight, backbone.layer3.12.bn1.weight, backbone.layer3.12.bn1.bias, backbone.layer3.12.bn1.running_mean, backbone.layer3.12.bn1.running_var, backbone.layer3.12.bn1.num_batches_tracked, backbone.layer3.12.conv2.weight, backbone.layer3.12.bn2.weight, backbone.layer3.12.bn2.bias, backbone.layer3.12.bn2.running_mean, backbone.layer3.12.bn2.running_var, backbone.layer3.12.bn2.num_batches_tracked, backbone.layer3.12.conv3.weight, backbone.layer3.12.bn3.weight, backbone.layer3.12.bn3.bias, backbone.layer3.12.bn3.running_mean, backbone.layer3.12.bn3.running_var, backbone.layer3.12.bn3.num_batches_tracked, backbone.layer3.13.conv1.weight, backbone.layer3.13.bn1.weight, backbone.layer3.13.bn1.bias, backbone.layer3.13.bn1.running_mean, backbone.layer3.13.bn1.running_var, backbone.layer3.13.bn1.num_batches_tracked, backbone.layer3.13.conv2.weight, backbone.layer3.13.bn2.weight, backbone.layer3.13.bn2.bias, backbone.layer3.13.bn2.running_mean, backbone.layer3.13.bn2.running_var, backbone.layer3.13.bn2.num_batches_tracked, backbone.layer3.13.conv3.weight, backbone.layer3.13.bn3.weight, backbone.layer3.13.bn3.bias, backbone.layer3.13.bn3.running_mean, backbone.layer3.13.bn3.running_var, backbone.layer3.13.bn3.num_batches_tracked, backbone.layer3.14.conv1.weight, backbone.layer3.14.bn1.weight, backbone.layer3.14.bn1.bias, backbone.layer3.14.bn1.running_mean, backbone.layer3.14.bn1.running_var, backbone.layer3.14.bn1.num_batches_tracked, backbone.layer3.14.conv2.weight, backbone.layer3.14.bn2.weight, backbone.layer3.14.bn2.bias, 
backbone.layer3.14.bn2.running_mean, backbone.layer3.14.bn2.running_var, backbone.layer3.14.bn2.num_batches_tracked, backbone.layer3.14.conv3.weight, backbone.layer3.14.bn3.weight, backbone.layer3.14.bn3.bias, backbone.layer3.14.bn3.running_mean, backbone.layer3.14.bn3.running_var, backbone.layer3.14.bn3.num_batches_tracked, backbone.layer3.15.conv1.weight, backbone.layer3.15.bn1.weight, backbone.layer3.15.bn1.bias, backbone.layer3.15.bn1.running_mean, backbone.layer3.15.bn1.running_var, backbone.layer3.15.bn1.num_batches_tracked, backbone.layer3.15.conv2.weight, backbone.layer3.15.bn2.weight, backbone.layer3.15.bn2.bias, backbone.layer3.15.bn2.running_mean, backbone.layer3.15.bn2.running_var, backbone.layer3.15.bn2.num_batches_tracked, backbone.layer3.15.conv3.weight, backbone.layer3.15.bn3.weight, backbone.layer3.15.bn3.bias, backbone.layer3.15.bn3.running_mean, backbone.layer3.15.bn3.running_var, backbone.layer3.15.bn3.num_batches_tracked, backbone.layer3.16.conv1.weight, backbone.layer3.16.bn1.weight, backbone.layer3.16.bn1.bias, backbone.layer3.16.bn1.running_mean, backbone.layer3.16.bn1.running_var, backbone.layer3.16.bn1.num_batches_tracked, backbone.layer3.16.conv2.weight, backbone.layer3.16.bn2.weight, backbone.layer3.16.bn2.bias, backbone.layer3.16.bn2.running_mean, backbone.layer3.16.bn2.running_var, backbone.layer3.16.bn2.num_batches_tracked, backbone.layer3.16.conv3.weight, backbone.layer3.16.bn3.weight, backbone.layer3.16.bn3.bias, backbone.layer3.16.bn3.running_mean, backbone.layer3.16.bn3.running_var, backbone.layer3.16.bn3.num_batches_tracked, backbone.layer3.17.conv1.weight, backbone.layer3.17.bn1.weight, backbone.layer3.17.bn1.bias, backbone.layer3.17.bn1.running_mean, backbone.layer3.17.bn1.running_var, backbone.layer3.17.bn1.num_batches_tracked, backbone.layer3.17.conv2.weight, backbone.layer3.17.bn2.weight, backbone.layer3.17.bn2.bias, backbone.layer3.17.bn2.running_mean, backbone.layer3.17.bn2.running_var, backbone.layer3.17.bn2.num_batches_tracked, backbone.layer3.17.conv3.weight, backbone.layer3.17.bn3.weight, backbone.layer3.17.bn3.bias, backbone.layer3.17.bn3.running_mean, backbone.layer3.17.bn3.running_var, backbone.layer3.17.bn3.num_batches_tracked, backbone.layer3.18.conv1.weight, backbone.layer3.18.bn1.weight, backbone.layer3.18.bn1.bias, backbone.layer3.18.bn1.running_mean, backbone.layer3.18.bn1.running_var, backbone.layer3.18.bn1.num_batches_tracked, backbone.layer3.18.conv2.weight, backbone.layer3.18.bn2.weight, backbone.layer3.18.bn2.bias, backbone.layer3.18.bn2.running_mean, backbone.layer3.18.bn2.running_var, backbone.layer3.18.bn2.num_batches_tracked, backbone.layer3.18.conv3.weight, backbone.layer3.18.bn3.weight, backbone.layer3.18.bn3.bias, backbone.layer3.18.bn3.running_mean, backbone.layer3.18.bn3.running_var, backbone.layer3.18.bn3.num_batches_tracked, backbone.layer3.19.conv1.weight, backbone.layer3.19.bn1.weight, backbone.layer3.19.bn1.bias, backbone.layer3.19.bn1.running_mean, backbone.layer3.19.bn1.running_var, backbone.layer3.19.bn1.num_batches_tracked, backbone.layer3.19.conv2.weight, backbone.layer3.19.bn2.weight, backbone.layer3.19.bn2.bias, backbone.layer3.19.bn2.running_mean, backbone.layer3.19.bn2.running_var, backbone.layer3.19.bn2.num_batches_tracked, backbone.layer3.19.conv3.weight, backbone.layer3.19.bn3.weight, backbone.layer3.19.bn3.bias, backbone.layer3.19.bn3.running_mean, backbone.layer3.19.bn3.running_var, backbone.layer3.19.bn3.num_batches_tracked, backbone.layer3.20.conv1.weight, backbone.layer3.20.bn1.weight, 
backbone.layer3.20.bn1.bias, backbone.layer3.20.bn1.running_mean, backbone.layer3.20.bn1.running_var, backbone.layer3.20.bn1.num_batches_tracked, backbone.layer3.20.conv2.weight, backbone.layer3.20.bn2.weight, backbone.layer3.20.bn2.bias, backbone.layer3.20.bn2.running_mean, backbone.layer3.20.bn2.running_var, backbone.layer3.20.bn2.num_batches_tracked, backbone.layer3.20.conv3.weight, backbone.layer3.20.bn3.weight, backbone.layer3.20.bn3.bias, backbone.layer3.20.bn3.running_mean, backbone.layer3.20.bn3.running_var, backbone.layer3.20.bn3.num_batches_tracked, backbone.layer3.21.conv1.weight, backbone.layer3.21.bn1.weight, backbone.layer3.21.bn1.bias, backbone.layer3.21.bn1.running_mean, backbone.layer3.21.bn1.running_var, backbone.layer3.21.bn1.num_batches_tracked, backbone.layer3.21.conv2.weight, backbone.layer3.21.bn2.weight, backbone.layer3.21.bn2.bias, backbone.layer3.21.bn2.running_mean, backbone.layer3.21.bn2.running_var, backbone.layer3.21.bn2.num_batches_tracked, backbone.layer3.21.conv3.weight, backbone.layer3.21.bn3.weight, backbone.layer3.21.bn3.bias, backbone.layer3.21.bn3.running_mean, backbone.layer3.21.bn3.running_var, backbone.layer3.21.bn3.num_batches_tracked, backbone.layer3.22.conv1.weight, backbone.layer3.22.bn1.weight, backbone.layer3.22.bn1.bias, backbone.layer3.22.bn1.running_mean, backbone.layer3.22.bn1.running_var, backbone.layer3.22.bn1.num_batches_tracked, backbone.layer3.22.conv2.weight, backbone.layer3.22.bn2.weight, backbone.layer3.22.bn2.bias, backbone.layer3.22.bn2.running_mean, backbone.layer3.22.bn2.running_var, backbone.layer3.22.bn2.num_batches_tracked, backbone.layer3.22.conv3.weight, backbone.layer3.22.bn3.weight, backbone.layer3.22.bn3.bias, backbone.layer3.22.bn3.running_mean, backbone.layer3.22.bn3.running_var, backbone.layer3.22.bn3.num_batches_tracked, backbone.layer4.0.conv1.weight, backbone.layer4.0.bn1.weight, backbone.layer4.0.bn1.bias, backbone.layer4.0.bn1.running_mean, backbone.layer4.0.bn1.running_var, backbone.layer4.0.bn1.num_batches_tracked, backbone.layer4.0.conv2.weight, backbone.layer4.0.bn2.weight, backbone.layer4.0.bn2.bias, backbone.layer4.0.bn2.running_mean, backbone.layer4.0.bn2.running_var, backbone.layer4.0.bn2.num_batches_tracked, backbone.layer4.0.conv3.weight, backbone.layer4.0.bn3.weight, backbone.layer4.0.bn3.bias, backbone.layer4.0.bn3.running_mean, backbone.layer4.0.bn3.running_var, backbone.layer4.0.bn3.num_batches_tracked, backbone.layer4.0.downsample.0.weight, backbone.layer4.0.downsample.1.weight, backbone.layer4.0.downsample.1.bias, backbone.layer4.0.downsample.1.running_mean, backbone.layer4.0.downsample.1.running_var, backbone.layer4.0.downsample.1.num_batches_tracked, backbone.layer4.1.conv1.weight, backbone.layer4.1.bn1.weight, backbone.layer4.1.bn1.bias, backbone.layer4.1.bn1.running_mean, backbone.layer4.1.bn1.running_var, backbone.layer4.1.bn1.num_batches_tracked, backbone.layer4.1.conv2.weight, backbone.layer4.1.bn2.weight, backbone.layer4.1.bn2.bias, backbone.layer4.1.bn2.running_mean, backbone.layer4.1.bn2.running_var, backbone.layer4.1.bn2.num_batches_tracked, backbone.layer4.1.conv3.weight, backbone.layer4.1.bn3.weight, backbone.layer4.1.bn3.bias, backbone.layer4.1.bn3.running_mean, backbone.layer4.1.bn3.running_var, backbone.layer4.1.bn3.num_batches_tracked, backbone.layer4.2.conv1.weight, backbone.layer4.2.bn1.weight, backbone.layer4.2.bn1.bias, backbone.layer4.2.bn1.running_mean, backbone.layer4.2.bn1.running_var, backbone.layer4.2.bn1.num_batches_tracked, backbone.layer4.2.conv2.weight, 
backbone.layer4.2.bn2.weight, backbone.layer4.2.bn2.bias, backbone.layer4.2.bn2.running_mean, backbone.layer4.2.bn2.running_var, backbone.layer4.2.bn2.num_batches_tracked, backbone.layer4.2.conv3.weight, backbone.layer4.2.bn3.weight, backbone.layer4.2.bn3.bias, backbone.layer4.2.bn3.running_mean, backbone.layer4.2.bn3.running_var, backbone.layer4.2.bn3.num_batches_tracked, neck.lateral_convs.0.conv.weight, neck.lateral_convs.0.conv.bias, neck.lateral_convs.1.conv.weight, neck.lateral_convs.1.conv.bias, neck.lateral_convs.2.conv.weight, neck.lateral_convs.2.conv.bias, neck.lateral_convs.3.conv.weight, neck.lateral_convs.3.conv.bias, neck.fpn_convs.0.conv.weight, neck.fpn_convs.0.conv.bias, neck.fpn_convs.1.conv.weight, neck.fpn_convs.1.conv.bias, neck.fpn_convs.2.conv.weight, neck.fpn_convs.2.conv.bias, neck.fpn_convs.3.conv.weight, neck.fpn_convs.3.conv.bias, rpn_head.rpn_conv.weight, rpn_head.rpn_conv.bias, rpn_head.rpn_cls.weight, rpn_head.rpn_cls.bias, rpn_head.rpn_reg.weight, rpn_head.rpn_reg.bias, roi_head.bbox_head.0.fc_cls.weight, roi_head.bbox_head.0.fc_cls.bias, roi_head.bbox_head.0.fc_reg.weight, roi_head.bbox_head.0.fc_reg.bias, roi_head.bbox_head.0.shared_fcs.0.weight, roi_head.bbox_head.0.shared_fcs.0.bias, roi_head.bbox_head.0.shared_fcs.1.weight, roi_head.bbox_head.0.shared_fcs.1.bias, roi_head.bbox_head.1.fc_cls.weight, roi_head.bbox_head.1.fc_cls.bias, roi_head.bbox_head.1.fc_reg.weight, roi_head.bbox_head.1.fc_reg.bias, roi_head.bbox_head.1.shared_fcs.0.weight, roi_head.bbox_head.1.shared_fcs.0.bias, roi_head.bbox_head.1.shared_fcs.1.weight, roi_head.bbox_head.1.shared_fcs.1.bias, roi_head.bbox_head.2.fc_cls.weight, roi_head.bbox_head.2.fc_cls.bias, roi_head.bbox_head.2.fc_reg.weight, roi_head.bbox_head.2.fc_reg.bias, roi_head.bbox_head.2.shared_fcs.0.weight, roi_head.bbox_head.2.shared_fcs.0.bias, roi_head.bbox_head.2.shared_fcs.1.weight, roi_head.bbox_head.2.shared_fcs.1.bias, roi_head.mask_head.0.convs.0.conv.weight, roi_head.mask_head.0.convs.0.conv.bias, roi_head.mask_head.0.convs.1.conv.weight, roi_head.mask_head.0.convs.1.conv.bias, roi_head.mask_head.0.convs.2.conv.weight, roi_head.mask_head.0.convs.2.conv.bias, roi_head.mask_head.0.convs.3.conv.weight, roi_head.mask_head.0.convs.3.conv.bias, roi_head.mask_head.0.upsample.weight, roi_head.mask_head.0.upsample.bias, roi_head.mask_head.0.conv_logits.weight, roi_head.mask_head.0.conv_logits.bias, roi_head.mask_head.1.convs.0.conv.weight, roi_head.mask_head.1.convs.0.conv.bias, roi_head.mask_head.1.convs.1.conv.weight, roi_head.mask_head.1.convs.1.conv.bias, roi_head.mask_head.1.convs.2.conv.weight, roi_head.mask_head.1.convs.2.conv.bias, roi_head.mask_head.1.convs.3.conv.weight, roi_head.mask_head.1.convs.3.conv.bias, roi_head.mask_head.1.upsample.weight, roi_head.mask_head.1.upsample.bias, roi_head.mask_head.1.conv_logits.weight, roi_head.mask_head.1.conv_logits.bias, roi_head.mask_head.2.convs.0.conv.weight, roi_head.mask_head.2.convs.0.conv.bias, roi_head.mask_head.2.convs.1.conv.weight, roi_head.mask_head.2.convs.1.conv.bias, roi_head.mask_head.2.convs.2.conv.weight, roi_head.mask_head.2.convs.2.conv.bias, roi_head.mask_head.2.convs.3.conv.weight, roi_head.mask_head.2.convs.3.conv.bias, roi_head.mask_head.2.upsample.weight, roi_head.mask_head.2.upsample.bias, roi_head.mask_head.2.conv_logits.weight, roi_head.mask_head.2.conv_logits.bias

missing keys in source state_dict: img_backbone.conv1.weight, img_backbone.bn1.weight, img_backbone.bn1.bias, img_backbone.bn1.running_mean, img_backbone.bn1.running_var, ... (list truncated: every parameter of the Sparse4D model is flagged as missing, spanning img_backbone.*, img_neck.lateral_convs.*, img_neck.fpn_convs.*, head.instance_bank.*, head.anchor_encoder.*, head.layers.*, head.fc_before.weight, head.fc_after.weight, and depth_branch.depth_layers.*)
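The two lists mirror each other: the Cascade Mask R-CNN checkpoint stores its weights under backbone./neck., while Sparse4D registers the same modules as img_backbone./img_neck., so not a single key matches and the backbone effectively trains from random initialization. As a rough illustration only (a hypothetical standalone script with illustrative file paths, not something shipped with this repo), the checkpoint could also be remapped offline:

import torch

# Hypothetical offline remap: rename 'backbone.*' -> 'img_backbone.*' and
# 'neck.*' -> 'img_neck.*' so the keys match Sparse4D's module names.
ckpt = torch.load('cascade_mask_rcnn_r101.pth', map_location='cpu')  # illustrative path
state = ckpt.get('state_dict', ckpt)

remapped = {}
for key, value in state.items():
    if key.startswith('backbone.'):
        remapped['img_backbone.' + key[len('backbone.'):]] = value
    elif key.startswith('neck.'):
        remapped['img_neck.' + key[len('neck.'):]] = value
    # rpn_head.* and roi_head.* have no counterpart in Sparse4D and are dropped.

torch.save({'state_dict': remapped}, 'cascade_mask_rcnn_r101_remapped.pth')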

mAP: 0.0861 mATE: 0.9609 mASE: 0.3135 mAOE: 0.8157 mAVE: 0.7097 mAAE: 0.2931 NDS: 0.2337 Eval time: 166.2s
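For reference, the summary line is internally consistent: nuScenes computes NDS as a weighted combination of mAP and the five true-positive errors, NDS = (5·mAP + Σ(1 − min(1, err))) / 10. A quick check against the numbers above (a standalone snippet, not from the log):

# nuScenes detection score: NDS = (5*mAP + sum(1 - min(1, err))) / 10
mAP = 0.0861
tp_errors = [0.9609, 0.3135, 0.8157, 0.7097, 0.2931]  # mATE, mASE, mAOE, mAVE, mAAE
nds = (5 * mAP + sum(1 - min(1.0, e) for e in tp_errors)) / 10
print(round(nds, 4))  # 0.2338, matching the logged 0.2337 up to rounding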

Per-class results:

Object Class          AP     ATE    ASE    AOE    AVE    AAE
car                   0.285  0.720  0.170  0.382  0.564  0.198
truck                 0.014  1.006  0.248  0.647  0.706  0.212
bus                   0.003  1.133  0.238  0.268  1.974  0.398
trailer               0.000  1.342  0.299  1.145  0.508  0.150
construction_vehicle  0.000  1.230  0.595  1.395  0.108  0.594
pedestrian            0.124  0.893  0.302  1.130  0.723  0.519
motorcycle            0.028  0.922  0.316  0.845  0.892  0.263
bicycle               0.017  0.784  0.346  1.318  0.202  0.012
traffic_cone          0.234  0.688  0.345  nan    nan    nan
barrier               0.157  0.890  0.277  0.211  nan    nan

Could this be caused by the Cascade Mask R-CNN weights? Here is my log file: 20240306_185418.log

linxuewu commented 7 months ago

The weight names are not aligned; the config should be written like this:

load_from = None

img_backbone=dict(
    type='ResNet',
    depth=101,
    num_stages=4,
    frozen_stages=-1,
    style='pytorch',
    with_cp=True,
    out_indices=(0, 1, 2, 3),
    norm_eval=True,
    norm_cfg=dict(type='BN', requires_grad=False),
    init_cfg=dict(
        type='Pretrained',
        checkpoint='ckpt/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth',
        prefix='backbone.')),
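With init_cfg=dict(type='Pretrained', prefix='backbone.'), mmcv loads only the checkpoint keys under backbone. and strips that prefix before copying the tensors into img_backbone, which is exactly the alignment that was missing above. A quick way to preview what would be loaded (a standalone check, assuming the checkpoint path from the config):

import torch

ckpt = torch.load(
    'ckpt/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth',
    map_location='cpu')
state = ckpt.get('state_dict', ckpt)

# Mirror what prefix='backbone.' does: keep backbone keys, strip the prefix.
backbone_state = {k[len('backbone.'):]: v
                  for k, v in state.items() if k.startswith('backbone.')}
print(len(backbone_state), 'tensors would be loaded into img_backbone')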