open-mmlab / mmtracking

OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
https://mmtracking.readthedocs.io/en/latest/
Apache License 2.0

Use given Faster-RCNN as the detector of ocsort, but report error "The model and loaded state dict do not match exactly" #873

Open susanbao opened 1 year ago

susanbao commented 1 year ago

Notice

There are several common situations in the reimplementation issues as below

  1. Reimplement a model in the model zoo using the provided configs
  2. Reimplement a model in the model zoo on other dataset (e.g., custom datasets)
  3. Reimplement a custom model but all the components are implemented in MMTracking
  4. Reimplement a custom model with new modules implemented by yourself

There are several things to do for different cases as below.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. The issue has not been fixed in the latest version.

Describe the issue

I wrote a config file that uses the given Faster-RCNN model as the detector of ocsort, but I get the error "The model and loaded state dict do not match exactly".

Reproduction

  1. What command or script did you run?

python demo/demo_mot_vis.py \
    configs/mot/ocsort/ocsort_rcnn_x_crowdhuman_mot17-private-half.py \
    --checkpoint 'http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_2x_coco/faster_rcnn_r50_fpn_2x_coco_bbox_mAP-0.384_20200504_210434-a5d8aa15.pth' \
    --input demo/demo.mp4 \
    --output demo/mot_ocsort_rcnn.mp4

  2. What config did you run?

_base_ = [
    '../../_base_/models/faster_rcnn_r50_fpn.py',
    '../../_base_/datasets/mot_challenge.py',
    '../../_base_/default_runtime.py'
]

img_scale = (800, 1440)
samples_per_gpu = 4

model = dict(
    type='OCSORT',
    detector=dict(
        rpn_head=dict(bbox_coder=dict(clip_border=False)),
        roi_head=dict(
            bbox_head=dict(bbox_coder=dict(clip_border=False), num_classes=1)),
        init_cfg=dict(
            type='Pretrained',
            checkpoint=  # noqa: E251
            'https://download.openmmlab.com/mmtracking/mot/faster_rcnn/faster-rcnn_r50_fpn_4e_mot17-half-64ee2ed4.pth'  # noqa: E501
        )),
    motion=dict(type='KalmanFilter'),
    tracker=dict(
        type='OCSORTTracker',
        obj_score_thr=0.3,
        init_track_thr=0.7,
        weight_iou_with_det_scores=True,
        match_iou_thr=0.3,
        num_tentatives=3,
        vel_consist_weight=0.2,
        vel_delta_t=3,
        num_frames_retain=30))

train_pipeline = [
    dict(
        type='Mosaic',
        img_scale=img_scale,
        pad_val=114.0,
        bbox_clip_border=False),
    dict(
        type='RandomAffine',
        scaling_ratio_range=(0.1, 2),
        border=(-img_scale[0] // 2, -img_scale[1] // 2),
        bbox_clip_border=False),
    dict(
        type='MixUp',
        img_scale=img_scale,
        ratio_range=(0.8, 1.6),
        pad_val=114.0,
        bbox_clip_border=False),
    dict(type='YOLOXHSVRandomAug'),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='Resize',
        img_scale=img_scale,
        keep_ratio=True,
        bbox_clip_border=False),
    dict(type='Pad', size_divisor=32, pad_val=dict(img=(114.0, 114.0, 114.0))),
    dict(type='FilterAnnotations', min_gt_bbox_wh=(1, 1), keep_empty=False),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=img_scale,
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[0.0, 0.0, 0.0],
                std=[1.0, 1.0, 1.0],
                to_rgb=False),
            dict(
                type='Pad',
                size_divisor=32,
                pad_val=dict(img=(114.0, 114.0, 114.0))),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='VideoCollect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=samples_per_gpu,
    workers_per_gpu=4,
    persistent_workers=True,
    train=dict(
        _delete_=True,
        type='MultiImageMixDataset',
        dataset=dict(
            type='CocoDataset',
            ann_file=[
                'data/MOT17/annotations/half-train_cocoformat.json',
                'data/crowdhuman/annotations/crowdhuman_train.json',
                'data/crowdhuman/annotations/crowdhuman_val.json'
            ],
            img_prefix=[
                'data/MOT17/train', 'data/crowdhuman/train',
                'data/crowdhuman/val'
            ],
            classes=('pedestrian', ),
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', with_bbox=True)
            ],
            filter_empty_gt=False),
        pipeline=train_pipeline),
    val=dict(
        pipeline=test_pipeline,
        interpolate_tracks_cfg=dict(min_num_frames=5, max_num_frames=20)),
    test=dict(
        pipeline=test_pipeline,
        interpolate_tracks_cfg=dict(min_num_frames=5, max_num_frames=20)))

# optimizer
# default 8 gpu
optimizer = dict(
    type='SGD',
    lr=0.001 / 8 * samples_per_gpu,
    momentum=0.9,
    weight_decay=5e-4,
    nesterov=True,
    paramwise_cfg=dict(norm_decay_mult=0.0, bias_decay_mult=0.0))
optimizer_config = dict(grad_clip=None)

# some hyper parameters
total_epochs = 80
num_last_epochs = 10
resume_from = None
interval = 5

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=100,
    warmup_ratio=1.0 / 100,
    step=[3])

custom_hooks = [
    dict(
        type='YOLOXModeSwitchHook',
        num_last_epochs=num_last_epochs,
        priority=48),
    dict(
        type='SyncNormHook',
        num_last_epochs=num_last_epochs,
        interval=interval,
        priority=48),
    dict(
        type='ExpMomentumEMAHook',
        resume_from=resume_from,
        momentum=0.0001,
        priority=49)
]

checkpoint_config = dict(interval=1)
evaluation = dict(metric=['bbox', 'track'], interval=1)
search_metrics = ['MOTA', 'IDF1', 'FN', 'FP', 'IDs', 'MT', 'ML']

# you need to set mode='dynamic' if you are using pytorch<=1.5.0
fp16 = dict(loss_scale=dict(init_scale=512.))

  3. Did you make any modifications to the code or config? Did you understand what you have modified?

I mainly changed the detector model from yolo to Faster-RCNN and used the given checkpoint. I am not sure whether this works.

  4. What dataset did you use?

MOT17

Environment

  1. Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here.

sys.platform: linux
Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /home/sas20048/tool/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.10
GCC: gcc (GCC) 5.4.0
PyTorch: 1.6.0
PyTorch compiling details: PyTorch built with:
    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 10.1
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
    • CuDNN 7.6.3
    • Magma 2.5.2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.7.0
OpenCV: 4.7.0
MMCV: 1.7.1
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
MMTracking: 0.14.0+305da14

  2. You may add additional information that may be helpful for locating the problem, such as
    1. How you installed PyTorch [e.g., pip, conda, source]
    2. Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Results

If applicable, paste the related results here, e.g., what you expect and what you get.

The problem is that the checkpoint only contains keys without the "detector" prefix, while the model built for ocsort has keys with the "detector" prefix. I also checked that the deepsort model has keys with the "detector" prefix, yet it loads the corresponding checkpoint without any problem. Could you please tell me why?

Report error:

The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.conv1.weight, backbone.bn1.weight, backbone.bn1.bias, backbone.bn1.running_mean, backbone.bn1.running_var, backbone.bn1.num_batches_tracked, backbone.layer1.0.conv1.weight, backbone.layer1.0.bn1.weight, backbone.layer1.0.bn1.bias, backbone.layer1.0.bn1.running_mean, backbone.layer1.0.bn1.running_var, backbone.layer1.0.bn1.num_batches_tracked, backbone.layer1.0.conv2.weight, backbone.layer1.0.bn2.weight, backbone.layer1.0.bn2.bias, backbone.layer1.0.bn2.running_mean, backbone.layer1.0.bn2.running_var, backbone.layer1.0.bn2.num_batches_tracked, backbone.layer1.0.conv3.weight, backbone.layer1.0.bn3.weight, backbone.layer1.0.bn3.bias, backbone.layer1.0.bn3.running_mean, backbone.layer1.0.bn3.running_var, backbone.layer1.0.bn3.num_batches_tracked, backbone.layer1.0.downsample.0.weight, backbone.layer1.0.downsample.1.weight, backbone.layer1.0.downsample.1.bias, backbone.layer1.0.downsample.1.running_mean, backbone.layer1.0.downsample.1.running_var, backbone.layer1.0.downsample.1.num_batches_tracked, backbone.layer1.1.conv1.weight, backbone.layer1.1.bn1.weight, backbone.layer1.1.bn1.bias, backbone.layer1.1.bn1.running_mean, backbone.layer1.1.bn1.running_var, backbone.layer1.1.bn1.num_batches_tracked, backbone.layer1.1.conv2.weight, backbone.layer1.1.bn2.weight, backbone.layer1.1.bn2.bias, backbone.layer1.1.bn2.running_mean, backbone.layer1.1.bn2.running_var, backbone.layer1.1.bn2.num_batches_tracked, backbone.layer1.1.conv3.weight, backbone.layer1.1.bn3.weight, backbone.layer1.1.bn3.bias, backbone.layer1.1.bn3.running_mean, backbone.layer1.1.bn3.running_var, backbone.layer1.1.bn3.num_batches_tracked, backbone.layer1.2.conv1.weight, backbone.layer1.2.bn1.weight, backbone.layer1.2.bn1.bias, backbone.layer1.2.bn1.running_mean, backbone.layer1.2.bn1.running_var, backbone.layer1.2.bn1.num_batches_tracked, backbone.layer1.2.conv2.weight, backbone.layer1.2.bn2.weight, backbone.layer1.2.bn2.bias, backbone.layer1.2.bn2.running_mean, 
backbone.layer1.2.bn2.running_var, backbone.layer1.2.bn2.num_batches_tracked, backbone.layer1.2.conv3.weight, backbone.layer1.2.bn3.weight, backbone.layer1.2.bn3.bias, backbone.layer1.2.bn3.running_mean, backbone.layer1.2.bn3.running_var, backbone.layer1.2.bn3.num_batches_tracked, backbone.layer2.0.conv1.weight, backbone.layer2.0.bn1.weight, backbone.layer2.0.bn1.bias, backbone.layer2.0.bn1.running_mean, backbone.layer2.0.bn1.running_var, backbone.layer2.0.bn1.num_batches_tracked, backbone.layer2.0.conv2.weight, backbone.layer2.0.bn2.weight, backbone.layer2.0.bn2.bias, backbone.layer2.0.bn2.running_mean, backbone.layer2.0.bn2.running_var, backbone.layer2.0.bn2.num_batches_tracked, backbone.layer2.0.conv3.weight, backbone.layer2.0.bn3.weight, backbone.layer2.0.bn3.bias, backbone.layer2.0.bn3.running_mean, backbone.layer2.0.bn3.running_var, backbone.layer2.0.bn3.num_batches_tracked, backbone.layer2.0.downsample.0.weight, backbone.layer2.0.downsample.1.weight, backbone.layer2.0.downsample.1.bias, backbone.layer2.0.downsample.1.running_mean, backbone.layer2.0.downsample.1.running_var, backbone.layer2.0.downsample.1.num_batches_tracked, backbone.layer2.1.conv1.weight, backbone.layer2.1.bn1.weight, backbone.layer2.1.bn1.bias, backbone.layer2.1.bn1.running_mean, backbone.layer2.1.bn1.running_var, backbone.layer2.1.bn1.num_batches_tracked, backbone.layer2.1.conv2.weight, backbone.layer2.1.bn2.weight, backbone.layer2.1.bn2.bias, backbone.layer2.1.bn2.running_mean, backbone.layer2.1.bn2.running_var, backbone.layer2.1.bn2.num_batches_tracked, backbone.layer2.1.conv3.weight, backbone.layer2.1.bn3.weight, backbone.layer2.1.bn3.bias, backbone.layer2.1.bn3.running_mean, backbone.layer2.1.bn3.running_var, backbone.layer2.1.bn3.num_batches_tracked, backbone.layer2.2.conv1.weight, backbone.layer2.2.bn1.weight, backbone.layer2.2.bn1.bias, backbone.layer2.2.bn1.running_mean, backbone.layer2.2.bn1.running_var, backbone.layer2.2.bn1.num_batches_tracked, backbone.layer2.2.conv2.weight, 
backbone.layer2.2.bn2.weight, backbone.layer2.2.bn2.bias, backbone.layer2.2.bn2.running_mean, backbone.layer2.2.bn2.running_var, backbone.layer2.2.bn2.num_batches_tracked, backbone.layer2.2.conv3.weight, backbone.layer2.2.bn3.weight, backbone.layer2.2.bn3.bias, backbone.layer2.2.bn3.running_mean, backbone.layer2.2.bn3.running_var, backbone.layer2.2.bn3.num_batches_tracked, backbone.layer2.3.conv1.weight, backbone.layer2.3.bn1.weight, backbone.layer2.3.bn1.bias, backbone.layer2.3.bn1.running_mean, backbone.layer2.3.bn1.running_var, backbone.layer2.3.bn1.num_batches_tracked, backbone.layer2.3.conv2.weight, backbone.layer2.3.bn2.weight, backbone.layer2.3.bn2.bias, backbone.layer2.3.bn2.running_mean, backbone.layer2.3.bn2.running_var, backbone.layer2.3.bn2.num_batches_tracked, backbone.layer2.3.conv3.weight, backbone.layer2.3.bn3.weight, backbone.layer2.3.bn3.bias, backbone.layer2.3.bn3.running_mean, backbone.layer2.3.bn3.running_var, backbone.layer2.3.bn3.num_batches_tracked, backbone.layer3.0.conv1.weight, backbone.layer3.0.bn1.weight, backbone.layer3.0.bn1.bias, backbone.layer3.0.bn1.running_mean, backbone.layer3.0.bn1.running_var, backbone.layer3.0.bn1.num_batches_tracked, backbone.layer3.0.conv2.weight, backbone.layer3.0.bn2.weight, backbone.layer3.0.bn2.bias, backbone.layer3.0.bn2.running_mean, backbone.layer3.0.bn2.running_var, backbone.layer3.0.bn2.num_batches_tracked, backbone.layer3.0.conv3.weight, backbone.layer3.0.bn3.weight, backbone.layer3.0.bn3.bias, backbone.layer3.0.bn3.running_mean, backbone.layer3.0.bn3.running_var, backbone.layer3.0.bn3.num_batches_tracked, backbone.layer3.0.downsample.0.weight, backbone.layer3.0.downsample.1.weight, backbone.layer3.0.downsample.1.bias, backbone.layer3.0.downsample.1.running_mean, backbone.layer3.0.downsample.1.running_var, backbone.layer3.0.downsample.1.num_batches_tracked, backbone.layer3.1.conv1.weight, backbone.layer3.1.bn1.weight, backbone.layer3.1.bn1.bias, backbone.layer3.1.bn1.running_mean, 
backbone.layer3.1.bn1.running_var, backbone.layer3.1.bn1.num_batches_tracked, backbone.layer3.1.conv2.weight, backbone.layer3.1.bn2.weight, backbone.layer3.1.bn2.bias, backbone.layer3.1.bn2.running_mean, backbone.layer3.1.bn2.running_var, backbone.layer3.1.bn2.num_batches_tracked, backbone.layer3.1.conv3.weight, backbone.layer3.1.bn3.weight, backbone.layer3.1.bn3.bias, backbone.layer3.1.bn3.running_mean, backbone.layer3.1.bn3.running_var, backbone.layer3.1.bn3.num_batches_tracked, backbone.layer3.2.conv1.weight, backbone.layer3.2.bn1.weight, backbone.layer3.2.bn1.bias, backbone.layer3.2.bn1.running_mean, backbone.layer3.2.bn1.running_var, backbone.layer3.2.bn1.num_batches_tracked, backbone.layer3.2.conv2.weight, backbone.layer3.2.bn2.weight, backbone.layer3.2.bn2.bias, backbone.layer3.2.bn2.running_mean, backbone.layer3.2.bn2.running_var, backbone.layer3.2.bn2.num_batches_tracked, backbone.layer3.2.conv3.weight, backbone.layer3.2.bn3.weight, backbone.layer3.2.bn3.bias, backbone.layer3.2.bn3.running_mean, backbone.layer3.2.bn3.running_var, backbone.layer3.2.bn3.num_batches_tracked, backbone.layer3.3.conv1.weight, backbone.layer3.3.bn1.weight, backbone.layer3.3.bn1.bias, backbone.layer3.3.bn1.running_mean, backbone.layer3.3.bn1.running_var, backbone.layer3.3.bn1.num_batches_tracked, backbone.layer3.3.conv2.weight, backbone.layer3.3.bn2.weight, backbone.layer3.3.bn2.bias, backbone.layer3.3.bn2.running_mean, backbone.layer3.3.bn2.running_var, backbone.layer3.3.bn2.num_batches_tracked, backbone.layer3.3.conv3.weight, backbone.layer3.3.bn3.weight, backbone.layer3.3.bn3.bias, backbone.layer3.3.bn3.running_mean, backbone.layer3.3.bn3.running_var, backbone.layer3.3.bn3.num_batches_tracked, backbone.layer3.4.conv1.weight, backbone.layer3.4.bn1.weight, backbone.layer3.4.bn1.bias, backbone.layer3.4.bn1.running_mean, backbone.layer3.4.bn1.running_var, backbone.layer3.4.bn1.num_batches_tracked, backbone.layer3.4.conv2.weight, backbone.layer3.4.bn2.weight, 
backbone.layer3.4.bn2.bias, backbone.layer3.4.bn2.running_mean, backbone.layer3.4.bn2.running_var, backbone.layer3.4.bn2.num_batches_tracked, backbone.layer3.4.conv3.weight, backbone.layer3.4.bn3.weight, backbone.layer3.4.bn3.bias, backbone.layer3.4.bn3.running_mean, backbone.layer3.4.bn3.running_var, backbone.layer3.4.bn3.num_batches_tracked, backbone.layer3.5.conv1.weight, backbone.layer3.5.bn1.weight, backbone.layer3.5.bn1.bias, backbone.layer3.5.bn1.running_mean, backbone.layer3.5.bn1.running_var, backbone.layer3.5.bn1.num_batches_tracked, backbone.layer3.5.conv2.weight, backbone.layer3.5.bn2.weight, backbone.layer3.5.bn2.bias, backbone.layer3.5.bn2.running_mean, backbone.layer3.5.bn2.running_var, backbone.layer3.5.bn2.num_batches_tracked, backbone.layer3.5.conv3.weight, backbone.layer3.5.bn3.weight, backbone.layer3.5.bn3.bias, backbone.layer3.5.bn3.running_mean, backbone.layer3.5.bn3.running_var, backbone.layer3.5.bn3.num_batches_tracked, backbone.layer4.0.conv1.weight, backbone.layer4.0.bn1.weight, backbone.layer4.0.bn1.bias, backbone.layer4.0.bn1.running_mean, backbone.layer4.0.bn1.running_var, backbone.layer4.0.bn1.num_batches_tracked, backbone.layer4.0.conv2.weight, backbone.layer4.0.bn2.weight, backbone.layer4.0.bn2.bias, backbone.layer4.0.bn2.running_mean, backbone.layer4.0.bn2.running_var, backbone.layer4.0.bn2.num_batches_tracked, backbone.layer4.0.conv3.weight, backbone.layer4.0.bn3.weight, backbone.layer4.0.bn3.bias, backbone.layer4.0.bn3.running_mean, backbone.layer4.0.bn3.running_var, backbone.layer4.0.bn3.num_batches_tracked, backbone.layer4.0.downsample.0.weight, backbone.layer4.0.downsample.1.weight, backbone.layer4.0.downsample.1.bias, backbone.layer4.0.downsample.1.running_mean, backbone.layer4.0.downsample.1.running_var, backbone.layer4.0.downsample.1.num_batches_tracked, backbone.layer4.1.conv1.weight, backbone.layer4.1.bn1.weight, backbone.layer4.1.bn1.bias, backbone.layer4.1.bn1.running_mean, backbone.layer4.1.bn1.running_var, 
backbone.layer4.1.bn1.num_batches_tracked, backbone.layer4.1.conv2.weight, backbone.layer4.1.bn2.weight, backbone.layer4.1.bn2.bias, backbone.layer4.1.bn2.running_mean, backbone.layer4.1.bn2.running_var, backbone.layer4.1.bn2.num_batches_tracked, backbone.layer4.1.conv3.weight, backbone.layer4.1.bn3.weight, backbone.layer4.1.bn3.bias, backbone.layer4.1.bn3.running_mean, backbone.layer4.1.bn3.running_var, backbone.layer4.1.bn3.num_batches_tracked, backbone.layer4.2.conv1.weight, backbone.layer4.2.bn1.weight, backbone.layer4.2.bn1.bias, backbone.layer4.2.bn1.running_mean, backbone.layer4.2.bn1.running_var, backbone.layer4.2.bn1.num_batches_tracked, backbone.layer4.2.conv2.weight, backbone.layer4.2.bn2.weight, backbone.layer4.2.bn2.bias, backbone.layer4.2.bn2.running_mean, backbone.layer4.2.bn2.running_var, backbone.layer4.2.bn2.num_batches_tracked, backbone.layer4.2.conv3.weight, backbone.layer4.2.bn3.weight, backbone.layer4.2.bn3.bias, backbone.layer4.2.bn3.running_mean, backbone.layer4.2.bn3.running_var, backbone.layer4.2.bn3.num_batches_tracked, neck.lateral_convs.0.conv.weight, neck.lateral_convs.0.conv.bias, neck.lateral_convs.1.conv.weight, neck.lateral_convs.1.conv.bias, neck.lateral_convs.2.conv.weight, neck.lateral_convs.2.conv.bias, neck.lateral_convs.3.conv.weight, neck.lateral_convs.3.conv.bias, neck.fpn_convs.0.conv.weight, neck.fpn_convs.0.conv.bias, neck.fpn_convs.1.conv.weight, neck.fpn_convs.1.conv.bias, neck.fpn_convs.2.conv.weight, neck.fpn_convs.2.conv.bias, neck.fpn_convs.3.conv.weight, neck.fpn_convs.3.conv.bias, rpn_head.rpn_conv.weight, rpn_head.rpn_conv.bias, rpn_head.rpn_cls.weight, rpn_head.rpn_cls.bias, rpn_head.rpn_reg.weight, rpn_head.rpn_reg.bias, roi_head.bbox_head.fc_cls.weight, roi_head.bbox_head.fc_cls.bias, roi_head.bbox_head.fc_reg.weight, roi_head.bbox_head.fc_reg.bias, roi_head.bbox_head.shared_fcs.0.weight, roi_head.bbox_head.shared_fcs.0.bias, roi_head.bbox_head.shared_fcs.1.weight, roi_head.bbox_head.shared_fcs.1.bias

missing keys in source state_dict: detector.backbone.conv1.weight, detector.backbone.bn1.weight, detector.backbone.bn1.bias, detector.backbone.bn1.running_mean, detector.backbone.bn1.running_var, detector.backbone.layer1.0.conv1.weight, detector.backbone.layer1.0.bn1.weight, detector.backbone.layer1.0.bn1.bias, detector.backbone.layer1.0.bn1.running_mean, detector.backbone.layer1.0.bn1.running_var, detector.backbone.layer1.0.conv2.weight, detector.backbone.layer1.0.bn2.weight, detector.backbone.layer1.0.bn2.bias, detector.backbone.layer1.0.bn2.running_mean, detector.backbone.layer1.0.bn2.running_var, detector.backbone.layer1.0.conv3.weight, detector.backbone.layer1.0.bn3.weight, detector.backbone.layer1.0.bn3.bias, detector.backbone.layer1.0.bn3.running_mean, detector.backbone.layer1.0.bn3.running_var, detector.backbone.layer1.0.downsample.0.weight, detector.backbone.layer1.0.downsample.1.weight, detector.backbone.layer1.0.downsample.1.bias, detector.backbone.layer1.0.downsample.1.running_mean, detector.backbone.layer1.0.downsample.1.running_var, detector.backbone.layer1.1.conv1.weight, detector.backbone.layer1.1.bn1.weight, detector.backbone.layer1.1.bn1.bias, detector.backbone.layer1.1.bn1.running_mean, detector.backbone.layer1.1.bn1.running_var, detector.backbone.layer1.1.conv2.weight, detector.backbone.layer1.1.bn2.weight, detector.backbone.layer1.1.bn2.bias, detector.backbone.layer1.1.bn2.running_mean, detector.backbone.layer1.1.bn2.running_var, detector.backbone.layer1.1.conv3.weight, detector.backbone.layer1.1.bn3.weight, detector.backbone.layer1.1.bn3.bias, detector.backbone.layer1.1.bn3.running_mean, detector.backbone.layer1.1.bn3.running_var, detector.backbone.layer1.2.conv1.weight, detector.backbone.layer1.2.bn1.weight, detector.backbone.layer1.2.bn1.bias, detector.backbone.layer1.2.bn1.running_mean, detector.backbone.layer1.2.bn1.running_var, detector.backbone.layer1.2.conv2.weight, detector.backbone.layer1.2.bn2.weight, 
detector.backbone.layer1.2.bn2.bias, detector.backbone.layer1.2.bn2.running_mean, detector.backbone.layer1.2.bn2.running_var, detector.backbone.layer1.2.conv3.weight, detector.backbone.layer1.2.bn3.weight, detector.backbone.layer1.2.bn3.bias, detector.backbone.layer1.2.bn3.running_mean, detector.backbone.layer1.2.bn3.running_var, detector.backbone.layer2.0.conv1.weight, detector.backbone.layer2.0.bn1.weight, detector.backbone.layer2.0.bn1.bias, detector.backbone.layer2.0.bn1.running_mean, detector.backbone.layer2.0.bn1.running_var, detector.backbone.layer2.0.conv2.weight, detector.backbone.layer2.0.bn2.weight, detector.backbone.layer2.0.bn2.bias, detector.backbone.layer2.0.bn2.running_mean, detector.backbone.layer2.0.bn2.running_var, detector.backbone.layer2.0.conv3.weight, detector.backbone.layer2.0.bn3.weight, detector.backbone.layer2.0.bn3.bias, detector.backbone.layer2.0.bn3.running_mean, detector.backbone.layer2.0.bn3.running_var, detector.backbone.layer2.0.downsample.0.weight, detector.backbone.layer2.0.downsample.1.weight, detector.backbone.layer2.0.downsample.1.bias, detector.backbone.layer2.0.downsample.1.running_mean, detector.backbone.layer2.0.downsample.1.running_var, detector.backbone.layer2.1.conv1.weight, detector.backbone.layer2.1.bn1.weight, detector.backbone.layer2.1.bn1.bias, detector.backbone.layer2.1.bn1.running_mean, detector.backbone.layer2.1.bn1.running_var, detector.backbone.layer2.1.conv2.weight, detector.backbone.layer2.1.bn2.weight, detector.backbone.layer2.1.bn2.bias, detector.backbone.layer2.1.bn2.running_mean, detector.backbone.layer2.1.bn2.running_var, detector.backbone.layer2.1.conv3.weight, detector.backbone.layer2.1.bn3.weight, detector.backbone.layer2.1.bn3.bias, detector.backbone.layer2.1.bn3.running_mean, detector.backbone.layer2.1.bn3.running_var, detector.backbone.layer2.2.conv1.weight, detector.backbone.layer2.2.bn1.weight, detector.backbone.layer2.2.bn1.bias, detector.backbone.layer2.2.bn1.running_mean, 
detector.backbone.layer2.2.bn1.running_var, detector.backbone.layer2.2.conv2.weight, detector.backbone.layer2.2.bn2.weight, detector.backbone.layer2.2.bn2.bias, detector.backbone.layer2.2.bn2.running_mean, detector.backbone.layer2.2.bn2.running_var, detector.backbone.layer2.2.conv3.weight, detector.backbone.layer2.2.bn3.weight, detector.backbone.layer2.2.bn3.bias, detector.backbone.layer2.2.bn3.running_mean, detector.backbone.layer2.2.bn3.running_var, detector.backbone.layer2.3.conv1.weight, detector.backbone.layer2.3.bn1.weight, detector.backbone.layer2.3.bn1.bias, detector.backbone.layer2.3.bn1.running_mean, detector.backbone.layer2.3.bn1.running_var, detector.backbone.layer2.3.conv2.weight, detector.backbone.layer2.3.bn2.weight, detector.backbone.layer2.3.bn2.bias, detector.backbone.layer2.3.bn2.running_mean, detector.backbone.layer2.3.bn2.running_var, detector.backbone.layer2.3.conv3.weight, detector.backbone.layer2.3.bn3.weight, detector.backbone.layer2.3.bn3.bias, detector.backbone.layer2.3.bn3.running_mean, detector.backbone.layer2.3.bn3.running_var, detector.backbone.layer3.0.conv1.weight, detector.backbone.layer3.0.bn1.weight, detector.backbone.layer3.0.bn1.bias, detector.backbone.layer3.0.bn1.running_mean, detector.backbone.layer3.0.bn1.running_var, detector.backbone.layer3.0.conv2.weight, detector.backbone.layer3.0.bn2.weight, detector.backbone.layer3.0.bn2.bias, detector.backbone.layer3.0.bn2.running_mean, detector.backbone.layer3.0.bn2.running_var, detector.backbone.layer3.0.conv3.weight, detector.backbone.layer3.0.bn3.weight, detector.backbone.layer3.0.bn3.bias, detector.backbone.layer3.0.bn3.running_mean, detector.backbone.layer3.0.bn3.running_var, detector.backbone.layer3.0.downsample.0.weight, detector.backbone.layer3.0.downsample.1.weight, detector.backbone.layer3.0.downsample.1.bias, detector.backbone.layer3.0.downsample.1.running_mean, detector.backbone.layer3.0.downsample.1.running_var, detector.backbone.layer3.1.conv1.weight, 
detector.backbone.layer3.1.bn1.weight, detector.backbone.layer3.1.bn1.bias, detector.backbone.layer3.1.bn1.running_mean, detector.backbone.layer3.1.bn1.running_var, detector.backbone.layer3.1.conv2.weight, detector.backbone.layer3.1.bn2.weight, detector.backbone.layer3.1.bn2.bias, detector.backbone.layer3.1.bn2.running_mean, detector.backbone.layer3.1.bn2.running_var, detector.backbone.layer3.1.conv3.weight, detector.backbone.layer3.1.bn3.weight, detector.backbone.layer3.1.bn3.bias, detector.backbone.layer3.1.bn3.running_mean, detector.backbone.layer3.1.bn3.running_var, detector.backbone.layer3.2.conv1.weight, detector.backbone.layer3.2.bn1.weight, detector.backbone.layer3.2.bn1.bias, detector.backbone.layer3.2.bn1.running_mean, detector.backbone.layer3.2.bn1.running_var, detector.backbone.layer3.2.conv2.weight, detector.backbone.layer3.2.bn2.weight, detector.backbone.layer3.2.bn2.bias, detector.backbone.layer3.2.bn2.running_mean, detector.backbone.layer3.2.bn2.running_var, detector.backbone.layer3.2.conv3.weight, detector.backbone.layer3.2.bn3.weight, detector.backbone.layer3.2.bn3.bias, detector.backbone.layer3.2.bn3.running_mean, detector.backbone.layer3.2.bn3.running_var, detector.backbone.layer3.3.conv1.weight, detector.backbone.layer3.3.bn1.weight, detector.backbone.layer3.3.bn1.bias, detector.backbone.layer3.3.bn1.running_mean, detector.backbone.layer3.3.bn1.running_var, detector.backbone.layer3.3.conv2.weight, detector.backbone.layer3.3.bn2.weight, detector.backbone.layer3.3.bn2.bias, detector.backbone.layer3.3.bn2.running_mean, detector.backbone.layer3.3.bn2.running_var, detector.backbone.layer3.3.conv3.weight, detector.backbone.layer3.3.bn3.weight, detector.backbone.layer3.3.bn3.bias, detector.backbone.layer3.3.bn3.running_mean, detector.backbone.layer3.3.bn3.running_var, detector.backbone.layer3.4.conv1.weight, detector.backbone.layer3.4.bn1.weight, detector.backbone.layer3.4.bn1.bias, detector.backbone.layer3.4.bn1.running_mean, 
detector.backbone.layer3.4.bn1.running_var, detector.backbone.layer3.4.conv2.weight, detector.backbone.layer3.4.bn2.weight, detector.backbone.layer3.4.bn2.bias, detector.backbone.layer3.4.bn2.running_mean, detector.backbone.layer3.4.bn2.running_var, detector.backbone.layer3.4.conv3.weight, detector.backbone.layer3.4.bn3.weight, detector.backbone.layer3.4.bn3.bias, detector.backbone.layer3.4.bn3.running_mean, detector.backbone.layer3.4.bn3.running_var, detector.backbone.layer3.5.conv1.weight, detector.backbone.layer3.5.bn1.weight, detector.backbone.layer3.5.bn1.bias, detector.backbone.layer3.5.bn1.running_mean, detector.backbone.layer3.5.bn1.running_var, detector.backbone.layer3.5.conv2.weight, detector.backbone.layer3.5.bn2.weight, detector.backbone.layer3.5.bn2.bias, detector.backbone.layer3.5.bn2.running_mean, detector.backbone.layer3.5.bn2.running_var, detector.backbone.layer3.5.conv3.weight, detector.backbone.layer3.5.bn3.weight, detector.backbone.layer3.5.bn3.bias, detector.backbone.layer3.5.bn3.running_mean, detector.backbone.layer3.5.bn3.running_var, detector.backbone.layer4.0.conv1.weight, detector.backbone.layer4.0.bn1.weight, detector.backbone.layer4.0.bn1.bias, detector.backbone.layer4.0.bn1.running_mean, detector.backbone.layer4.0.bn1.running_var, detector.backbone.layer4.0.conv2.weight, detector.backbone.layer4.0.bn2.weight, detector.backbone.layer4.0.bn2.bias, detector.backbone.layer4.0.bn2.running_mean, detector.backbone.layer4.0.bn2.running_var, detector.backbone.layer4.0.conv3.weight, detector.backbone.layer4.0.bn3.weight, detector.backbone.layer4.0.bn3.bias, detector.backbone.layer4.0.bn3.running_mean, detector.backbone.layer4.0.bn3.running_var, detector.backbone.layer4.0.downsample.0.weight, detector.backbone.layer4.0.downsample.1.weight, detector.backbone.layer4.0.downsample.1.bias, detector.backbone.layer4.0.downsample.1.running_mean, detector.backbone.layer4.0.downsample.1.running_var, detector.backbone.layer4.1.conv1.weight, 
detector.backbone.layer4.1.bn1.weight, detector.backbone.layer4.1.bn1.bias, detector.backbone.layer4.1.bn1.running_mean, detector.backbone.layer4.1.bn1.running_var, detector.backbone.layer4.1.conv2.weight, detector.backbone.layer4.1.bn2.weight, detector.backbone.layer4.1.bn2.bias, detector.backbone.layer4.1.bn2.running_mean, detector.backbone.layer4.1.bn2.running_var, detector.backbone.layer4.1.conv3.weight, detector.backbone.layer4.1.bn3.weight, detector.backbone.layer4.1.bn3.bias, detector.backbone.layer4.1.bn3.running_mean, detector.backbone.layer4.1.bn3.running_var, detector.backbone.layer4.2.conv1.weight, detector.backbone.layer4.2.bn1.weight, detector.backbone.layer4.2.bn1.bias, detector.backbone.layer4.2.bn1.running_mean, detector.backbone.layer4.2.bn1.running_var, detector.backbone.layer4.2.conv2.weight, detector.backbone.layer4.2.bn2.weight, detector.backbone.layer4.2.bn2.bias, detector.backbone.layer4.2.bn2.running_mean, detector.backbone.layer4.2.bn2.running_var, detector.backbone.layer4.2.conv3.weight, detector.backbone.layer4.2.bn3.weight, detector.backbone.layer4.2.bn3.bias, detector.backbone.layer4.2.bn3.running_mean, detector.backbone.layer4.2.bn3.running_var, detector.neck.lateral_convs.0.conv.weight, detector.neck.lateral_convs.0.conv.bias, detector.neck.lateral_convs.1.conv.weight, detector.neck.lateral_convs.1.conv.bias, detector.neck.lateral_convs.2.conv.weight, detector.neck.lateral_convs.2.conv.bias, detector.neck.lateral_convs.3.conv.weight, detector.neck.lateral_convs.3.conv.bias, detector.neck.fpn_convs.0.conv.weight, detector.neck.fpn_convs.0.conv.bias, detector.neck.fpn_convs.1.conv.weight, detector.neck.fpn_convs.1.conv.bias, detector.neck.fpn_convs.2.conv.weight, detector.neck.fpn_convs.2.conv.bias, detector.neck.fpn_convs.3.conv.weight, detector.neck.fpn_convs.3.conv.bias, detector.rpn_head.rpn_conv.weight, detector.rpn_head.rpn_conv.bias, detector.rpn_head.rpn_cls.weight, detector.rpn_head.rpn_cls.bias, 
detector.rpn_head.rpn_reg.weight, detector.rpn_head.rpn_reg.bias, detector.roi_head.bbox_head.fc_cls.weight, detector.roi_head.bbox_head.fc_cls.bias, detector.roi_head.bbox_head.fc_reg.weight, detector.roi_head.bbox_head.fc_reg.bias, detector.roi_head.bbox_head.shared_fcs.0.weight, detector.roi_head.bbox_head.shared_fcs.0.bias, detector.roi_head.bbox_head.shared_fcs.1.weight, detector.roi_head.bbox_head.shared_fcs.1.bias
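The two lists in the report can be compared mechanically. The following is a minimal sketch, not part of MMTracking: `explain_mismatch` is a hypothetical helper that checks whether every missing key is simply an unexpected key with "detector." in front of it. Only a handful of keys from the report are used. Note that the `num_batches_tracked` entries appear only in the unexpected list, so the check goes from missing keys to unexpected keys and not the other way round.

```python
def explain_mismatch(missing, unexpected, prefix="detector."):
    """Return True if every missing key is an unexpected key plus `prefix`."""
    unexpected = set(unexpected)
    # Strip the prefix from the keys the model expected but did not receive.
    stripped = [k[len(prefix):] for k in missing if k.startswith(prefix)]
    # Every missing key must carry the prefix, and its un-prefixed name
    # must be present in the checkpoint.
    return (
        bool(missing)
        and len(stripped) == len(missing)
        and all(k in unexpected for k in stripped)
    )


# A few keys copied from the report above.
unexpected_keys = [
    "backbone.conv1.weight",
    "backbone.bn1.num_batches_tracked",
    "neck.lateral_convs.0.conv.weight",
    "roi_head.bbox_head.fc_cls.weight",
]
missing_keys = [
    "detector.backbone.conv1.weight",
    "detector.neck.lateral_convs.0.conv.weight",
    "detector.roi_head.bbox_head.fc_cls.weight",
]

print(explain_mismatch(missing_keys, unexpected_keys))
```

If the function returns True, the mismatch is purely a naming difference and not a difference in architecture.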

Issue fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

susanbao commented 1 year ago

By the way, is it possible to run ocsort with Faster-RCNN in mmtracking? If so, could you please tell me the method?

AtomScott commented 1 year ago

mmtrack and mmdet name the model's keys differently. In mmtrack the detector is a sub-module of the tracker, so the prefix "detector." is added to every detector key. To verify this, you can load the .pth file and inspect its keys.
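One way to inspect the keys is to count them by their first dotted component. The sketch below is an illustration only: `top_level_prefixes` is a hypothetical helper, and the key lists are short stand-ins built from names in the report. With a real file, the keys would come from the dictionary returned by `torch.load(path, map_location="cpu")`, assuming the weights sit either at the top level or under a "state_dict" entry.

```python
from collections import Counter


def top_level_prefixes(keys):
    """Count parameter names by their first dotted component."""
    return Counter(key.split(".", 1)[0] for key in keys)


# Stand-in key lists; real ones would be read from a checkpoint file.
mmdet_style = [
    "backbone.conv1.weight",
    "neck.fpn_convs.0.conv.weight",
    "rpn_head.rpn_cls.bias",
]
mmtrack_style = ["detector." + key for key in mmdet_style]

print(top_level_prefixes(mmdet_style))
print(top_level_prefixes(mmtrack_style))
```

A detector-only checkpoint shows groups such as backbone, neck and rpn_head, while a tracker checkpoint shows a single detector group for the same weights.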


AtomScott commented 1 year ago

I encountered the same issue while trying to load a checkpoint file in my project. Here is the code that I used:

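import torch

path = "/path/to/checkpoint.pth"
checkpoint = torch.load(path, map_location="cpu")
# Released checkpoints usually keep the weights under "state_dict"; fall back to the top level otherwise.
state_dict = checkpoint.get("state_dict", checkpoint)
modified_checkpoint = {"state_dict": {f"detector.{k}": v for k, v in state_dict.items()}}
torch.save(modified_checkpoint, path.replace("checkpoint", "modified_checkpoint"))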

The code above loads the checkpoint file and saves a modified copy in which every key carries the "detector." prefix.
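The renaming step can also be tried on plain dictionaries before touching a real file. This is a sketch under two assumptions that are not confirmed in the thread: the weights are stored either at the top level or under a "state_dict" entry, and keys that already start with "detector." should be left alone. `add_prefix` is a hypothetical helper name, and the float values stand in for tensors.

```python
def add_prefix(checkpoint, prefix="detector."):
    """Return a new checkpoint dict whose weight names start with `prefix`."""
    # Assumption: weights live either under "state_dict" or at the top level.
    state_dict = checkpoint.get("state_dict", checkpoint)
    renamed = {
        # Leave keys that already carry the prefix unchanged.
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in state_dict.items()
    }
    return {"state_dict": renamed}


# Two toy layouts: one wrapped with metadata, one flat and partly prefixed.
wrapped = {"meta": {"epoch": 24}, "state_dict": {"backbone.conv1.weight": 0.5}}
flat = {
    "backbone.conv1.weight": 0.5,
    "detector.neck.fpn_convs.0.conv.bias": 0.1,
}

print(add_prefix(wrapped))
print(add_prefix(flat))
```

Skipping keys that already have the prefix makes the function safe to run twice on the same checkpoint.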