open-mmlab / mmtracking

OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
https://mmtracking.readthedocs.io/en/latest/
Apache License 2.0

How to select which classes of the detector model's outputs are fed into the reid model? #564

Closed Balakumaran-kandula closed 2 years ago

Balakumaran-kandula commented 2 years ago

I have trained a detector model in mmdetection with multiple classes. If I want to feed only the "person" class outputs from the detector model into the reid model during inference, can I do that through the config or by some other method?

Also, if I want to feed an mmdetection pretrained model into the tracker, what config changes have to be made?

Thank you in advance

pixeli99 commented 2 years ago

Hi, Balakumaran-kandula,

Thank you for your question.

For question 1, the current version does not support doing this through the config file alone; in my opinion, you need to modify some code. Taking ByteTrack as an example, you can filter the output of the detector here: https://github.com/open-mmlab/mmtracking/blob/c250394b8a9ca95dae2ad49efe2d92ae450f605a/mmtrack/models/mot/byte_track.py#L65 (see the sketch after the conversion script below).

For question 2, you need to modify the keys of the weight file. This is mainly because the keys (of the state dict) in mmtrack and mmdet weight files are different. Here is an example to fix it; in fact, you just need to prepend 'detector.' to the original keys.

import torch
from collections import OrderedDict

# Load the mmdet checkpoint and prepend 'detector.' to every key so that
# the key names match what the mmtrack model expects.
weight_dict = torch.load("your.pth")
weight_dict = weight_dict['state_dict']

new_dict = OrderedDict()
for k1, v1 in weight_dict.items():
    new_dict['detector.' + k1] = v1

torch.save(new_dict, 'new_track.pth')
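
For question 1, here is a minimal, self-contained sketch of the kind of class filtering meant above. It is not the actual ByteTrack code: in practice you would apply the same masking to the detector outputs inside the tracking model's simple_test before they reach the reid model / tracker, and PERSON_CLASS_ID must be adjusted to the index of 'person' in your detector's CLASSES.

import torch

# Hypothetical example: keep only "person" detections before they are
# passed on to the reid / tracking step.
PERSON_CLASS_ID = 0  # adjust to your own class order

det_bboxes = torch.tensor([[10., 10., 50., 90., 0.9],   # person
                           [20., 30., 60., 80., 0.8]])  # another class
det_labels = torch.tensor([0, 2])

keep = det_labels == PERSON_CLASS_ID
det_bboxes = det_bboxes[keep]
det_labels = det_labels[keep]
print(det_bboxes, det_labels)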

Hope this will help you. If you have any questions, please leave your comment here.🚀

Balakumaran-kandula commented 2 years ago

Thank you, the weight changes are made and detection is now happening, but the detections are random. Could this be something about the pipeline? I have tried both with the detector config from the loaded model and with the detector config inside mmtracking. (see attached screenshot)

pixeli99 commented 2 years ago

Could you paste the config.py here?

Balakumaran-kandula commented 2 years ago

The detector config is:

model = dict(
    detector=dict(
        type='FasterRCNN',
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            norm_cfg=dict(type='BN', requires_grad=True),
            norm_eval=True,
            style='pytorch',
            init_cfg=dict(
                type='Pretrained', checkpoint='torchvision://resnet50')),
        neck=dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            num_outs=5),
        rpn_head=dict(
            type='RPNHead',
            in_channels=256,
            feat_channels=256,
            anchor_generator=dict(
                type='AnchorGenerator',
                scales=[8],
                ratios=[0.5, 1.0, 2.0],
                strides=[4, 8, 16, 32, 64]),
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[.0, .0, .0, .0],
                target_stds=[1.0, 1.0, 1.0, 1.0]),
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
            loss_bbox=dict(
                type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
        roi_head=dict(
            type='StandardRoIHead',
            bbox_roi_extractor=dict(
                type='SingleRoIExtractor',
                roi_layer=dict(
                    type='RoIAlign', output_size=7, sampling_ratio=0),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            bbox_head=dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=3,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=False,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', loss_weight=1.0))),
        train_cfg=dict(
            rpn=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.7,
                    neg_iou_thr=0.3,
                    min_pos_iou=0.3,
                    match_low_quality=True,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RandomSampler',
                    num=256,
                    pos_fraction=0.5,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=False),
                allowed_border=-1,
                pos_weight=-1,
                debug=False),
            rpn_proposal=dict(
                nms_pre=2000,
                max_per_img=1000,
                nms=dict(type='nms', iou_threshold=0.7),
                min_bbox_size=0),
            rcnn=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)),
        test_cfg=dict(
            rpn=dict(
                nms_pre=1000,
                max_per_img=1000,
                nms=dict(type='nms', iou_threshold=0.7),
                min_bbox_size=0),
            rcnn=dict(
                score_thr=0.05,
                nms=dict(type='nms', iou_threshold=0.5),
                max_per_img=100))
        # soft-nms is also supported for rcnn testing
        # e.g., nms=dict(type='soft_nms', iou_threshold=0.5, min_score=0.05)
    ))

The tracker config is:

_base_ = [
    '../../_base_/models/faster_rcnn_r50_fpn.py',
    '../../_base_/datasets/mot_challenge.py',
    '../../_base_/default_runtime.py'
]

model = dict(
    type='Tracktor',
    detector=dict(
        rpn_head=dict(bbox_coder=dict(clip_border=False)),
        roi_head=dict(
            bbox_head=dict(bbox_coder=dict(clip_border=False), num_classes=1)),
        init_cfg=dict(
            type='Pretrained',
            checkpoint=  # noqa: E251
            "/tracking/mmtracking/demo/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car_20201216_173117-6eda6d92.pth"
            # 'https://download.openmmlab.com/mmtracking/mot/faster_rcnn/faster-rcnn_r50_fpn_4e_mot17-half-64ee2ed4.pth'  # noqa: E501
        )),
    reid=dict(
        type='BaseReID',
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(3, ),
            style='pytorch'),
        neck=dict(type='GlobalAveragePooling', kernel_size=(8, 4), stride=1),
        head=dict(
            type='LinearReIDHead',
            num_fcs=1,
            in_channels=2048,
            fc_channels=1024,
            out_channels=128,
            num_classes=380,
            loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
            loss_pairwise=dict(
                type='TripletLoss', margin=0.3, loss_weight=1.0),
            norm_cfg=dict(type='BN1d'),
            act_cfg=dict(type='ReLU')),
        init_cfg=dict(
            type='Pretrained',
            checkpoint=  # noqa: E251
            'https://download.openmmlab.com/mmtracking/mot/reid/reid_r50_6e_mot17-4bf6b63d.pth'  # noqa: E501
        )),
    motion=dict(
        type='CameraMotionCompensation',
        warp_mode='cv2.MOTION_EUCLIDEAN',
        num_iters=100,
        stop_eps=0.00001),
    tracker=dict(
        type='TracktorTracker',
        obj_score_thr=0.5,
        regression=dict(
            obj_score_thr=0.5,
            nms=dict(type='nms', iou_threshold=0.6),
            match_iou_thr=0.3),
        reid=dict(
            num_samples=10,
            img_scale=(256, 128),
            img_norm_cfg=None,
            match_score_thr=2.0,
            match_iou_thr=0.2),
        momentums=None,
        num_frames_retain=10))

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=100,
    warmup_ratio=1.0 / 100,
    step=[3])

# runtime settings
total_epochs = 4
evaluation = dict(metric=['bbox', 'track'], interval=1)
search_metrics = ['MOTA', 'IDF1', 'FN', 'FP', 'IDs', 'MT', 'ML']

pixeli99 commented 2 years ago

There seems to be no problem with the configuration file. I suspect it is an mmdet version problem. Please make sure that the mmdet version used to train your pre-trained model is consistent with the mmdet version installed in the mmtrack environment.
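
For example, a rough way to check this (the 'meta' key is an assumption; not every checkpoint records it):

import torch
import mmcv
import mmdet

# Versions installed in the current (mmtrack) environment.
print('mmcv:', mmcv.__version__)
print('mmdet:', mmdet.__version__)

# Many mmdet checkpoints also record the training-time version in their
# 'meta' dict; if yours does, compare it with the version printed above.
ckpt = torch.load('your.pth', map_location='cpu')
print(ckpt.get('meta', {}).get('mmdet_version', 'not recorded'))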

pixeli99 commented 2 years ago

You can follow the above method to confirm whether the names of each key are consistent. It is better to also confirm the shape of each value at the same time.
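
For instance, a minimal sketch of such a check, assuming you compare your converted checkpoint against a known-good mmtrack checkpoint (the file names are placeholders):

import torch

# Compare key names and tensor shapes between the converted checkpoint
# and a reference mmtrack checkpoint (paths are placeholders).
converted = torch.load('new_track.pth', map_location='cpu')
reference = torch.load('reference_mmtrack.pth', map_location='cpu')

conv_sd = converted.get('state_dict', converted)
ref_sd = reference.get('state_dict', reference)

for key, ref_val in ref_sd.items():
    if key not in conv_sd:
        print('missing in converted checkpoint:', key)
    elif conv_sd[key].shape != ref_val.shape:
        print('shape mismatch:', key,
              tuple(conv_sd[key].shape), tuple(ref_val.shape))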