open-mmlab / mmtracking

OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
https://mmtracking.readthedocs.io/en/latest/
Apache License 2.0

Questions about multi-class MOT with ByteTrack/QDTrack #491

AndrewGuo0930 opened 2 years ago

AndrewGuo0930 commented 2 years ago

Hi! Could you please tell me how I can use ByteTrack or QDTrack to implement multi-class MOT? The existing configurations seem to cover only one class (pedestrian). My customized dataset has 4 classes, and I've already converted the annotations to CocoVID format. I've also trained a detector (Faster R-CNN) on my dataset and saved the detection results as .json or .pkl. How can I use these results or the trained model for a multi-class MOT task with the tools MMTracking provides? This is very important for me since my work is not about pedestrian MOT. Thank you so much!

Seerkfang commented 2 years ago

In the original paper, ByteTrack only supports pedestrian tracking, and there is no handling of different classes, so you would have to modify the code yourself. Also, if you don't change the official training setup, the training results may not be ideal for multi-class scenarios. Similarly, the supported QDTrack only works on MOT17. However, the unfinished PR contains methods like inter-class NMS and class-specific tracking; you can see the tracker file here: https://github.com/open-mmlab/mmtracking/pull/465/files. The tracker's logic has been tested to be correct.
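For intuition, here is a minimal sketch of the class-specific tracking idea (hypothetical helper names, not the actual code from that PR): detections are partitioned by label, and association runs independently within each class.

```python
import numpy as np

def track_per_class(tracker, bboxes, labels, scores, frame_id):
    """Run association separately for each class.

    bboxes: (N, 4), labels: (N,), scores: (N,). `tracker.assign` is a
    hypothetical per-class association step, not a real MMTracking API.
    """
    results = []
    for cls in np.unique(labels):
        mask = labels == cls
        # Each class keeps its own track pool, so e.g. a car detection
        # is never matched against a pedestrian track.
        results.append(tracker.assign(bboxes[mask], scores[mask],
                                      class_id=int(cls),
                                      frame_id=frame_id))
    return results
```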

AndrewGuo0930 commented 2 years ago

So the tracker's logic for TAO has been tested to be correct. What about the rest of the code? If I want to implement multi-class MOT on my own dataset, can I just modify the code following the sample here: https://github.com/open-mmlab/mmtracking/pull/465/files? One more question, as I'm new to MOT and MMTracking: why do we need to write different code for different datasets? Couldn't we convert different datasets into one general format and use a single code path to finish the tasks? Thank you for your patience in answering my questions!

Seerkfang commented 2 years ago

You need to understand how it works for multiple classes. There are some class-aware operations in the code, and I think that's what you need. The training part of QDTrack on TAO still shows some difference (0.8 track AP) compared to the official results.

As for your second question, you should look at which functions we've implemented for each dataset; a single general format is not ideal. That said, the MOT datasets all inherit from CocoVideoDataset.
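Roughly, pointing a config at a custom 4-class CocoVID dataset looks like this (a sketch only; paths and class names are placeholders, and pipelines are omitted):

```python
# Sketch of a custom 4-class dataset config for MMTracking.
dataset_type = 'CocoVideoDataset'
classes = ('class_a', 'class_b', 'class_c', 'class_d')  # placeholders
data = dict(
    train=dict(
        type=dataset_type,
        classes=classes,
        ann_file='data/my_dataset/annotations/train_cocovid.json',
        img_prefix='data/my_dataset/train/'),
    val=dict(
        type=dataset_type,
        classes=classes,
        ann_file='data/my_dataset/annotations/val_cocovid.json',
        img_prefix='data/my_dataset/val/'))
```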

AndrewGuo0930 commented 2 years ago

Hi @Seerkfang! I have an idea, but I don't know if it will work. Earlier I ran into some problems working on a multi-class MOT task. Can I train one model per class in my dataset (4 in total)? I would use my pretrained detector as a public detector for all 4 classes, train 4 trackers (one per class) with DeepSORT/ByteTrack/QDTrack, and finally merge all the results for testing or visualization. Could this pipeline work?
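For the merge step, I imagine something like this (just a sketch; the result row format below is my own assumption):

```python
def merge_per_class_results(per_class_results):
    """Merge results from class-specific trackers.

    Assumed row format: [frame_id, track_id, x1, y1, x2, y2, score];
    `per_class_results` is one list of rows per class.
    """
    merged, offset = [], 0
    for cls_id, rows in enumerate(per_class_results):
        max_id = -1
        for frame_id, track_id, *rest in rows:
            # Offset track IDs so IDs from different class-specific
            # trackers never collide in the merged output.
            merged.append([frame_id, track_id + offset, cls_id, *rest])
            max_id = max(max_id, track_id)
        offset += max_id + 1
    return sorted(merged)
```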

AndrewGuo0930 commented 2 years ago

Another question: when I tried to use ATSS as the detector of QDTrack, it reported an error about ATSS having with_rpn=False. Does this mean ATSS can't be used as the detector for QDTrack?

noahcao commented 2 years ago

@AndrewGuo0930 I added support for multi-class tracking with ByteTrack in PR #548; you may want to migrate its logic to other trackers.
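The core idea behind class-aware association, as a rough sketch (not the exact diff from that PR), is to forbid cross-class matches in the cost matrix before assignment:

```python
import numpy as np

def class_aware_costs(costs, track_labels, det_labels):
    """costs: (T, D) association cost matrix; labels: (T,) and (D,)."""
    costs = costs.copy()
    # Cross-class track/detection pairs get an infinite cost, so the
    # assignment step can never match a track to a different class.
    costs[track_labels[:, None] != det_labels[None, :]] = np.inf
    return costs
```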

noahcao commented 2 years ago

@AndrewGuo0930 Usually each issue is for one question only. For other problems, please open new issues with proper titles. Thanks.

AndrewGuo0930 commented 2 years ago

Oh thank you so much! I'll try it. @noahcao