open-mmlab / mmtracking

OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
https://mmtracking.readthedocs.io/en/latest/
Apache License 2.0
3.58k stars 598 forks

Could mmtracking support anchor-free detectors for VID #606

Open iamweiweishi opened 2 years ago

iamweiweishi commented 2 years ago

Currently, SELSA uses Faster R-CNN as the main structure for video detection. Anchor-free detectors, like YOLOX, are showing powerful performance. I wonder whether I could use YOLOX as the main network structure for VID?

JingweiZhang12 commented 2 years ago

SELSA only supports two-stage detectors.

Chop1 commented 2 years ago

> SELSA only supports two-stage detectors.

This is because the SELSA module is designed to aggregate proto-detections (i.e. the proposals produced by the first stage of a two-stage detector): it weights the proto-detections of the reference frames based on the target frame's proto-detections, then aggregates them to obtain richer and more resilient features. By weighting the reference-frame proto-detections according to the target proto-detections, you basically ensure that only detections of the same object instance contribute to the aggregation.
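
To make the weighting idea concrete, here is a minimal, dependency-free sketch of similarity-weighted aggregation. It is not the mmtracking implementation: the function names, the use of cosine similarity followed by a softmax, and the `scale` parameter are all assumptions chosen for illustration, and each proto-detection is assumed to be represented by a plain feature vector.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(target_feats, ref_feats, scale=10.0):
    """Illustrative only: for each target proposal feature, return a
    weighted average of the reference proposal features, where the weights
    come from the cosine similarity to that target proposal.

    `scale` is a hypothetical sharpening factor, not a SELSA parameter.
    """
    ref_unit = [l2_normalize(r) for r in ref_feats]
    dim = len(ref_feats[0])
    aggregated = []
    for target in target_feats:
        t_unit = l2_normalize(target)
        # Cosine similarity between this target proposal and every reference proposal.
        sims = [sum(a * b for a, b in zip(t_unit, r)) for r in ref_unit]
        weights = softmax([scale * s for s in sims])
        # Similar reference proposals dominate the weighted average.
        aggregated.append(
            [sum(w * r[d] for w, r in zip(weights, ref_feats)) for d in range(dim)]
        )
    return aggregated


# One target proposal and two reference proposals; only the first
# reference points in the same direction as the target.
print(aggregate([[1.0, 0.0]], [[2.0, 0.0], [0.0, 3.0]]))
```

In this toy example the dissimilar reference proposal receives a near-zero weight, so the aggregated feature is dominated by the reference proposal that resembles the target. A scheme like this needs a set of per-proposal features to compare, which is what the first stage of a two-stage detector provides.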