ymzis69 / HybridSORT

[AAAI2024]Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking
MIT License

Hybrid-SORT


Hybrid-SORT is a simple and strong multi-object tracker.

Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking

Abstract

Multi-Object Tracking (MOT) aims to detect and associate all desired objects across frames. Most methods accomplish the task by explicitly or implicitly leveraging strong cues (i.e., spatial and appearance information), which exhibit powerful instance-level discrimination. However, when object occlusion and clustering occur, both spatial and appearance information will become ambiguous simultaneously due to the high overlap between objects. In this paper, we demonstrate that this long-standing challenge in MOT can be efficiently and effectively resolved by incorporating weak cues to compensate for strong cues. Along with velocity direction, we introduce the confidence state and height state as potential weak cues. With superior performance, our method still maintains Simple, Online and Real-Time (SORT) characteristics. Furthermore, our method shows strong generalization for diverse trackers and scenarios in a plug-and-play and training-free manner. Significant and consistent improvements are observed when applying our method to 5 different representative trackers. Further, by leveraging both strong and weak cues, our method Hybrid-SORT achieves superior performance on diverse benchmarks, including MOT17, MOT20, and especially DanceTrack where interaction and occlusion are frequent and severe.
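To make the idea of a height cue concrete, the sketch below scales a standard IoU by the 1D overlap of two boxes along the vertical axis. This is an illustrative assumption about how a height cue could be combined with a spatial cue, not code taken from this repository; the function names `iou`, `height_iou`, and `height_modulated_iou` are hypothetical. Refer to the paper for the exact formulation.

```python
def iou(a, b):
    """Standard IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def height_iou(a, b):
    """1D IoU of the vertical extents [y1, y2] of two boxes."""
    inter = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    union = max(a[3], b[3]) - min(a[1], b[1])
    return inter / union if union > 0 else 0.0


def height_modulated_iou(a, b):
    """Assumed combination: spatial IoU scaled by the height overlap."""
    return iou(a, b) * height_iou(a, b)


if __name__ == "__main__":
    tall = (0, 0, 10, 20)
    short = (0, 0, 10, 10)
    # Two boxes that overlap spatially but differ in height score lower
    # than the plain IoU alone would suggest.
    print(iou(tall, short), height_modulated_iou(tall, short))
```

In this sketch, boxes of clearly different heights are penalized even when they overlap heavily, which is the situation the abstract describes for occluded or clustered objects.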

Highlights

Pipeline

News

Tracking performance

Results on DanceTrack test set

Tracker HOTA MOTA IDF1 FPS
OC-SORT 54.6 89.6 54.6 30.3
Hybrid-SORT 62.2 91.6 63.0 27.8
Hybrid-SORT-ReID 65.7 91.8 67.4 15.5

Results on MOT20 challenge test set

Tracker HOTA MOTA IDF1
OC-SORT 62.1 75.5 75.9
Hybrid-SORT 62.5 76.4 76.2
Hybrid-SORT-ReID 63.9 76.7 78.4

Results on MOT17 challenge test set

Tracker HOTA MOTA IDF1
OC-SORT 63.2 78.0 77.5
Hybrid-SORT 63.6 79.3 78.4
Hybrid-SORT-ReID 64.0 79.9 78.7

Installation

The Hybrid-SORT code is based on OC-SORT. The ReID component is optional and based on FastReID. The code was tested with Python 3.8, PyTorch 1.10.0, and torchvision 0.11.0.
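If you want to compare your environment against the tested setup, a small check like the following may help. It is only a convenience sketch: the helper names are hypothetical, and other versions may work as well.

```python
from importlib import metadata

# Versions this README reports as tested; other versions may also work.
TESTED = {"torch": "1.10.0", "torchvision": "0.11.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_tested(installed, tested):
    """Compare versions while ignoring local build tags such as '+cu113'."""
    return installed is not None and installed.split("+")[0] == tested


def report():
    """Print one line per package comparing installed and tested versions."""
    for package, tested in TESTED.items():
        found = installed_version(package)
        status = "ok" if matches_tested(found, tested) else "differs from tested setup"
        print(f"{package}: installed={found}, tested={tested} ({status})")


if __name__ == "__main__":
    report()
```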

Step1. Install Hybrid-SORT.

git clone https://github.com/ymzis69/HybridSORT.git
cd HybridSORT
pip3 install -r requirements.txt
python3 setup.py develop

Step2. Install pycocotools.

pip3 install cython; pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

Step3. Install other dependencies.

pip3 install cython_bbox pandas xmltodict

Step4. [optional] FastReID Installation

You can refer to FastReID Installation.

pip install -r fast_reid/docs/requirements.txt
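After the steps above, you can confirm that the extra packages are importable. The snippet below is a minimal sketch; the module list mirrors the packages installed in Step2 and Step3, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Import names for the packages installed in Step2 and Step3.
REQUIRED_MODULES = ["pycocotools", "cython_bbox", "pandas", "xmltodict"]


def missing_modules(modules):
    """Return the modules that cannot be located in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("All dependencies found." if not missing else f"Missing: {missing}")
```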

Data preparation

Our data structure is the same as OC-SORT.

  1. Download MOT17, MOT20, CrowdHuman, Cityperson, ETHZ, DanceTrack, and CUHKSYSU, and put them under /datasets in the following structure (CrowdHuman, Cityperson, and ETHZ are not needed if you download the YOLOX weights from ByteTrack or OC-SORT):

    datasets
    |——————mot
    |        └——————train
    |        └——————test
    └——————crowdhuman
    |        └——————Crowdhuman_train
    |        └——————Crowdhuman_val
    |        └——————annotation_train.odgt
    |        └——————annotation_val.odgt
    └——————MOT20
    |        └——————train
    |        └——————test
    └——————Cityscapes
    |        └——————images
    |        └——————labels_with_ids
    └——————ETHZ
    |        └——————eth01
    |        └——————...
    |        └——————eth07
    └——————CUHKSYSU
    |        └——————images
    |        └——————labels_with_ids
    └——————dancetrack        
            └——————train
               └——————train_seqmap.txt
            └——————val
               └——————val_seqmap.txt
            └——————test
               └——————test_seqmap.txt
  2. Prepare DanceTrack dataset:

    # replace "dance" with ethz/mot17/mot20/crowdhuman/cityperson/cuhk for others
    python3 tools/convert_dance_to_coco.py 
  3. Prepare MOT17/MOT20 dataset.

    # build mixed training sets for MOT17 and MOT20 
    python3 tools/mix_data_{ablation/mot17/mot20}.py
  4. [optional] Prepare ReID datasets:

    cd <HYBRIDSORT_HOME>
    
    # For MOT17
    python3 fast_reid/datasets/generate_mot_patches.py --data_path <datasets_dir> --mot 17
    
    # For MOT20
    python3 fast_reid/datasets/generate_mot_patches.py --data_path <datasets_dir> --mot 20
    
    # For DanceTrack
    python3 fast_reid/datasets/generate_cuhksysu_dance_patches.py --data_path <datasets_dir>
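
Before running the conversion scripts, it can be useful to verify that the dataset folders from step 1 are in place. The following is a minimal sketch: the relative paths are copied from the directory tree in step 1 (only the tracking benchmarks are listed), and `missing_paths` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# Relative paths taken from the directory tree in step 1; extend as needed.
EXPECTED = [
    "mot/train",
    "mot/test",
    "MOT20/train",
    "MOT20/test",
    "dancetrack/train",
    "dancetrack/val",
    "dancetrack/test",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected sub-paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # "datasets" is an assumed location relative to the current directory.
    for rel in missing_paths("datasets"):
        print(f"missing: {rel}")
```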

Model Zoo

Download the trained models and store them in the 'pretrained' folder as follows:

<HYBRIDSORT_HOME>/pretrained

Detection Model

We provide some pretrained YOLOX weights for Hybrid-SORT, which are inherited from ByteTrack.

Dataset HOTA IDF1 MOTA Model
DanceTrack-val 59.3 60.6 89.5 Google Drive
DanceTrack-test 62.2 63.0 91.6 Google Drive
MOT17-half-val 67.1 78.0 75.8 Google Drive
MOT17-test 63.6 78.7 79.9 Google Drive
MOT20-test 62.5 78.4 76.7 Google Drive

ReID Model

Our ReID models for MOT17/MOT20 are the same as those of BoT-SORT; you can download them from MOT17-SBS-S50 and MOT20-SBS-S50. The ReID model for DanceTrack was trained by ourselves; you can download it from DanceTrack.

Notes:

Training

Train the Detection Model

You can use Hybrid-SORT without training by adopting existing detectors. However, we borrow the training guidelines from ByteTrack in case you want to work on your own detector.

Download the COCO-pretrained YOLOX weight here and put it under <HYBRIDSORT_HOME>/pretrained.

Train the ReID Model

After generating the MOT ReID dataset as described in the 'Data preparation' section, train the ReID model:

cd <HYBRIDSORT_HOME>

# For training MOT17 
python3 fast_reid/tools/train_net.py --config-file ./fast_reid/configs/MOT17/sbs_S50.yml MODEL.DEVICE "cuda:0"

# For training MOT20
python3 fast_reid/tools/train_net.py --config-file ./fast_reid/configs/MOT20/sbs_S50.yml MODEL.DEVICE "cuda:0"

# For DanceTrack, we train the ReID model jointly on CUHKSYSU and DanceTrack
python3 fast_reid/tools/train_net.py --config-file ./fast_reid/configs/CUHKSYSU_DanceTrack/sbs_S50.yml MODEL.DEVICE "cuda:0"

Refer to the FastReID repository for additional explanations and options.

Tracking

Notes:

DanceTrack

dancetrack-val dataset

# Hybrid-SORT
python tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_dancetrack_val_hybrid_sort.py -b 1 -d 1 --fp16 --fuse --expn $exp_name 

# Hybrid-SORT-ReID
python tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_dancetrack_val_hybrid_sort_reid.py -b 1 -d 1 --fp16 --fuse --expn $exp_name

dancetrack-test dataset

# Hybrid-SORT
python tools/run_hybrid_sort_dance.py --test -f exps/example/mot/yolox_dancetrack_test_hybrid_sort.py -b 1 -d 1 --fp16 --fuse --expn $exp_name

# Hybrid-SORT-ReID
python tools/run_hybrid_sort_dance.py --test -f exps/example/mot/yolox_dancetrack_test_hybrid_sort_reid.py -b 1 -d 1 --fp16 --fuse --expn $exp_name

MOT20

MOT20-test dataset

# Hybrid-SORT
python tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_x_mix_mot20_ch_hybrid_sort.py -b 1 -d 1 --fuse --mot20 --expn $exp_name

# Hybrid-SORT-ReID
python tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_x_mix_mot20_ch_hybrid_sort_reid.py -b 1 -d 1 --fuse --mot20 --expn $exp_name

Hybrid-SORT is designed for online tracking, but offline interpolation has proven effective in many cases and is used by other online trackers. To reproduce our results on the MOT20-test dataset, apply linear interpolation to the existing tracking results:

# offline post-processing
python3 tools/interpolation.py $result_path $save_path
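
For readers unfamiliar with this post-processing step, the idea of linear interpolation over a track can be sketched as follows. This is a simplified illustration, not the code in `tools/interpolation.py`: the function name, the `(x, y, w, h)` box layout, and the `max_gap` default of 20 frames are assumptions made for the example.

```python
def interpolate_track(boxes, max_gap=20):
    """Fill missing frames of one track by linear interpolation.

    boxes maps frame index -> (x, y, w, h). Gaps longer than max_gap frames
    are left untouched. Returns a new dict that includes the filled frames.
    """
    filled = dict(boxes)
    frames = sorted(boxes)
    for prev, nxt in zip(frames, frames[1:]):
        gap = nxt - prev
        if gap <= 1 or gap > max_gap:
            continue  # nothing missing, or the gap is too long to trust
        start, end = boxes[prev], boxes[nxt]
        for frame in range(prev + 1, nxt):
            t = (frame - prev) / gap  # fraction of the way from prev to nxt
            filled[frame] = tuple(s + t * (e - s) for s, e in zip(start, end))
    return filled


if __name__ == "__main__":
    # The track is observed at frames 1 and 5; frames 2 to 4 are filled in.
    track = {1: (0, 0, 10, 20), 5: (8, 4, 10, 20)}
    for frame, box in sorted(interpolate_track(track).items()):
        print(frame, box)
```

Because the filled boxes depend on a later observation, this step only applies offline, after the online tracker has produced its results.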

MOT17

MOT17-val dataset

# Hybrid-SORT
python3 tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_x_ablation_hybrid_sort.py -b 1 -d 1 --fuse --expn $exp_name 

# Hybrid-SORT-ReID
python3 tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_x_ablation_hybrid_sort_reid.py -b 1 -d 1 --fuse --expn  $exp_name 

MOT17-test dataset

# Hybrid-SORT
python3 tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_x_mix_det_hybrid_sort.py -b 1 -d 1 --fuse --expn $exp_name

# Hybrid-SORT-ReID
python3 tools/run_hybrid_sort_dance.py -f exps/example/mot/yolox_x_mix_det_hybrid_sort_reid.py -b 1 -d 1 --fuse --expn $exp_name

Hybrid-SORT is designed for online tracking, but offline interpolation has proven effective in many cases and is used by other online trackers. To reproduce our results on the MOT17-test dataset, apply linear interpolation to the existing tracking results:

# offline post-processing
python3 tools/interpolation.py $result_path $save_path
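
The tracking commands in this section share one pattern and differ only in the experiment file and a few flags. The sketch below collects them in a lookup table so that a command can be assembled programmatically. The experiment file names and flags are copied from the commands above; the `EXPERIMENTS` table and `tracking_command` helper are hypothetical and not part of this repository.

```python
# (benchmark, split) -> (experiment file stem, extra flags), as listed above.
EXPERIMENTS = {
    ("dancetrack", "val"): ("yolox_dancetrack_val_hybrid_sort", ["--fp16", "--fuse"]),
    ("dancetrack", "test"): ("yolox_dancetrack_test_hybrid_sort", ["--test", "--fp16", "--fuse"]),
    ("mot20", "test"): ("yolox_x_mix_mot20_ch_hybrid_sort", ["--fuse", "--mot20"]),
    ("mot17", "val"): ("yolox_x_ablation_hybrid_sort", ["--fuse"]),
    ("mot17", "test"): ("yolox_x_mix_det_hybrid_sort", ["--fuse"]),
}


def tracking_command(benchmark, split, exp_name, reid=False):
    """Build the argument list for one tracking run (does not execute it)."""
    stem, flags = EXPERIMENTS[(benchmark, split)]
    if reid:
        stem += "_reid"  # the ReID variants use the same name plus this suffix
    return [
        "python3", "tools/run_hybrid_sort_dance.py",
        "-f", f"exps/example/mot/{stem}.py",
        "-b", "1", "-d", "1", *flags,
        "--expn", exp_name,
    ]


if __name__ == "__main__":
    print(" ".join(tracking_command("dancetrack", "val", "my_run")))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root if you prefer launching runs from Python.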

Demo

Hybrid-SORT, with the parameter settings of the dancetrack-val dataset

python3 tools/demo_track.py --demo_type image -f exps/example/mot/yolox_dancetrack_val_hybrid_sort.py -c pretrained/ocsort_dance_model.pth.tar --path ./datasets/dancetrack/val/dancetrack0079/img1 --fp16 --fuse --save_result

Hybrid-SORT-ReID, with the parameter settings of the dancetrack-val dataset

python3 tools/demo_track.py --demo_type image -f exps/example/mot/yolox_dancetrack_val_hybrid_sort_reid.py -c pretrained/ocsort_dance_model.pth.tar --path ./datasets/dancetrack/val/dancetrack0079/img1 --fp16 --fuse --save_result

TCM on other trackers

Download the ReID weights googlenet_part8_all_xavier_ckpt_56.h5 for MOTDT and DeepSORT.

dancetrack-val dataset

# SORT
python tools/run_sort_dance.py -f exps/example/mot/yolox_dancetrack_val.py -c pretrained/bytetrack_dance_model.pth.tar -b 1 -d 1 --fp16 --fuse --dataset dancetrack --expn sort_score_kalman_fir_step --TCM_first_step

# MOTDT
python3 tools/run_motdt_dance.py -f exps/example/mot/yolox_dancetrack_val.py -c pretrained/bytetrack_dance_model.pth.tar -b 1 -d 1 --fp16 --fuse --dataset dancetrack --expn motdt_score_kalman_fir_step --TCM_first_step

# ByteTrack
python3 tools/run_byte_dance.py -f exps/example/mot/yolox_dancetrack_val.py -c pretrained/bytetrack_dance_model.pth.tar -b 1 -d 1 --fp16 --fuse --dataset dancetrack --expn byte_score_kalman_fir_step --TCM_first_step

# DeepSORT
python3 tools/run_deepsort_dance.py -f exps/example/mot/yolox_dancetrack_val.py -c pretrained/bytetrack_dance_model.pth.tar -b 1 -d 1 --fp16 --fuse --dataset dancetrack --expn deepsort_score_kalman_fir_step --TCM_first_step

mot17-val dataset

# SORT
python3 tools/run_sort.py -f exps/example/mot/yolox_x_ablation.py -c pretrained/ocsort_mot17_ablation.pth.tar -b 1 -d 1 --fuse --expn mot17_sort_score_test_fp32 --TCM_first_step

# MOTDT
python3 tools/run_motdt.py -f exps/example/mot/yolox_x_ablation.py -c pretrained/ocsort_mot17_ablation.pth.tar -b 1 -d 1 --fuse --expn mot17_motdt_score_test_fp32 --TCM_first_step

# ByteTrack
python3 tools/run_byte.py -f exps/example/mot/yolox_x_ablation.py -c pretrained/ocsort_mot17_ablation.pth.tar -b 1 -d 1 --fuse --expn mot17_byte_score_test_fp32 --TCM_first_step --TCM_first_step_weight 0.6

# DeepSORT
python3 tools/run_deepsort.py -f exps/example/mot/yolox_x_ablation.py -c pretrained/ocsort_mot17_ablation.pth.tar -b 1 -d 1 --fuse --expn mot17_deepsort_score_test_fp32 --TCM_first_step
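
The `--TCM_first_step` and `--TCM_first_step_weight` flags above enable the confidence cue in the first association step of each tracker. How the scripts apply the weight internally is not described in this README. The sketch below is one plausible reading, assuming the cue adds a weighted penalty for the confidence gap between a track and a detection on top of an IoU cost. All names here are hypothetical, the default `weight` of 1.0 is an assumption, and the exhaustive matcher stands in for the Hungarian algorithm only to keep the example dependency-free.

```python
from itertools import permutations


def association_cost(iou_matrix, track_conf, det_conf, weight=1.0):
    """Cost[i][j] = (1 - IoU[i][j]) + weight * |track_conf[i] - det_conf[j]|."""
    return [
        [(1.0 - iou_matrix[i][j]) + weight * abs(track_conf[i] - det_conf[j])
         for j in range(len(det_conf))]
        for i in range(len(track_conf))
    ]


def best_assignment(cost):
    """Exhaustive minimum-cost matching for a small square cost matrix."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))


if __name__ == "__main__":
    # Two overlapping objects: IoU alone cannot tell the pairs apart.
    iou = [[0.5, 0.5], [0.5, 0.5]]
    cost = association_cost(iou, track_conf=[0.9, 0.4], det_conf=[0.45, 0.85], weight=0.6)
    # The confidence term breaks the tie: track 0 -> det 1, track 1 -> det 0.
    print(best_assignment(cost))
```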

Citation

If you find this work useful, please consider citing our paper:

@inproceedings{yang2024hybrid,
  title={Hybrid-sort: Weak cues matter for online multi-object tracking},
  author={Yang, Mingzhan and Han, Guangxin and Yan, Bin and Zhang, Wenhua and Qi, Jinqing and Lu, Huchuan and Wang, Dong},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={7},
  pages={6504--6512},
  year={2024}
}

Acknowledgement

A large part of the code is borrowed from YOLOX, OC-SORT, ByteTrack, BoT-SORT and FastReID. Many thanks for their wonderful work.