Our work is accepted by the ACCV 2024 MLCSA Workshop.
Read the paper on arXiv.
The following videos demonstrate the tracking performance on a SoccerNet sequence. Notice how the referee in the yellow jersey is correctly identified after applying the GTA refinement process.
This project introduces a universal, model-agnostic method for refining and enhancing tracklet association in single-camera Multi-Object Tracking (MOT). The method was developed primarily for datasets such as SportsMOT and SoccerNet but is applicable to any MOT dataset. It is designed as an offline post-processing tool.
Our method is model-agnostic, meaning it operates independently of the tracking models used to generate the initial data. It only requires the tracking results from these models as text files in the standard MOT format, and it performs tracklet refinement through two main components: the tracklet splitter and the tracklet connector.
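As a minimal sketch (not the project's code), tracking results in this format can be grouped into per-ID tracklets like so; the column layout follows the standard MOT convention (frame, track ID, box left/top/width/height, confidence, then three unused fields):

```python
# Illustrative sketch: grouping MOT-format tracking results into tracklets.
from collections import defaultdict

def load_mot(lines):
    """Group detections by track ID: one list of boxes per tracklet."""
    tracklets = defaultdict(list)
    for line in lines:
        frame, tid, x, y, w, h, conf = map(float, line.split(",")[:7])
        tracklets[int(tid)].append((int(frame), x, y, w, h, conf))
    return dict(tracklets)

demo = [
    "1,1,912.0,484.0,97.0,109.0,0.9,-1,-1,-1",
    "2,1,915.0,486.0,97.0,109.0,0.9,-1,-1,-1",
    "1,2,100.0,200.0,50.0,80.0,0.8,-1,-1,-1",
]
tracklets = load_mot(demo)
print(len(tracklets))     # 2 track IDs
print(len(tracklets[1]))  # track 1 spans 2 frames
```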
The tracklet splitter addresses the issue of impure tracklets that contain multiple identities. It uses DBSCAN, an unsupervised density-based spatial clustering algorithm, to cluster the instances within a tracklet by their deep feature embeddings and to detect whether an ID switch has occurred.
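The idea can be sketched with scikit-learn's DBSCAN on synthetic embeddings. The `eps` and `min_samples` values mirror the defaults listed below, but the toy data and the normalization step are illustrative assumptions, not the project's implementation:

```python
# Illustrative sketch: detecting an ID switch inside one tracklet by
# clustering its per-frame embeddings with DBSCAN.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Toy tracklet: 40 frames from identity A, then 40 frames from identity B.
emb_a = rng.normal(loc=0.0, scale=0.05, size=(40, 128)) + 1.0
emb_b = rng.normal(loc=0.0, scale=0.05, size=(40, 128)) - 1.0
embeddings = np.vstack([emb_a, emb_b])
# L2-normalize so Euclidean eps roughly tracks cosine distance.
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

labels = DBSCAN(eps=0.6, min_samples=10).fit_predict(embeddings)
n_clusters = len(set(labels) - {-1})
print(n_clusters)  # 2 clusters -> the tracklet likely contains an ID switch
```

If more than one cluster is found, the tracklet is split into one sub-tracklet per cluster.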
After the splitter processes the tracklets, the connector component iteratively merges tracklet pairs based on the cosine distance between their instances' deep feature embeddings, averaged over all instance pairs.
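A minimal sketch of this pairwise score, assuming each tracklet is stored as a matrix of per-frame embeddings (the data and function name here are illustrative, not the project's API):

```python
# Illustrative sketch: averaged cosine distance between two tracklets.
import numpy as np

def mean_cosine_distance(trk_a, trk_b):
    """Average cosine distance over all embedding pairs of two tracklets."""
    a = trk_a / np.linalg.norm(trk_a, axis=1, keepdims=True)
    b = trk_b / np.linalg.norm(trk_b, axis=1, keepdims=True)
    # Cosine distance = 1 - cosine similarity, averaged over all pairs.
    return float(np.mean(1.0 - a @ b.T))

rng = np.random.default_rng(1)
same_id = rng.normal(1.0, 0.05, (20, 64))
other_id = rng.normal(-1.0, 0.05, (15, 64))

print(mean_cosine_distance(same_id, same_id) < 0.4)   # True: merge candidates
print(mean_cosine_distance(same_id, other_id) < 0.4)  # False: keep separate
```

Pairs whose averaged distance falls below the merge threshold are joined into a single tracklet.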
This refinement tool enhances tracking results from any tracker on any MOT dataset. Tracklet feature embeddings are generated with the ReID model OSNet, trained on the SportsMOT dataset; you can also train and load your own ReID model for tracklet generation. While the splitting and connecting components are training-free, both expose clustering parameters that can be tuned.
Clone the project repository:

```shell
git clone https://github.com/sjc042/gta-link
cd gta-link
```

Create and activate a Python 3.8 environment:

```shell
conda create -n gta_link python=3.8
conda activate gta_link
```

Install the required packages from requirements.txt:

```shell
pip install -r requirements.txt
```

Install PyTorch (the project was tested with PyTorch 2.3.0 and CUDA 11.8; modify the command to match your CUDA version):

```shell
pip install torch torchvision torchaudio
```
Install torchreid, the OSNet-based ReID library, from the project root (gta-link):

```shell
mkdir reid
cd reid
git clone https://github.com/KaiyangZhou/deep-person-reid.git
cd deep-person-reid
python setup.py develop
```
NOTE: Make sure to replace the `{}` placeholders with actual values when running the commands.
Generate tracklets with your own tracking results:
```shell
python generate_tracklets.py --model_path {reid model weight} \
                             --data_path {dataset directory} \
                             --pred_dir {tracking results directory} \
                             --tracker {tracker name}
```
- `--model_path`: Path to the ReID model's checkpoint file (default is `../reid_checkpoints/sports_model.pth.tar-60`).
- `--data_path`: Directory of the dataset's split data (e.g., `SoccerNet/tracking-2023/test`).
- `--pred_dir`: Directory containing the tracking-result `.txt` files, organized as shown below:

```
pred_dir (e.g., DeepEIoU_Baseline)
├── seq1.txt
├── seq2.txt
└── ...
```

- `--tracker`: Tracker name used for file saving.

Note: each video sequence's tracklets are saved as `.pkl` files in a directory parallel to `pred_dir`, e.g.:

```
├── pred_dir (e.g., DeepEIoU_Baseline)
└── DeepEIoU_Tracklets_test
    ├── seq1.pkl
    ├── seq2.pkl
    └── ...
```
Refine tracklets (in project root directory):
```shell
python refine_tracklets.py --dataset {dataset name} \
                           --tracker {tracker name} \
                           --track_src {source directory of tracklet pkl files} \
                           --use_split \
                           --min_len 100 \
                           --eps 0.6 \
                           --min_samples 10 \
                           --max_k 3 \
                           --use_connect \
                           --spatial_factor 1.0 \
                           --merge_dist_thres 0.4
```
- `--dataset`: Dataset name (e.g., SportsMOT, SoccerNet) used for file saving.
- `--tracker`: Tracker name used for file saving.
- `--track_src`: Path to the directory containing the tracklet `.pkl` files.
- `--use_split`: Include this flag to enable the splitting component.
- `--min_len`: Minimum tracklet length required for splitting (default is 100).
- `--eps`: Maximum distance between two samples for DBSCAN clustering (default is 0.6).
- `--min_samples`: Minimum number of samples in a neighborhood for a point to be considered a core point (default is 10).
- `--max_k`: Maximum number of clusters/sub-tracklets output by the splitting component (default is 3).
- `--use_connect`: Include this flag to enable the connecting component.
- `--spatial_factor`: Factor to adjust spatial constraints (default is 1.0).
- `--merge_dist_thres`: Maximum averaged cosine distance at which two tracklets are merged (default is 0.4).