:herb: Learning to Track Objects from Unlabeled Videos
Jilai Zheng, Chao Ma, Houwen Peng and Xiaokang Yang
2021 IEEE/CVF International Conference on Computer Vision (ICCV)
This repository implements USOT, an unsupervised deep tracker that learns to track objects from unlabeled videos, as described in the ICCV 2021 paper above.
Results of USOT and USOT* on recent tracking benchmarks:

| Model | VOT2016 EAO | VOT2018 EAO | VOT2020 EAO | LaSOT AUC (%) | TrackingNet AUC (%) | OTB100 AUC (%) | GOT10k AO |
| --- | --- | --- | --- | --- | --- | --- | --- |
| USOT | 0.351 | 0.290 | 0.222 | 33.7 | 59.9 | 58.9 | 0.444 |
| USOT* | 0.402 | 0.344 | 0.219 | 35.8 | 61.5 | 57.4 | 0.441 |
Raw result files can be found in the folder `result` on Google Drive.
The environment we utilize is listed as follows.
Train / Test / Eval: PyTorch 1.7.1 + CUDA 10.0 / 10.2 / 11.1
If you have problems with preprocessing, you can skip it entirely by downloading the off-the-shelf preprocessed materials described below.
Assume the project root path is `$USOT_PATH`.
You can build an environment for development with the provided script, where `$CONDA_PATH` denotes your Anaconda path.
```shell
cd $USOT_PATH
bash ./preprocessing/install_model.sh $CONDA_PATH USOT
source activate USOT && export PYTHONPATH=$(pwd)
```
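After activation, a quick sanity check (not part of the repository's scripts) confirms that the expected PyTorch build and CUDA device are visible:

```shell
# Expect 1.7.1 (or your installed version) and True when a CUDA device is usable
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```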
You can revise the CUDA toolkit version for PyTorch in `install_model.sh` (10.0 by default).
For testing, we first provide both models utilized in our paper (`USOT.pth` and `USOT_star.pth`). You can download them from the folder `snapshot` on Google Drive and place them in `$USOT_PATH/var/snapshot`.
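For instance, assuming both checkpoints were saved to `~/Downloads` (the download location is an assumption), they can be placed as follows:

```shell
# Create the snapshot folder and move the downloaded checkpoints into it
cd $USOT_PATH && mkdir -p var/snapshot
mv ~/Downloads/USOT.pth ~/Downloads/USOT_star.pth var/snapshot/
```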
Next, you can link your desired benchmark dataset (e.g. VOT2018) to `$USOT_PATH/datasets_test` as follows. The ground-truth json files for some benchmarks (e.g. `VOT2018.json`) can be downloaded from the folder `test` on Google Drive and should also be placed in `$USOT_PATH/datasets_test`.
```shell
cd $USOT_PATH && mkdir datasets_test
ln -s $your_benchmark_path ./datasets_test/VOT2018
```
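If you downloaded a ground-truth json file as well, drop it next to the linked dataset (assuming the file sits in the current directory):

```shell
# Place the benchmark's ground-truth json alongside the linked dataset
mv VOT2018.json ./datasets_test/
```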
After that, you can test the tracker on these benchmarks (e.g. VOT2018) as follows. The raw results will be placed in `$USOT_PATH/var/result/VOT2018/USOT`.
```shell
cd $USOT_PATH
python -u ./scripts/test_usot.py --dataset VOT2018 --resume ./var/snapshot/USOT_star.pth
```
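Analogously, a run of the vanilla USOT model on another linked benchmark, e.g. OTB100 (assuming it has been linked under `datasets_test` in the same way), would look like:

```shell
# Same flags as above, swapping the benchmark name and the checkpoint
python -u ./scripts/test_usot.py --dataset OTB100 --resume ./var/snapshot/USOT.pth
```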
The inference results can be evaluated with pysot-toolkit. Install it before evaluation.
```shell
cd $USOT_PATH/lib/eval_toolkit/pysot/utils
python setup.py build_ext --inplace
```
Then the evaluation can be conducted as follows.
```shell
cd $USOT_PATH
python ./lib/eval_toolkit/bin/eval.py --dataset_dir datasets_test \
        --dataset VOT2018 --tracker_result_dir var/result/VOT2018 --trackers USOT
```
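The same pattern applies to other benchmarks; for example, a sketch of evaluating the OTB100 results produced above (assuming the OTB100 ground truth is also set up under `datasets_test` and results land in `var/result/OTB100`):

```shell
# Hypothetical OTB100 evaluation following the VOT2018 pattern above
python ./lib/eval_toolkit/bin/eval.py --dataset_dir datasets_test \
        --dataset OTB100 --tracker_result_dir var/result/OTB100 --trackers USOT
```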
For training, first download the pretrained backbone from the folder `pretrain` on Google Drive into `$USOT_PATH/pretrain`. Note that USOT* and USOT are trained from `imagenet_pretrain.model` and `moco_v2_800.model`, respectively.
Second, preprocess the raw datasets with the DP + Flow paradigm. Refer to `$USOT_PATH/preprocessing/datasets_train` for details.
In fact, we have provided two shortcuts for skipping this preprocessing procedure.

- You can directly download the generated pseudo-box annotation files (e.g. `got10k_flow.json`) from the folder `train/box_sample_result` on Google Drive, and place them into the corresponding dataset preprocessing path (e.g. `$USOT_PATH/preprocessing/datasets_train/got10k`), in order to skip the box generation procedure.
- You can directly download the whole cropped training dataset (e.g. `got10k_flow.tar`) from the dataset folder on Baidu Drive (extraction code: 6887) (e.g. `train/GOT-10k`), which enables you to skip all procedures in preprocessing; see the unpacking sketch after this list.
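As a minimal sketch, the cropped dataset can be unpacked so that it matches the example paths used in the training config below (the `/data` target directory and the archive's internal layout are assumptions, not repository requirements):

```shell
# Expected to yield /data/got10k_flow/crop511/ and /data/got10k_flow/train.json
# (paths taken from the DATASET example in the config step below)
mkdir -p /data
tar -xf got10k_flow.tar -C /data
```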
Third, revise the config file for training, `$USOT_PATH/experiments/train/USOT.yaml`. The most important options are listed as follows.

- DATASET: the folder for your cropped training instances and their pseudo annotation files, e.g. `PATH: '/data/got10k_flow/crop511/'`, `ANNOTATION: '/data/got10k_flow/train.json'`
Finally, you can start the training phase with the following script. The training checkpoints will be placed automatically in `$USOT_PATH/var/snapshot`.
```shell
cd $USOT_PATH
python -u ./scripts/train_usot.py --cfg experiments/train/USOT.yaml --gpus 0,1,2,3 --workers 32
```
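The `--gpus` and `--workers` flags can be scaled to your hardware; for instance, a single-GPU run (a sketch, not a setting validated in the paper) would be:

```shell
# Single-GPU variant of the training command above
python -u ./scripts/train_usot.py --cfg experiments/train/USOT.yaml --gpus 0 --workers 8
```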
We also provide a one-key script (`onekey_usot.py`) covering train, test and eval.
```shell
cd $USOT_PATH
python ./scripts/onekey_usot.py --cfg experiments/train/USOT.yaml
```
If any parts of our paper and code are helpful to your work, please consider citing:
```bibtex
@inproceedings{zheng-iccv2021-usot,
  title={Learning to Track Objects from Unlabeled Videos},
  author={Jilai Zheng and Chao Ma and Houwen Peng and Xiaokang Yang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
```
We referred to several open-source repositories when implementing our unsupervised tracker. Thanks for their great work.
Feel free to contact me if you have any questions.