VISION-SJTU / USOT

[ICCV2021] Learning to Track Objects from Unlabeled Videos

VOT2020 test #8

zrz1018 closed this issue 1 year ago

zrz1018 commented 2 years ago

Hello, how did you run the VOT2020 evaluation? I saw your comment that it is done with the test_vot2020.py file and the VOT toolkit. Could you explain the procedure in more detail? Thank you again for your excellent work.

zrz1018 commented 2 years ago

I have another question, about the GOT-10k dataset: no time.txt file is produced with the test results. How do you evaluate on it?

zhengjilai commented 2 years ago

Q1: How to evaluate on VOT2020?

Please follow the VOT toolkit instructions to evaluate on VOT2020. The project provides the main code you need; you will also have to install the toolkit and write a trackers.ini file for the tracker. I did not check the VOT2020 testing script when cleaning up the code, so it may still contain bugs or system-specific variables and paths. Please adapt these to your own setup.
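
As a rough illustration of the trackers.ini step, the sketch below generates a single tracker entry with Python's standard configparser module. The keys (label, protocol, command, paths) follow the layout used in the VOT toolkit's Python integration examples as far as I know; the section name, the module name test_vot2020 used as the command, and the path are placeholders assumed for illustration, not values confirmed by this repository. Check the toolkit documentation for your installed version.

```python
import configparser
import io


def build_trackers_ini(name, label, command, paths):
    """Return the text of a trackers.ini file with one tracker entry.

    All four arguments are supplied by the user; nothing here is read
    from the USOT code base.
    """
    # Disable interpolation so characters such as '%' or '$' in paths
    # are written out literally.
    config = configparser.ConfigParser(interpolation=None)
    config[name] = {
        "label": label,
        # Assumption: the tracker is a Python script speaking the TraX protocol.
        "protocol": "traxpython",
        # Assumption: module name of the tracker entry script, without ".py".
        "command": command,
        # Assumption: directory that contains the entry script.
        "paths": paths,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical values; replace them with the ones matching your setup.
ini_text = build_trackers_ini(
    name="USOT",
    label="USOT",
    command="test_vot2020",
    paths="/path/to/USOT/scripts",
)
print(ini_text)
```

Save the printed text as trackers.ini in your VOT workspace and adjust the entries until the toolkit can launch the tracker.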

Q2: How to evaluate on GOT-10k?

We have never evaluated USOT on GOT-10k, and the provided version of pysot_toolkit does not support it (it does not write the time.txt file). Fortunately, the testing script is easy to fix. You can refer to the pysot_toolkit in more recent projects and check how the corresponding script is revised to produce the time.txt file.
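
A minimal sketch of such a fix is shown below. It assumes the result layout commonly used for GOT-10k submissions: one folder per sequence containing `<seq>_001.txt` (one `x,y,w,h` box per frame) and `<seq>_time.txt` (one per-frame runtime in seconds per line). The function name save_got10k_result and the dummy tracking loop are hypothetical and not part of pysot_toolkit or USOT.

```python
import os
import tempfile
import time


def save_got10k_result(result_dir, seq_name, boxes, times):
    """Write boxes and per-frame times for one sequence.

    boxes: list of (x, y, w, h) tuples, one per frame.
    times: list of per-frame runtimes in seconds, same length as boxes.
    The file naming is an assumption based on the usual GOT-10k layout.
    """
    if len(boxes) != len(times):
        raise ValueError("boxes and times must have the same length")

    seq_dir = os.path.join(result_dir, seq_name)
    os.makedirs(seq_dir, exist_ok=True)

    box_file = os.path.join(seq_dir, f"{seq_name}_001.txt")
    time_file = os.path.join(seq_dir, f"{seq_name}_time.txt")

    with open(box_file, "w") as f:
        for x, y, w, h in boxes:
            f.write(f"{x:.4f},{y:.4f},{w:.4f},{h:.4f}\n")

    with open(time_file, "w") as f:
        for t in times:
            f.write(f"{t:.8f}\n")

    return box_file, time_file


if __name__ == "__main__":
    # Dummy loop standing in for the real tracker: time each frame
    # with perf_counter and collect the predicted box.
    boxes, times = [], []
    for frame_idx in range(3):
        start = time.perf_counter()
        box = (10.0 + frame_idx, 20.0, 50.0, 40.0)  # placeholder prediction
        times.append(time.perf_counter() - start)
        boxes.append(box)

    with tempfile.TemporaryDirectory() as tmp:
        print(save_got10k_result(tmp, "GOT-10k_Test_000001", boxes, times))
```

In a real testing script, the timing would wrap the tracker's init and track calls for each frame, and the function would be called once per test sequence.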

zrz1018 commented 2 years ago

Ok, thank you for your answer

zhengjilai commented 2 years ago

Hi. Recently, some other researchers contacted me and asked for the results of USOT and USOT* on the GOT-10k benchmark for comparison. I therefore followed the GOT-10k protocol: I re-trained the model on the GOT-10k training set only and evaluated it on the GOT-10k test set. The results are listed below as AO, SR_0.50, SR_0.75.

USOT (moco_v2 backbone): 0.444, 0.531, 0.185
USOT* (ImageNet supervised backbone): 0.441, 0.523, 0.186
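
For readers unfamiliar with these metrics, here is a simplified sketch of how AO (average overlap) and SR (success rate at an overlap threshold) can be computed from predicted and ground-truth boxes in `(x, y, w, h)` format. The official GOT-10k test numbers come from the benchmark's evaluation server, whose exact procedure may differ in details; the function names here are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def ao_and_sr(pred_boxes, gt_boxes, thresholds=(0.5, 0.75)):
    """Return (AO, {threshold: SR}) over paired frames.

    AO is the mean IoU over all frames; SR at threshold t is the
    fraction of frames whose IoU exceeds t.
    """
    if not pred_boxes or len(pred_boxes) != len(gt_boxes):
        raise ValueError("need equally sized, non-empty box lists")
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    ao = sum(overlaps) / len(overlaps)
    sr = {t: sum(o > t for o in overlaps) / len(overlaps) for t in thresholds}
    return ao, sr


# Toy data: the first frame matches exactly, the second overlaps by half a box.
gt = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
pred = [(0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 10.0, 10.0)]
ao, sr = ao_and_sr(pred, gt)
print(ao, sr)
```

In the toy data, the second frame has an IoU of 1/3, so it counts as a failure at both thresholds.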

If you want the raw result files, they have been uploaded to Google Drive (together with the results on other benchmarks). These results can be used for a valid performance comparison on the GOT-10k benchmark. I hope this is not too late and that it helps you.