We are hiring research interns for visual tracking and neural architecture search projects: houwen.peng@microsoft.com
Siamese networks have drawn great attention in visual tracking because of their balanced accuracy and speed. However, the backbone network utilized in these trackers is still the classical AlexNet, which does not fully take advantage of the capability of modern deep neural networks.
Our proposals improve the performance of fully convolutional Siamese trackers by 1) introducing CIR and CIR-D units to unveil the power of deeper and wider networks such as ResNet and Inception; and 2) designing backbone networks according to an analysis of internal network factors (e.g. receptive field, stride, output feature size) that affect tracking performance.
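As a rough illustration of the internal network factors mentioned above, the sketch below computes the receptive field, total stride, and output feature size for a plain stack of convolution/pooling layers. The `Layer` class, the `network_factors` function, the example layer stack, and the 127-pixel input are all hypothetical and chosen only for illustration; they are not the CIResNet configurations shipped in this repository.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Layer:
    """A square conv/pool layer (hypothetical helper, not part of this repo)."""
    kernel: int
    stride: int
    padding: int = 0


def network_factors(layers: Iterable[Layer], input_size: int) -> Tuple[int, int, int]:
    """Return (receptive_field, total_stride, output_size) for a layer stack."""
    rf = 1            # receptive field of one output unit, in input pixels
    jump = 1          # distance in input pixels between adjacent units (total stride)
    size = input_size
    for layer in layers:
        # Each layer widens the receptive field by (kernel - 1) steps of the current jump.
        rf += (layer.kernel - 1) * jump
        jump *= layer.stride
        # Standard output-size formula for convolution/pooling with floor rounding.
        size = (size + 2 * layer.padding - layer.kernel) // layer.stride + 1
    return rf, jump, size


# Illustrative stack only: 7x7 conv (stride 2), 3x3 pool (stride 2), 3x3 conv (stride 1).
example_stack = [Layer(7, 2), Layer(3, 2), Layer(3, 1)]
rf, total_stride, out_size = network_factors(example_stack, input_size=127)
print(rf, total_stride, out_size)
```

Varying the kernel sizes, strides, and padding in `example_stack` shows how quickly depth changes these three factors, which is the kind of trade-off the backbone design above has to balance.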
Models | OTB13 | OTB15 | VOT15 | VOT16 | VOT17 |
---|---|---|---|---|---|
Alex-FC | 0.608 | 0.579 | 0.289 | 0.235 | 0.188 |
Alex-RPN | - | 0.637 | 0.349 | 0.344 | 0.244 |
CIResNet22-FC | 0.663 | 0.644 | 0.318 | 0.303 | 0.234 |
CIResIncep22-FC | 0.662 | 0.642 | 0.310 | 0.295 | 0.236 |
CIResNext23-FC | 0.659 | 0.633 | 0.297 | 0.278 | 0.229 |
CIResNet22-RPN | 0.674 | 0.666 | 0.381 | 0.376 | 0.294 |
Models | OTB13 | OTB15 | VOT15 | VOT16 | VOT17 |
---|---|---|---|---|---|
Alex-FC | - | - | - | - | 0.188 |
CIResNet22-FC | 0.664 | 0.654 | 0.361 | 0.335 | 0.266 |
CIResNet22W-FC | 0.689 | 0.674 | 0.368 | 0.352 | 0.269 |
CIResIncep22-FC | 0.673 | 0.650 | 0.332 | 0.305 | 0.251 |
CIResNext22-FC | 0.668 | 0.651 | 0.336 | 0.304 | 0.246 |
Raw Results | :paperclip: OTB2013 | :paperclip: OTB2015 | :paperclip: VOT15 | :paperclip: VOT16 | :paperclip: VOT17 |
Benchmark | VOT18 | VOT19 | GOT10K | VISDRONE19 | LaSOT |
---|---|---|---|---|---|
Performance | 0.270 | 0.242 | 0.416 | 0.383 | 0.384 |
Raw Results | :paperclip: VOT18 | :paperclip: VOT19 | :paperclip: GOT10K | :paperclip: VISDRONE | :paperclip: LaSOT |
The code was developed on an Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz with an NVIDIA GTX 1080 GPU.
See test.md for testing details.
See train.md for training details.
:cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud::cloud:
If any part of our paper or code is helpful to your work, please cite:
@InProceedings{SiamDW_2019_CVPR,
author = {Zhang, Zhipeng and Peng, Houwen},
title = {Deeper and Wider Siamese Networks for Real-Time Visual Tracking},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}
Licensed under an MIT license.