
Code release for "Detect to Track and Track to Detect", ICCV 2017
http://www.robots.ox.ac.uk/~vgg/research/detect-track/

===============================================================================

Detect to Track and Track to Detect

This repository contains the code for our ICCV 2017 paper:

Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman
"Detect to Track and Track to Detect"
in Proc. ICCV 2017

If you find the code useful for your research, please cite our paper:

    @inproceedings{feichtenhofer2017detect,
      title={Detect to Track and Track to Detect},
      author={Feichtenhofer, Christoph and Pinz, Axel and Zisserman, Andrew},
      booktitle={International Conference on Computer Vision (ICCV)},
      year={2017}
    }

Requirements

The code was tested on Ubuntu 14.04, 16.04 and Windows 10 using NVIDIA Titan X or Z GPUs.

If you have questions regarding the implementation please contact:

Christoph Feichtenhofer <feichtenhofer AT tugraz.at>

================================================================================

Setup

  1. Download the code: git clone --recursive https://github.com/feichtenhofer/detect-track

    • This will also download a modified version of the Caffe deep learning framework. In case of any issues, please follow the installation instructions in the corresponding README as well as on the Caffe website.
  2. Compile the code by running rfcn_build.m.

  3. Edit the file get_root_path.m to adjust the models and data paths (a sketch of this file appears after this list).

    • Download the ImageNet VID dataset from http://image-net.org/download-images
    • Download the pretrained model files and the RPN (Region Proposal Network) proposals, linked below, and unpack them into your models/data directory.
    • In case the models are not present, the function check_dl_model will attempt to download them to the respective directories.
    • In case the RPN proposal files are not present, the function download_proposals will attempt to download and extract them to the respective directories.
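
The exact contents of get_root_path.m depend on the release; as a rough sketch of the kind of edit step 3 asks for (the function signature and directory names below are assumptions, so adapt them to the actual file):

    function root_path = get_root_path()
    % Hypothetical sketch of get_root_path.m: point the code at your local
    % checkout; pretrained models and the ImageNet VID data are expected to
    % live under the directories configured here. Adapt to the real file.
    root_path = '/home/user/detect-track/';   % repository root on your machine
    % e.g. models under fullfile(root_path, 'models') and
    %      ImageNet VID under /data/ILSVRC2015/ -- adjust as needed
    end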

Training

You can train your own models on ImageNet VID as follows:

    • script_Detect_ILSVRC_vid_ResNet_OHEM_rpn(); trains the image-based Detection network.
    • script_DetectTrack_ILSVRC_vid_ResNet_OHEM_rpn(); trains the video-based Detection & Tracking network.
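
Roughly, and assuming rfcn_build.m only needs to be run once and get_root_path.m already points at your data, a training session from a fresh MATLAB prompt might look like this (the two-stage ordering is an assumption based on the script names):

    % Hypothetical MATLAB session; the script names come from this README,
    % the ordering and the one-time compilation step are assumptions.
    rfcn_build;                                        % compile the code (once)
    script_Detect_ILSVRC_vid_ResNet_OHEM_rpn();        % image-based Detection network
    script_DetectTrack_ILSVRC_vid_ResNet_OHEM_rpn();   % video-based Detection & Tracking network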

Testing

Results on ImageNet VID (validation set mAP, in %)

Method           test structure                ResNet-50   ResNet-101   ResNeXt-101   Inception-v4
Detect           test.prototxt                 72.1        74.1         75.9          77.9
Detect & Track   test_track.prototxt           76.5        79.8         81.4          82.0
Detect & Track   test_track_regcls.prototxt    76.7        80.0         81.6          82.1
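
For orientation, here is a minimal sketch of loading one of these test structures through Caffe's MATLAB interface (matcaffe); the weights filename is a placeholder for whichever trained model you download, and the provided test scripts take care of input preparation and evaluation:

    % Minimal matcaffe sketch, not the repository's own test pipeline.
    caffe.set_mode_gpu();
    caffe.set_device(0);                               % GPU id
    net = caffe.Net('test_track.prototxt', 'your_trained_model.caffemodel', 'test');
    % Frame pairs and RPN proposals are fed as input blobs, and the mAP
    % evaluation is handled by the test scripts in this repository.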

Trained models

Data

Our models were trained using region proposals extracted using a Region Proposal Network that is trained on the same data as D&T. We use the RPN from craftGBD and provide the extracted proposals for training and testing on ImageNet VID and the DET subsets below.
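
As a rough illustration only (the per-frame file name and the box layout are assumptions, since the archive format is not documented here), the extracted proposals can be inspected in MATLAB along these lines:

    % Hypothetical inspection of one pre-computed proposal file; the actual
    % file names and variable layout in the released archives may differ.
    s = load('proposals/ILSVRC2015_val_00000000/000001.mat');
    boxes = s.boxes;                                   % assumed N x 4 [x1 y1 x2 y2] boxes
    fprintf('%d proposals in this frame\n', size(boxes, 1));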

Pre-computed object proposals for