
MAST: A Memory-Augmented Self-supervised Tracker

This repository contains the code (in PyTorch) for the model introduced in the following paper:

MAST: A Memory-Augmented Self-supervised Tracker (CVPR 2020)
Project page: https://zlai0.github.io/MAST/

Citation

@InProceedings{Lai20,
  author       = "Zihang Lai and Erika Lu and Weidi Xie",
  title        = "{MAST}: {A} Memory-Augmented Self-Supervised Tracker",
  booktitle    = "IEEE Conference on Computer Vision and Pattern Recognition",
  year         = "2020",
}

Contents

  1. Introduction
  2. Usage
  3. Results
  4. Contacts

Introduction

Recent interest in self-supervised dense tracking has yielded rapid progress, but performance still lags far behind supervised methods. We propose a dense tracking model, trained on videos without any annotations, that surpasses previous self-supervised methods on existing benchmarks by a significant margin (+15%) and achieves performance comparable to supervised methods. In this paper, we first reassess the traditional choices used for self-supervised training and the reconstruction loss by conducting thorough experiments that finally elucidate the optimal choices. Second, we further improve on existing methods by augmenting our architecture with a crucial memory component. Third, we benchmark on large-scale semi-supervised video object segmentation (a.k.a. dense tracking) and propose a new metric: generalizability. Our first two contributions yield a self-supervised network that, for the first time, is competitive with supervised methods on standard evaluation metrics of dense tracking. When measuring generalizability, we show that self-supervised approaches are actually superior to the majority of supervised methods. We believe this new generalizability metric can better capture the real-world use cases for dense tracking and will spur new interest in this research direction.
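
As a rough illustration of the memory mechanism described above, the sketch below (not the authors' implementation; the function name, tensor shapes, and feature encoder are assumptions) reconstructs a query frame by soft attention over the features of several memory frames. During self-supervised training the copied values are the colours of past frames and the objective is a reconstruction loss; at inference the same attention weights propagate segmentation labels instead.

    # Illustrative sketch of attention-based reconstruction from memory frames.
    # Shapes and names are assumptions, not this repository's actual interfaces.
    import torch
    import torch.nn.functional as F

    def reconstruct_from_memory(query_feat, memory_feats, memory_values, temperature=1.0):
        """
        query_feat:    (B, C, H, W)    features of the frame to reconstruct
        memory_feats:  (B, T, C, H, W) features of T memory frames
        memory_values: (B, T, D, H, W) values to copy (colour channels during
                       training, soft segmentation labels at inference)
        Returns reconstructed values for the query frame, shape (B, D, H, W).
        """
        B, C, H, W = query_feat.shape
        q = query_feat.flatten(2)                            # (B, C, HW)
        k = memory_feats.permute(0, 2, 1, 3, 4).flatten(2)   # (B, C, T*HW)
        v = memory_values.permute(0, 2, 1, 3, 4).flatten(2)  # (B, D, T*HW)

        # Soft attention between every query pixel and every memory pixel.
        affinity = torch.einsum('bcq,bck->bqk', q, k) / temperature
        attn = F.softmax(affinity, dim=-1)                   # (B, HW, T*HW)

        # Copy values from memory according to the attention weights.
        recon = torch.einsum('bqk,bdk->bdq', attn, v)        # (B, D, HW)
        return recon.view(B, -1, H, W)

    # Training would compare the reconstructed colours against the query
    # frame's true colours, e.g. loss = F.smooth_l1_loss(recon, query_colour).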

Usage

  1. Install dependencies

    pip install -r requirements.txt
  2. Download the YouTube-VOS and DAVIS-2017 datasets. No pre-processing is needed.

Dependencies

Train

Test and evaluation
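
The DAVIS benchmark scores dense tracking with region similarity (J, the mean intersection-over-union between predicted and ground-truth masks) and boundary accuracy (F). A minimal sketch of the J measure follows; it is an illustration only, not this repository's evaluation code.

    # Region similarity (J): mean IoU between predicted and ground-truth masks.
    import numpy as np

    def region_similarity(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
        """Both inputs are (H, W) boolean masks for a single object."""
        pred = pred_mask.astype(bool)
        gt = gt_mask.astype(bool)
        union = np.logical_or(pred, gt).sum()
        if union == 0:          # both masks empty: treat as perfect agreement
            return 1.0
        inter = np.logical_and(pred, gt).sum()
        return float(inter) / float(union)

    # Per-object scores are averaged over frames and objects; DAVIS reports the
    # mean of J and the boundary measure F as the overall J&F score.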

Results

Comparison with other methods on DAVIS-2017

Results on YouTube-VOS and generalization ability

Video segmentation results on DAVIS-2017