
code base for vision transformers
Apache License 2.0

VTPACK

This repo is the official PyTorch implementation of "Dynamic Grained Encoder for Vision Transformers" (NeurIPS 2021).

Installation

Requirements

Build from source

Prepare data

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout for torchvision's datasets.ImageFolder: the training and validation data are expected in the train/ and val/ folders, respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
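
Before launching a long run, it can help to confirm that the extracted dataset matches the layout above. The following is a minimal sketch using only the Python standard library; it is not part of this repo, and the helper names (`list_classes`, `check_imagefolder_layout`, `make_demo_tree`) are hypothetical. It only checks folder structure, not image contents.

```python
import os
import tempfile


def list_classes(split_dir):
    """Return the sorted class sub-folder names found directly under split_dir."""
    return sorted(
        entry for entry in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, entry))
    )


def check_imagefolder_layout(root):
    """Check that root/train and root/val exist and hold the same class folders.

    Returns a list of problem descriptions; an empty list means no problem was found.
    """
    problems = []
    classes = {}
    for split in ("train", "val"):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            problems.append(f"missing folder: {split}/")
            continue
        classes[split] = list_classes(split_dir)
        if not classes[split]:
            problems.append(f"no class folders in {split}/")
        for name in classes[split]:
            if not os.listdir(os.path.join(split_dir, name)):
                problems.append(f"empty class folder: {split}/{name}/")
    if len(classes) == 2 and classes["train"] != classes["val"]:
        problems.append("train/ and val/ contain different class folders")
    return problems


def make_demo_tree(root):
    """Create a tiny throwaway tree that mirrors the layout shown above."""
    files = [
        ("train", "class1", "img1.jpeg"),
        ("train", "class2", "img2.jpeg"),
        ("val", "class1", "img3.jpeg"),
        ("val", "class2", "img4.jpeg"),
    ]
    for split, cls, img in files:
        os.makedirs(os.path.join(root, split, cls), exist_ok=True)
        # Empty placeholder files stand in for real images.
        open(os.path.join(root, split, cls, img), "wb").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_root:
        make_demo_tree(demo_root)
        print(check_imagefolder_layout(demo_root))
```

To check a real dataset, call `check_imagefolder_layout("/path/to/imagenet")` and inspect the returned list.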

Usage

Training

# Run training on the given number of GPUs
./tools/run_dist_launch.sh <GPU_NUM> <path_to_config> [optional arguments]

# Please refer to main.py for more optional arguments

Inference

# Run inference on the given number of GPUs with a trained model
./tools/run_dist_launch.sh <GPU_NUM> <path_to_config> --eval --resume <model_path> [optional arguments]

# Please refer to main.py for more optional arguments
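
The two invocations above differ only in the `--eval --resume <model_path>` arguments. As a rough illustration, the sketch below assembles either command line from Python. The function `build_command` is a hypothetical helper, not something provided by this repo, and the angle-bracket values are the same placeholders used above.

```python
import shlex

LAUNCHER = "./tools/run_dist_launch.sh"  # launcher script named in this README


def build_command(gpu_num, config_path, model_path=None, extra_args=()):
    """Assemble the launcher argument list.

    Without model_path this is the training command; with model_path it is the
    inference command (adds --eval --resume <model_path>).
    """
    cmd = [LAUNCHER, str(gpu_num), config_path]
    if model_path is not None:
        cmd += ["--eval", "--resume", model_path]
    cmd += list(extra_args)
    return cmd


if __name__ == "__main__":
    # Placeholder values; substitute a real GPU count and real paths.
    print(shlex.join(build_command(8, "<path_to_config>")))
    print(shlex.join(build_command(8, "<path_to_config>", model_path="<model_path>")))
```

The returned list can be passed directly to `subprocess.run` from the repository root.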

Image Classification on ImageNet val set

The following models are trained and evaluated with 256 × 256 input images. The budget for DGE is 0.5.

Method          Acc1 (%)   Acc5 (%)   MACavg   Project   Model
DeiT-Ti         73.2       91.8       1.7G     Link      GoogleDrive
DeiT-Ti + DGE   73.2       91.7       1.1G     Link      GoogleDrive
DeiT-S          80.6       95.4       6.1G     Link      GoogleDrive
DeiT-S + DGE    80.1       95.0       3.5G     Link      GoogleDrive
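
To make the accuracy/compute trade-off in these results easier to read, the short sketch below computes the Acc1 change and the relative reduction in average MACs for each DGE model against its baseline. The numbers are copied from the results above; the `RESULTS` dictionary and the `compare` function are illustrative names, not part of this repo.

```python
# Rows copied from the results above: method -> (Acc1 %, Acc5 %, average MACs in G).
RESULTS = {
    "DeiT-Ti": (73.2, 91.8, 1.7),
    "DeiT-Ti + DGE": (73.2, 91.7, 1.1),
    "DeiT-S": (80.6, 95.4, 6.1),
    "DeiT-S + DGE": (80.1, 95.0, 3.5),
}


def compare(baseline, variant):
    """Return (Acc1 change in points, relative MAC reduction in %) of variant vs. baseline."""
    base_acc1, _, base_macs = RESULTS[baseline]
    var_acc1, _, var_macs = RESULTS[variant]
    acc1_delta = round(var_acc1 - base_acc1, 1)
    mac_reduction = round(100.0 * (base_macs - var_macs) / base_macs, 1)
    return acc1_delta, mac_reduction


for name in ("DeiT-Ti", "DeiT-S"):
    delta, saved = compare(name, name + " + DGE")
    print(f"{name}: Acc1 change {delta:+.1f} points, MACs reduced by {saved:.1f}%")
```

With the reported figures, DGE cuts average MACs by about 35% for DeiT-Ti at unchanged Acc1, and by about 43% for DeiT-S at a cost of 0.5 points of Acc1.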

More models are coming soon.

Citation

Please cite the paper in your publications if it helps your research.

@inproceedings{song2021dynamic,
    title={Dynamic Grained Encoder for Vision Transformers},
    author={Song, Lin and Zhang, Songyang and Liu, Songtao and Li, Zeming and He, Xuming and Sun, Hongbin and Sun, Jian and Zheng, Nanning},
    booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
    year={2021}
}

Please cite this project in your publications if it helps your research.

@misc{vtpack,
    author = {Song, Lin},
    title = {VTPACK},
    howpublished = {\url{https://github.com/StevenGrove/vtpack}},
    year = {2021}
}