By Kaiyu Yue, Ming Sun, Yuchen Yuan, Feng Zhou, Errui Ding and Fuxin Xu
This is a PyTorch re-implementation of the paper Compact Generalized Non-local Network. It provides CGNL models trained on CUB-200 and ImageNet, as well as models trained on COCO based on maskrcnn-benchmark from FAIR.
caffe/README.md
If you find this code useful in your research, or wish to refer to the baseline results published in our paper, please use the following BibTeX entry.
@article{CGNLNetwork2018,
author={Kaiyu Yue and Ming Sun and Yuchen Yuan and Feng Zhou and Errui Ding and Fuxin Xu},
title={Compact Generalized Non-local Network},
journal={NIPS},
year={2018}
}
The code is developed and tested on 8 Tesla P40 / V100-SXM2-16GB GPU cards on CentOS, with CUDA-9.2/8.0 and cuDNN-7.1 installed.
File ID | Model | Best Top-1 (%) | Top-5 (%) | Google Drive | Baidu Pan |
---|---|---|---|---|---|
1832260500 | R-50 Base | 86.45 | 97.00 | link | link |
1832260501 | R-50 w/ 1 NL Block | 86.69 | 96.95 | link | link |
1832260502 | R-50 w/ 1 CGNL Block | 87.06 | 96.91 | link | link |
1832261010 | R-101 Base | 86.76 | 96.91 | link | link |
1832261011 | R-101 w/ 1 NL Block | 87.04 | 97.01 | link | link |
1832261012 | R-101 w/ 1 CGNL Block | 87.28 | 97.20 | link | link |
File ID | Model | Best Top-1 (%) | Top-5 (%) | Google Drive | Baidu Pan |
---|---|---|---|---|---|
1832260503x | R-50 w/ 1 CGNLx Block | 86.56 | 96.63 | link | link |
1832261013x | R-101 w/ 1 CGNLx Block | 87.18 | 97.03 | link | link |
File ID | Model | Best Top-1 (%) | Top-5 (%) | Google Drive | Baidu Pan |
---|---|---|---|---|---|
torchvision | R-50 Base | 76.15 | 92.87 | - | - |
1832261502 | R-50 w/ 1 CGNL Block | 77.69 | 93.63 | link | link |
1832261503 | R-50 w/ 1 CGNLx Block | 77.32 | 93.40 | link | link |
torchvision | R-152 Base | 78.31 | 94.06 | - | - |
1832261522 | R-152 w/ 1 CGNL Block | 79.53 | 94.52 | link | link |
1832261523 | R-152 w/ 1 CGNLx Block | 79.37 | 94.47 | link | link |
backbone | type | lr sched | im / gpu | train mem(GB) | train time (s/iter) | total train time(hr) | inference time(s/im) | box AP | mask AP | model id | Google Drive | Baidu Pan |
---|---|---|---|---|---|---|---|---|---|---|---|---|
R-50-C4 | Mask | 1x | 1 | 5.641 | 0.5434 | 27.3 | 0.18329 + 0.011 | 35.6 | 31.5 | 6358801 | - | - |
R-50-C4 w/ 1 CGNL Block | Mask | 1x | 1 | 5.868 | 0.5785 | 28.5 | 0.20326 + 0.008 | 36.3 | 32.1 | - | link | link |
R-50-C4 w/ 1 CGNLx Block | Mask | s1x (`_C.SOLVER.WARMUP_ITERS = 20000`, `STEPS: (140000, 180000)`, `MAX_ITER: 200000`) | 1 | 5.977 | 0.5855 | 32.3 | 0.18571 + 0.010 | 36.2 | 31.9 | - | link | link |
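
The `s1x` row lists three solver settings: `WARMUP_ITERS`, `STEPS` and `MAX_ITER`. As a rough illustration of how such settings shape the learning rate, the sketch below computes a linear-warmup, step-decay schedule in plain Python. It is not the scheduler used by this repository: the base learning rate, the warmup factor of 1/3 and the decay factor of 0.1 are assumptions chosen for illustration, and the function name `lr_at` is hypothetical.

```python
from bisect import bisect_right

# Values quoted from the s1x row of the table above.
WARMUP_ITERS = 20000
STEPS = (140000, 180000)
MAX_ITER = 200000

# Assumed values, for illustration only (not taken from this repository).
BASE_LR = 0.01
WARMUP_FACTOR = 1.0 / 3
GAMMA = 0.1


def lr_at(iteration, base_lr=BASE_LR, warmup_iters=WARMUP_ITERS,
          steps=STEPS, warmup_factor=WARMUP_FACTOR, gamma=GAMMA):
    """Return the learning rate at a given iteration.

    Linear warmup from base_lr * warmup_factor up to base_lr over
    warmup_iters, then a multiplication by gamma at each milestone in steps.
    """
    scale = 1.0
    if iteration < warmup_iters:
        alpha = iteration / warmup_iters
        scale = warmup_factor * (1.0 - alpha) + alpha
    # The number of milestones already reached decides the decay power.
    decay = gamma ** bisect_right(steps, iteration)
    return base_lr * scale * decay


if __name__ == "__main__":
    for it in (0, 10000, 20000, 140000, 180000, MAX_ITER - 1):
        print(it, lr_at(it))
```

With a longer warmup the learning rate reaches its base value later, which is the knob the `s1x` schedule changes relative to `1x`.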
To add `CGNL` / `CGNLx` / `NL` blocks to the backbone of Mask-RCNN models, you can use `maskrcnn-benchmark/modeling/backbone/resnet.py` and `maskrcnn-benchmark/utils/c2_model_loading.py` to replace the original py-files. Please refer to the code for specific configurations.

Setting `WARMUP_ITERS` appropriately would produce better results for CGNL models. A longer training schedule is also recommended, like `2x` or `1.44x` in Detectron.

The `train time`, `total train time` and `inference time` in the table above are all larger than the benchmarks. But this does not affect the demonstration of the efficiency of the CGNL block.

Download the PyTorch ImageNet pretrained models from the PyTorch model zoo. The optional download links can be found in torchvision. Put them in the `pretrained` folder.
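
For convenience, the three checkpoints can also be fetched with a short script. This is a sketch and not part of the repository: it assumes the files are served from `https://download.pytorch.org/models/` under the same file names that appear in the directory layout further down, and the helper names `checkpoint_urls` and `download_missing` are hypothetical.

```python
import os
import urllib.request

# Assumed base URL of the PyTorch model zoo; verify before use.
BASE_URL = "https://download.pytorch.org/models/"

# File names as they appear in the expected `pretrained` folder.
CHECKPOINTS = (
    "resnet50-19c8e357.pth",
    "resnet101-5d3b4d8f.pth",
    "resnet152-b121ed2d.pth",
)


def checkpoint_urls(base_url=BASE_URL, names=CHECKPOINTS):
    """Map each checkpoint file name to its (assumed) download URL."""
    return {name: base_url + name for name in names}


def download_missing(dest_dir="pretrained"):
    """Download every checkpoint that is not already in dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    fetched = []
    for name, url in checkpoint_urls().items():
        target = os.path.join(dest_dir, name)
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)
            fetched.append(target)
    return fetched
```

Calling `download_missing()` from the repository root would place the files in `pretrained/`; files that are already present are skipped.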
Download the training and validation lists for the CUB-200 dataset from Google Drive or Baidu Pan. Download the ImageNet dataset and move the validation images to labeled subfolders following the tutorial. The ImageNet training and validation lists can also be found in Google Drive or Baidu Pan. Put them in the `data` folder and make the layout look like:
${THIS REPO ROOT}
 `-- pretrained
     |-- resnet50-19c8e357.pth
     |-- resnet101-5d3b4d8f.pth
     |-- resnet152-b121ed2d.pth
 `-- data
     `-- cub
         `-- images
         |   |-- 001.Black_footed_Albatross
         |   |-- 002.Laysan_Albatross
         |   |-- ...
         |   |-- 200.Common_Yellowthroat
         |-- cub_train.list
         |-- cub_val.list
         |-- images.txt
         |-- image_class_labels.txt
         |-- README
     `-- imagenet
         `-- img_train
         |   |-- n01440764
         |   |-- n01734418
         |   |-- ...
         |   |-- n15075141
         `-- img_val
         |   |-- n01440764
         |   |-- n01734418
         |   |-- ...
         |   |-- n15075141
         |-- imagenet_train.list
         |-- imagenet_val.list
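
Before launching a run, it can help to confirm that the folders match the layout above. The following is a minimal sketch of such a check, not a script shipped with the repository; the function name `missing_paths` is hypothetical, and the path list is copied from the tree above (per-class image folders are not checked).

```python
import os

# Relative paths taken from the directory layout above.
EXPECTED = (
    "pretrained/resnet50-19c8e357.pth",
    "pretrained/resnet101-5d3b4d8f.pth",
    "pretrained/resnet152-b121ed2d.pth",
    "data/cub/images",
    "data/cub/cub_train.list",
    "data/cub/cub_val.list",
    "data/imagenet/img_train",
    "data/imagenet/img_val",
    "data/imagenet/imagenet_train.list",
    "data/imagenet/imagenet_val.list",
)


def missing_paths(root=".", expected=EXPECTED):
    """Return the expected entries that do not exist under root."""
    return [p for p in expected
            if not os.path.exists(os.path.join(root, *p.split("/")))]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing entries:")
        for path in missing:
            print("  " + path)
    else:
        print("Layout looks complete.")
```

Run it from the repository root; if only one of the two datasets is needed, the entries for the other can be dropped from `EXPECTED`.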
$ python train_val.py --arch '50' --dataset 'cub' --nl-type 'cgnl' --nl-num 1 --checkpoints ${FOLDER_DIR} --valid
$ python train_val.py --arch '50' --dataset 'cub' --nl-num 0
$ python train_val.py --arch '50' --dataset 'cub' --nl-type 'cgnl' --nl-num 1 --warmup
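
When several runs need to be queued, the commands above can be assembled programmatically. The helper below is a hypothetical sketch, not part of the repository: it only emits flags that appear in the commands above (`--arch`, `--dataset`, `--nl-type`, `--nl-num`, `--checkpoints`, `--valid`, `--warmup`) and makes no claim about other options `train_val.py` may accept.

```python
import shlex


def build_command(arch="50", dataset="cub", nl_type=None, nl_num=0,
                  checkpoints=None, valid=False, warmup=False):
    """Assemble an argument list for train_val.py.

    Only flags that appear in the commands above are emitted.
    """
    cmd = ["python", "train_val.py", "--arch", arch, "--dataset", dataset]
    if nl_type is not None:
        cmd += ["--nl-type", nl_type]
    cmd += ["--nl-num", str(nl_num)]
    if checkpoints is not None:
        cmd += ["--checkpoints", checkpoints]
    if valid:
        cmd.append("--valid")
    if warmup:
        cmd.append("--warmup")
    return cmd


if __name__ == "__main__":
    runs = [
        build_command(nl_num=0),
        build_command(nl_type="cgnl", nl_num=1, warmup=True),
    ]
    for run in runs:
        # Print a shell-ready line; pass `run` to subprocess.run to execute it.
        print(shlex.join(run))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.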
This code is released under the MIT License. See LICENSE for additional details.