chou141253 / FGVC-PIM

PyTorch implementation of "A Novel Plug-in Module for Fine-Grained Visual Classification".
MIT License
efficientnet fgvc fine-grained-visual-categorization resnet swin-transformer vision-transformer

A Novel Plug-in Module for Fine-grained Visual Classification


paper url: https://arxiv.org/abs/2202.03822

We propose a novel plug-in module that can be integrated into many common backbones, including CNN-based and Transformer-based networks, to provide strongly discriminative regions. The plug-in module can output pixel-level feature maps and fuse filtered features to enhance fine-grained visual classification. Experimental results show that the proposed plug-in module outperforms state-of-the-art approaches and significantly improves the accuracy to 92.77% and 92.83% on CUB200-2011 and NABirds, respectively.

framework

1. Environment setting

// The old version has been moved to ./v0/

1.0. Package

1.1. Dataset

In this paper, we use two large bird datasets to evaluate performance: CUB200-2011 and NABirds.

1.2. Our pretrained model

1.3. OS

2. Train

(more information: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html)

2.1. data

train data and test data structure:

├── train/
│   ├── class1/
│   |   ├── img001.jpg
│   |   ├── img002.jpg
│   |   └── ....
│   ├── class2/
│   |   ├── img001.jpg
│   |   ├── img002.jpg
│   |   └── ....
│   └── ....
└──
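
The layout above (one sub-folder per class, images inside) can be checked before training. The following is a minimal sketch, not part of this repository: the helper name `summarize_image_folder` and the set of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def summarize_image_folder(root):
    """Return {class_name: image_count} for a folder laid out as root/<class>/<image>."""
    root = Path(root)
    counts = {}
    # Every direct sub-directory of root is treated as one class.
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTS
        )
    return counts
```

Calling `summarize_image_folder` on the training folder should list every class with a non-zero count; a class with zero images or a missing class usually points to a misplaced folder.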

2.2. configuration

You can directly modify the YAML files in ./configs/.

2.3. run

python main.py --c ./configs/CUB200_SwinT.yaml

The model will be saved in ./records/{project_name}/{exp_name}/backup/
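
As a small illustration of how the save location is composed from the two placeholders, here is a sketch that builds the path. The helper name `backup_dir` and the example values are hypothetical; the actual project and experiment names come from your own configuration.

```python
from pathlib import Path


def backup_dir(project_name, exp_name, records_root="./records"):
    """Build the backup folder path: <records_root>/<project_name>/<exp_name>/backup."""
    return Path(records_root) / project_name / exp_name / "backup"


# Hypothetical names, used only to show the resulting layout.
example = backup_dir("CUB200", "SwinT")
```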

2.4. about custom model

For how the model is built, refer to ./models/builder.py.
More details are in how_to_build_pim_model.ipynb.

2.5. multi-gpus

comment out main.py line 66

model = torch.nn.DataParallel(model, device_ids=None)

2.6. automatic mixed precision (amp)

use_amp: True, training time is about 3 hours.
use_amp: False, training time is about 5 hours.

3. Evaluation

If you want to evaluate our pretrained model (or your own model), please provide configs/eval.yaml (a custom YAML file is also fine).

3.1. please check yaml

Set the following keys in the YAML configuration file:

| Key | Value | Description |
| --- | --- | --- |
| train_root | ~ | Setting the value to ~ (null) means this is not training mode. |
| val_root | ../data/eval/ | Path to validation samples. |
| pretrained | ./pretrained/best.pt | Path to the pretrained model. |

../data/eval/ folder structure:

├── eval/
│   ├── class1/
│   |   ├── img001.jpg
│   |   ├── img002.jpg
│   |   └── ....
│   ├── class2/
│   |   ├── img001.jpg
│   |   ├── img002.jpg
│   |   └── ....
│   └── ....
└──
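
The three settings above can be sanity-checked before running evaluation. This is a minimal sketch under stated assumptions: the config is assumed to have already been loaded into a Python dict (a YAML parser such as PyYAML turns `~` into `None`), and the helper `check_eval_config` is hypothetical, not a function from this repository.

```python
def check_eval_config(cfg):
    """Return a list of problems found in an evaluation config dict (empty if none)."""
    problems = []
    # In evaluation mode train_root must be null (~ in YAML, None in Python).
    if cfg.get("train_root") is not None:
        problems.append("train_root should be ~ (null) for evaluation")
    # Both the validation folder and the checkpoint path must be given.
    for key in ("val_root", "pretrained"):
        if not cfg.get(key):
            problems.append(f"{key} must be set")
    return problems


# Values taken from the settings listed above.
eval_cfg = {
    "train_root": None,
    "val_root": "../data/eval/",
    "pretrained": "./pretrained/best.pt",
}
```

An empty list from `check_eval_config(eval_cfg)` means the three keys are consistent with evaluation mode; it does not verify that the paths exist on disk.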

3.2. run

python main.py --c ./configs/eval.yaml

Results are shown in the terminal and saved to ./records/{project_name}/{exp_name}/eval_results.txt

4. HeatMap

python heat.py --c ./configs/CUB200_SwinT.yaml --img ./vis/001.jpg --save_img ./vis/001/

visualization visualization2

5. Infer

If you want to run inference on your own images and get the confusion matrix, please provide configs/eval.yaml (a custom YAML file is also fine).

5.1. please check yaml

Set the following keys in the YAML configuration file:

| Key | Value | Description |
| --- | --- | --- |
| train_root | ~ | Setting the value to ~ (null) means this is not training mode. |
| val_root | ../data/eval/ | Path to validation samples. |
| pretrained | ./pretrained/best.pt | Path to the pretrained model. |

../data/eval/ folder structure:

├── eval/
│   ├── class1/
│   |   ├── img001.jpg
│   |   ├── img002.jpg
│   |   └── ....
│   ├── class2/
│   |   ├── img001.jpg
│   |   ├── img002.jpg
│   |   └── ....
│   └── ....
└──

5.2. run

python infer.py --c ./configs/eval.yaml

Results are shown in the terminal and saved to ./records/{project_name}/{exp_name}/infer_results.txt
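
For readers unfamiliar with the confusion matrix mentioned in this section, the sketch below shows how one is typically computed from ground-truth and predicted class indices. It is a generic illustration and not the code used by infer.py; the function names and the integer label encoding are assumptions.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

Off-diagonal entries show which classes are confused with each other, which is often more informative than overall accuracy for fine-grained categories.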


Acknowledgment