While recent progress has significantly boosted few-shot classification (FSC) performance, few-shot object detection (FSOD) remains challenging for modern learning systems. We propose the DAnA (Dual-awareness Attention) mechanism, which is adaptable to various existing object detection networks and enhances FSOD performance by paying adaptive attention to support images conditioned on the given query information. The proposed method achieves state-of-the-art results on the COCO benchmark, outperforming the strongest baseline by 47%.

Paper: https://arxiv.org/abs/2102.12152
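For intuition only, below is a minimal, generic cross-attention sketch in PyTorch in which support features are aggregated conditioned on query features. It illustrates the general idea of query-conditioned attention over supports and is **not** the actual DAnA modules described in the paper.

```python
import torch
import torch.nn.functional as F

def query_conditioned_attention(query_feat, support_feat):
    """Generic cross-attention sketch (illustration only, not the DAnA modules).

    query_feat:   (B, C, Hq, Wq) query feature map
    support_feat: (B, C, Hs, Ws) support feature map
    Returns a (B, C, Hq, Wq) map that aggregates support features for each
    query location, i.e., attention over the support conditioned on the query.
    """
    B, C, Hq, Wq = query_feat.shape
    q = query_feat.flatten(2).transpose(1, 2)      # (B, Hq*Wq, C)
    k = support_feat.flatten(2)                    # (B, C, Hs*Ws)
    v = support_feat.flatten(2).transpose(1, 2)    # (B, Hs*Ws, C)
    attn = F.softmax(q @ k / C ** 0.5, dim=-1)     # (B, Hq*Wq, Hs*Ws)
    out = (attn @ v).transpose(1, 2).reshape(B, C, Hq, Wq)
    return out

# Example with random features (shapes are arbitrary).
out = query_conditioned_attention(torch.randn(1, 256, 32, 32), torch.randn(1, 256, 20, 20))
```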
### Data Preparation
1. Create a `data` folder under the repository root.

```
$ cd Dual-awareness-Attention-for-Few-shot-Object-Detection && mkdir data
```

2. Create soft links to your datasets inside `data`.

```
$ cd data
# For VOC 2007
$ ln -s [your-path-to]/VOC2007/VOCdevkit VOCdevkit2007
# For COCO
$ ln -s [your-path-to]/coco coco
```
3. The COCO dataset must be preprocessed to conform to the FSOD problem setting. During training, the labels of novel-class instances must be removed from each query image; for testing, the target category of each query image is fixed so that the results are reproducible. For your convenience, we provide the preprocessed .json files of COCO for both training and testing. Users can also process the COCO annotations to construct customized datasets for their own research purposes (a minimal filtering sketch is given at the end of this step).
* 60 base classes for training ([download](https://drive.google.com/file/d/10mXvdpgSjFYML_9J-zMDLPuBYrSrG2ub/view?usp=sharing))
* 20 novel classes for testing ([download](https://drive.google.com/file/d/1FZJhC-Ob-IXTKf5heNeNAN00V8OUJXi2/view?usp=sharing))
To use them, simply put the downloaded folders into the COCO `annotations` directory.

```
$ mv coco60_train [your-path-to]/coco/annotations/coco60_train
```
For those who want to use customized annotations, please refer to `lib/datasets/factory.py` and `lib/datasets/coco_split.py`.
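For reference, the following is a minimal sketch of the kind of filtering described in step 3 (keeping only base-class annotations in a COCO-style .json file). The file names and the base-category id set are placeholders, not the repository's actual preprocessing script.

```python
import json

# Placeholder: replace with the actual category ids of the 60 base classes.
BASE_CATEGORY_IDS = set(range(1, 61))

with open("instances_train2014.json") as f:  # input COCO annotation file (placeholder path)
    coco = json.load(f)

# Drop annotations (and category entries) belonging to novel classes.
coco["annotations"] = [a for a in coco["annotations"] if a["category_id"] in BASE_CATEGORY_IDS]
coco["categories"] = [c for c in coco["categories"] if c["id"] in BASE_CATEGORY_IDS]

with open("coco60_train.json", "w") as f:  # output file (placeholder path)
    json.dump(coco, f)
```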
4. During training, the support images are image patches randomly cropped from other query images according to their box annotations (a cropping sketch is given below). At testing, to ensure the results are reproducible, a set of support images covering the 80 categories should be constructed in advance. The support image set we used is available [here](https://drive.google.com/file/d/1nl9-DEpBBJ5w6hxVdijY6hFxoQdz8aso/view?usp=sharing). To use it:
Create a soft link to the support images:

```
$ ln -s /your/path/to/supports supports
```
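For reference, here is a minimal sketch of how a support patch could be cropped from an image given a COCO-style box annotation `[x, y, w, h]`, as described in step 4. The function name, paths, and output size are illustrative and not part of this repository.

```python
from PIL import Image

def crop_support_patch(image_path, bbox, out_size=(320, 320)):
    """Crop a support patch from an image given a COCO-style box [x, y, w, h]."""
    x, y, w, h = bbox
    img = Image.open(image_path).convert("RGB")
    patch = img.crop((x, y, x + w, y + h))
    return patch.resize(out_size)

# Example usage with placeholder values.
patch = crop_support_patch("query.jpg", bbox=[48.0, 60.0, 120.0, 200.0])
patch.save("support_patch.jpg")
```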
5. Create a folder to save model weights.

```
$ mkdir models
```
### Pretrained Weights
Please download the pretrained backbone models (e.g., res50, vgg16) from [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn#beyond-the-demo-installation-for-training-and-testing-models) and put them into `data/pretrained_model`.
```
$ mkdir data/pretrained_model && cd data/pretrained_model
$ ln -s /your/path/to/res50.pth res50.pth
```
**NOTE**: We suggest using the Caffe pretrained models to reproduce our results.
**If you want to use PyTorch pretrained models, remember to convert images from BGR to RGB and to apply the same data transform (mean subtraction and normalization) as used by the pretrained model.**
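For illustration, the two conventions differ roughly as follows. This sketch assumes OpenCV-style BGR uint8 input and the standard py-faster-rcnn / torchvision normalization constants; it is not the repository's actual data loader.

```python
import numpy as np
import torch

# Caffe-style preprocessing (used by the Caffe pretrained backbones):
# keep BGR channel order and subtract per-channel pixel means.
CAFFE_PIXEL_MEANS = np.array([102.9801, 115.9465, 122.7717])  # BGR means from py-faster-rcnn

def preprocess_caffe(img_bgr):
    """img_bgr: HxWx3 uint8 array in BGR order (e.g., from cv2.imread)."""
    return img_bgr.astype(np.float32) - CAFFE_PIXEL_MEANS

# PyTorch/torchvision-style preprocessing:
# convert BGR -> RGB, scale to [0, 1], then normalize with ImageNet statistics.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def preprocess_pytorch(img_bgr):
    img_rgb = np.ascontiguousarray(img_bgr[:, :, ::-1], dtype=np.float32) / 255.0
    img_rgb = (img_rgb - IMAGENET_MEAN) / IMAGENET_STD
    return torch.from_numpy(img_rgb).permute(2, 0, 1).float()  # (3, H, W) tensor
```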
For those who only want to test the model, the DAnA weights can be downloaded [here](https://drive.google.com/file/d/1JaYF-Ep-C6b5X01_e9tFRzFgRXMJQYQ7/view?usp=sharing). **NOTE**: The provided fine-tuned weights "cisa_coco_ft30" were fine-tuned on the 30-shot novel object classes without using the BA block. Therefore, to use them, please set `get_model(..., use_BA_block=False)` in train.py.
```
$ cd models
$ ln -s [your-path-to]/cisa_coco_ft30 cisa_coco_ft30
```
### Installation
Create and activate the conda environment.

```
$ conda env create -f env.yml
$ source activate [NAME_OF_THE_ENV]
```
Compile the COCO API and put `pycocotools` under `data/`.

```
$ cd lib
$ git clone https://github.com/pdollar/coco.git
$ cd coco/PythonAPI
$ make && make install
# put pycocotools under data/
$ mv pycocotools ../../../data/
```
Compile the CUDA dependencies using the following commands.

```
$ cd lib
$ python setup.py build develop
```
If you encounter errors during compilation, you may have forgotten to export the CUDA paths (e.g., `CUDA_HOME`, `PATH`, and `LD_LIBRARY_PATH`) to your environment.
## Train
***To train from scratch***
```
$ python train.py --dataset coco_base --flip --net DAnA --lr 0.001 --lr_decay_step 12 --bs 4 --epochs 16 --disp_interval 20 --save_dir models/DAnA --way 2 --shot 3
```
***To resume***
```
$ python train.py --dataset coco_base --flip --net DAnA --lr 0.001 --lr_decay_step 12 --bs 4 --epochs 16 --disp_interval 20 --save_dir models/DAnA --way 2 --shot 3 --r --load_dir models/DAnA --checkepoch 12 --checkpoint 4307
```
<!-- ***To fine-tune***
<br>
The procedure is the same as resuming training: simply replace the dataset with a fine-tuning dataset prepared beforehand and use a smaller learning rate (e.g., 0.0001).
The .json files of the fine-tuning datasets composed of the 20 novel COCO categories can be found here:
<br>
[10 shots](https://drive.google.com/file/d/1eUZpc6KpSouZm8QL5s2EHXi4sXJXn89X/view?usp=sharing)
<br>
[30 shots](https://drive.google.com/file/d/1zCj4Sbhd2FjHVxlmlk4Nfb9qqJcqMHRy/view?usp=sharing) -->
## Inference
```
$ python inference.py --eval --dataset val2014_novel --net DAnA --r --load_dir models/DAnA_coco_ft30 --checkepoch 4 --checkpoint 299 --bs 1 --shot 3 --eval_dir dana
```
## Attention Visualization
<br />
<p align="center">
<a href="https://github.com/Tung-I/Dual-awareness-Attention-for-Few-shot-Object-Detection">
<img src="https://github.com/Tung-I/Dual-awareness-Attention-for-Few-shot-Object-Detection/raw/main/images/attention_visualization.jpg" alt="attention_visualization" width="1024" height="280">
</a>
</p>
## Acknowledgements
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant MOST 110-2634-F-002-026. We benefit from the NVIDIA DGX-1 AI Supercomputer and are grateful to the National Center for High-performance Computing. The code is mainly built on [faster-rcnn.pytorch](https://github.com/jwyang/faster-rcnn.pytorch/tree/pytorch-1.0).