This repository contains the code for our paper "Conditional Prototype Rectification Prompt Learning".

To run our code, you need to install Dassl and a basic PyTorch environment.
We suggest putting all datasets under the same folder (say `$DATA`) to ease management, and following the instructions below to organize the datasets so that the source code does not need modifying. The file structure looks like
```
$DATA/
|–– imagenet/
|–– caltech-101/
|–– oxford_pets/
|–– stanford_cars/
```
If some datasets are already installed somewhere else, you can create symbolic links in `$DATA/dataset_name` that point to the original data to avoid duplicate downloads, as sketched below.
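For example, a minimal sketch (the source path `/path/to/existing/caltech-101` is a placeholder for wherever your copy actually lives):

```bash
# Create the shared dataset root.
mkdir -p $DATA

# Link an existing download into $DATA instead of downloading it again.
ln -s /path/to/existing/caltech-101 $DATA/caltech-101
```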
The instructions to prepare each dataset are detailed below. To ensure reproducibility and fair comparison for future work, we use CoOp-style train/val/test splits for all datasets except ImageNet, where the validation set is used as the test set.
### ImageNet
- Create a folder named `imagenet/` under `$DATA`.
- Create `images/` under `imagenet/`.
- Download the dataset from the official website and extract the training and validation sets to `$DATA/imagenet/images`. The directory structure should look like
```
imagenet/
|–– images/
|   |–– train/ # contains 1,000 folders like n01440764, n01443537, etc.
|   |–– val/
```
If you had downloaded ImageNet before, you can create symbolic links to map the training and validation sets to `$DATA/imagenet/images`.
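If instead you start from the official ILSVRC2012 tarballs, here is a rough extraction sketch (it assumes the two archives sit in the working directory; the per-class loop is the standard recipe for the train archive, which contains one inner tar per class):

```bash
mkdir -p $DATA/imagenet/images/train $DATA/imagenet/images/val
# The train archive holds one tar per class (n01440764.tar, ...).
tar -xf ILSVRC2012_img_train.tar -C $DATA/imagenet/images/train
for f in $DATA/imagenet/images/train/*.tar; do
  d="${f%.tar}" && mkdir -p "$d" && tar -xf "$f" -C "$d" && rm "$f"
done
# Validation images extract flat; they still need to be grouped
# into the same n******** class folders afterwards.
tar -xf ILSVRC2012_img_val.tar -C $DATA/imagenet/images/val
```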
- Download `classnames.txt` to `$DATA/imagenet/` from this link. The class names are copied from CLIP.

### Caltech101
- Create a folder named `caltech-101/` under `$DATA`.
- Download `101_ObjectCategories.tar.gz` from http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz and extract the file under `$DATA/caltech-101`.
- Download `split_zhou_Caltech101.json` from this link and put it under `$DATA/caltech-101`. The directory structure should look like
```
caltech-101/
|–– 101_ObjectCategories/
|–– split_zhou_Caltech101.json
```
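One way to fetch and extract the images (a sketch assuming `wget` is available; the split file above still has to be downloaded separately):

```bash
mkdir -p $DATA/caltech-101
# Fetch the archive from the URL given above, then unpack it in place.
wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz
tar -xzf 101_ObjectCategories.tar.gz -C $DATA/caltech-101
```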
### OxfordPets
- Create a folder named `oxford_pets/` under `$DATA`.
- Download the images and annotations from the official website.
- Download `split_zhou_OxfordPets.json` from this link. The directory structure should look like
```
oxford_pets/
|–– images/
|–– annotations/
|–– split_zhou_OxfordPets.json
```
### StanfordCars
- Create a folder named `stanford_cars/` under `$DATA`.
- Download the train/test images and annotations from the official website.
- Download `split_zhou_StanfordCars.json` from this link. The directory structure should look like
```
stanford_cars/
|–– cars_test/
|–– cars_test_annos_withlabels.mat
|–– cars_train/
|–– devkit/
|–– split_zhou_StanfordCars.json
```
### OxfordFlowers
- Create a folder named `oxford_flowers/` under `$DATA`.
- Download the images and labels from the official website.
- Download `cat_to_name.json` from here.
- Download `split_zhou_OxfordFlowers.json` from here. The directory structure should look like
```
oxford_flowers/
|–– cat_to_name.json
|–– imagelabels.mat
|–– jpg/
|–– split_zhou_OxfordFlowers.json
```
### Food101
- Download the dataset from the official website and extract the file `food-101.tar.gz` under `$DATA`, resulting in a folder named `$DATA/food-101/`.
- Download `split_zhou_Food101.json` from here. The directory structure should look like
```
food-101/
|–– images/
|–– license_agreement.txt
|–– meta/
|–– README.txt
|–– split_zhou_Food101.json
```
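The extraction itself is one command (a sketch assuming `food-101.tar.gz` has already been downloaded to the current directory):

```bash
# Unpacking under $DATA directly yields $DATA/food-101/.
tar -xzf food-101.tar.gz -C $DATA
```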
### FGVCAircraft
- Download the data from the official website.
- Extract `fgvc-aircraft-2013b.tar.gz` and keep only `data/`.
- Move `data/` to `$DATA` and rename the folder to `fgvc_aircraft/`. The directory structure should look like
```
fgvc_aircraft/
|–– images/
|–– ... # a bunch of .txt files
```
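The extract-then-rename step in script form (a sketch assuming the archive has already been downloaded into the current directory):

```bash
# Unpack the archive; it creates fgvc-aircraft-2013b/ with data/ inside.
tar -xzf fgvc-aircraft-2013b.tar.gz
# Keep only data/, moved to $DATA and renamed to fgvc_aircraft/.
mv fgvc-aircraft-2013b/data $DATA/fgvc_aircraft
rm -r fgvc-aircraft-2013b
```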
### SUN397
- Create a folder named `sun397/` under `$DATA`.
- Download the images and partitions from the official website and extract these files under `$DATA/sun397/`.
- Download `split_zhou_SUN397.json` from this link. The directory structure should look like
```
sun397/
|–– SUN397/
|–– split_zhou_SUN397.json
|–– ... # a bunch of .txt files
```
### DTD
- Download the dataset from the official website and extract it to `$DATA`. This should lead to `$DATA/dtd/`.
- Download `split_zhou_DescribableTextures.json` from this link. The directory structure should look like
```
dtd/
|–– images/
|–– imdb/
|–– labels/
|–– split_zhou_DescribableTextures.json
```
### EuroSAT
- Create a folder named `eurosat/` under `$DATA`.
- Download the dataset from the official website and extract it to `$DATA/eurosat/`.
- Download `split_zhou_EuroSAT.json` from here. The directory structure should look like
```
eurosat/
|–– 2750/
|–– split_zhou_EuroSAT.json
```
### UCF101
- Create a folder named `ucf101/` under `$DATA`.
- Download the zip file `UCF-101-midframes.zip` from here and extract it to `$DATA/ucf101/`. This zip file contains the extracted middle frame of each video.
- Download `split_zhou_UCF101.json` from this link. The directory structure should look like
```
ucf101/
|–– UCF-101-midframes/
|–– split_zhou_UCF101.json
```
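Once everything is in place, a quick sanity check over the split files can save a failed run later (a sketch; the list simply mirrors the layouts above, and FGVCAircraft is omitted because it ships its own .txt splits):

```bash
# Report any missing CoOp-style split file (or the ImageNet class list).
for f in imagenet/classnames.txt \
         caltech-101/split_zhou_Caltech101.json \
         oxford_pets/split_zhou_OxfordPets.json \
         stanford_cars/split_zhou_StanfordCars.json \
         oxford_flowers/split_zhou_OxfordFlowers.json \
         food-101/split_zhou_Food101.json \
         sun397/split_zhou_SUN397.json \
         dtd/split_zhou_DescribableTextures.json \
         eurosat/split_zhou_EuroSAT.json \
         ucf101/split_zhou_UCF101.json; do
  [ -e "$DATA/$f" ] || echo "missing: $DATA/$f"
done
```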
For few-shot learning tasks, set `base2new` to `False` in `main.py` and set the backbone to `RN50` in the yaml config, then run commands of the following form:

```bash
python main.py --config ./configs/fgvc.yaml --shots 1 --model CPR --subsample all
```
For base-to-new generalization tasks, set `base2new` to `True` in `main.py` and set the backbone to `ViT` in the yaml config, then run commands of the following form:

```bash
python main.py --config ./configs/fgvc.yaml --shots 16 --model CPR --subsample base
```
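If you prefer not to edit the config by hand, the backbone switch can be scripted. This sketch assumes the config uses a top-level `backbone:` key, which is a hypothetical name — check the actual key in `./configs/fgvc.yaml` before using it:

```bash
# Swap the CLIP backbone in place (key name 'backbone' is an assumption).
sed -i 's/^backbone:.*/backbone: RN50/' ./configs/fgvc.yaml
```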