vladan-stojnic / ZLaP

Code for Label Propagation for Zero-shot Classification with Vision-Language Models (CVPR2024)
MIT License

Label Propagation for Zero-shot Classification with Vision-Language Models

This repository contains the code for the paper: Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias, "Label Propagation for Zero-shot Classification with Vision-Language Models", in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

Setup

This code was implemented using Python 3.10.4 and the following dependencies:

torch==2.0.1
torchvision==0.15.2
cupy==11.4.0
faiss==1.7.3
numpy==1.22.3
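
To check an existing environment against these pins, a minimal sketch using the standard-library importlib.metadata could look like the following. The distribution names are an assumption: cupy and faiss are often installed under other names (for example cupy-cuda11x or faiss-gpu), in which case the lookup reports them as missing and the keys need adjusting. The helper names compare_versions and installed_version are hypothetical and not part of this repository.

```python
from importlib import metadata

# Versions pinned in this README. The keys are assumed distribution names;
# adjust them if your environment installs cupy or faiss under another name.
PINNED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "cupy": "11.4.0",
    "faiss": "1.7.3",
    "numpy": "1.22.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(pinned, lookup=installed_version):
    """Map each package name to a (pinned, installed, matches) tuple."""
    report = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Ignore local build suffixes such as "+cu118" when comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in compare_versions(PINNED).items():
        status = "ok" if ok else "differs"
        print(f"{name}: pinned {wanted}, installed {found} ({status})")
```

A mismatch does not necessarily mean the code will fail; the pins only record the versions the code was implemented with.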

Features

Pre-extracted features used in this work can be downloaded from here.

Running

The provided code is run through zlap.py; the --help flag lists the available options:

python zlap.py --help
usage: ZLaP [-h]
            [--dataset {imagenet,dtd,eurosat,fgvca,flowers,food101,pets,sun397,cars,caltech101,cifa10,cifar100,cub,ucf101,coco}]
            [--backbone {RN50_openai,ViT-B-16_openai,ViT-B-16_laion2b_s34b_b88k,ViT-H-14_laion2b_s32b_b79k,ViT-L-14-336_openai,ViT-L-14_openai,albef,blip,eva-clip-8b,eva-clip-18b}]
            [--k K] [--gamma GAMMA] [--alpha ALPHA]
            [--setup {transductive,inductive,sparse-inductive}]
            [--clf_type {text,proxy,cupl-text,cupl-proxy}]

For example:

python zlap.py --dataset imagenet --backbone RN50_openai --setup transductive --clf_type text --k 5 --gamma 5 --alpha 0.3
python zlap.py --dataset cub --backbone ViT-B-16_openai --setup inductive --clf_type proxy --k 10 --gamma 3 --alpha 0.3
python zlap.py --dataset dtd --backbone RN50_openai --setup sparse-inductive --clf_type text --k 5 --gamma 5 --alpha 0.3
python zlap.py --dataset imagenet --backbone ViT-B-16_openai --setup transductive --clf_type cupl-text --k 5 --gamma 5 --alpha 0.3
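
The invocations above all share one shape, so several runs can be scripted by generating the argument lists. The following is a sketch only: build_command and sweep are hypothetical helpers, not part of this repository, and they use just the flags shown in the usage text. The grid values are taken from the example commands and are illustrative, not recommended settings.

```python
import itertools
import shlex


def build_command(dataset, backbone, setup, clf_type, k, gamma, alpha,
                  python="python", script="zlap.py"):
    """Assemble the argument list for one zlap.py run from the documented flags."""
    return [
        python, script,
        "--dataset", dataset,
        "--backbone", backbone,
        "--setup", setup,
        "--clf_type", clf_type,
        "--k", str(k),
        "--gamma", str(gamma),
        "--alpha", str(alpha),
    ]


def sweep(k_values, gamma_values, alpha_values, **fixed):
    """Yield one command per (k, gamma, alpha) combination; other options stay fixed."""
    for k, gamma, alpha in itertools.product(k_values, gamma_values, alpha_values):
        yield build_command(k=k, gamma=gamma, alpha=alpha, **fixed)


if __name__ == "__main__":
    grid = sweep(
        k_values=[5, 10],
        gamma_values=[3, 5],
        alpha_values=[0.3],
        dataset="dtd",
        backbone="RN50_openai",
        setup="sparse-inductive",
        clf_type="text",
    )
    for cmd in grid:
        # Dry run: print each command. To launch it instead, pass cmd to
        # subprocess.run(cmd, check=True) from the repository root.
        print(shlex.join(cmd))
```

Building argument lists rather than shell strings avoids quoting issues, and printing them first makes it easy to inspect the grid before starting any run.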

Citation

@InProceedings{Stojnic_2024_CVPR,
    author    = {Stojni\'c, Vladan and Kalantidis, Yannis and Tolias, Giorgos},
    title     = {Label Propagation for Zero-shot Classification with Vision-Language Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2024}
}