yuanli2333 / Hadamard-Matrix-for-hashing

CVPR2020/TNNLS2023: Central Similarity Quantization/Hashing for Efficient Image and Video Retrieval
MIT License

Code for the paper: Central Similarity Quantization for Efficient Image and Video Retrieval (arXiv)

We release all code and configurations for image hashing.

Update: the video hashing code is now available here.

Prerequisites

Ubuntu 16.04

NVIDIA GPU + CUDA and the corresponding PyTorch framework (v0.4.1)

Python 3.6

Datasets

  1. Download the database file for the ImageNet retrieval list from the anonymous link here, and put database.txt in 'data/imagenet/'.

  2. Download MS COCO, ImageNet2012, and NUS_WIDE from their official websites: COCO, ImageNet, NUS_WIDE. Unzip all data and put it in 'data/dataset_name/'.
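
Before launching a long test or training run, it can help to confirm that the files are where the scripts expect them. The following is only a sketch: it assumes that 'dataset_name' takes the values imagenet, coco, and nus_wide (the values passed to --data_name in the commands in this README), and `missing_paths` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path

# Paths described in the Datasets section. The three folder names are an
# assumption taken from the --data_name values used elsewhere in this README.
EXPECTED_PATHS = [
    "data/imagenet/database.txt",
    "data/imagenet",
    "data/coco",
    "data/nus_wide",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root; an empty result means every path listed above exists.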

Hash center (target)

The hash centers we used for ImageNet are in 'data/imagenet/hashcenters'. The methods to generate hash centers are given in the tutorial: [Tutorial hash_center_generation.ipynb](https://github.com/yuanli2333/Hadamard-Matrix-for-hashing/blob/master/Tutorial%20hash_center_generation.ipynb)
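
For orientation, here is a minimal NumPy sketch of Hadamard-based hash center generation. It is not the tutorial's code, and the notebook remains the authoritative reference. The sketch assumes the construction in which rows of a K x K Hadamard matrix H (and of -H, when more than K centers are needed) serve as centers. The function names, the {0, 1} output format, and the random fallback (exactly half of the bits set, chosen at random) are assumptions made for illustration.

```python
import numpy as np


def sylvester_hadamard(n):
    """Build an n x n Hadamard matrix by Sylvester's construction (n a power of 2)."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H


def generate_hash_centers(num_classes, hash_bit, seed=0):
    """Return a (num_classes, hash_bit) array of hash centers with entries in {0, 1}."""
    is_power_of_two = hash_bit > 0 and (hash_bit & (hash_bit - 1)) == 0
    if is_power_of_two and num_classes <= 2 * hash_bit:
        H = sylvester_hadamard(hash_bit)
        # Rows of H are mutually orthogonal, so any two distinct rows differ
        # in exactly hash_bit / 2 positions. Stacking -H doubles the pool.
        centers = np.concatenate([H, -H], axis=0)[:num_classes]
    else:
        # Assumed fallback: random codes with exactly half of the bits at -1.
        rng = np.random.default_rng(seed)
        centers = np.ones((num_classes, hash_bit), dtype=int)
        for i in range(num_classes):
            flip = rng.choice(hash_bit, hash_bit // 2, replace=False)
            centers[i, flip] = -1
    # Map {-1, +1} to {0, 1}; this storage format is an assumption.
    return (centers > 0).astype(np.float32)


if __name__ == "__main__":
    centers = generate_hash_centers(num_classes=100, hash_bit=64)
    print(centers.shape)
```

With 64 bits, the Hadamard branch can supply up to 128 centers, and any two of them are at Hamming distance 32 or 64. With more classes than that, the sketch falls back to random sampling, which no longer guarantees the separation.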

Test

Pretrained models are available on Google Drive, or you can download them directly from the release.

Generating hash codes for the database takes a long time because the database is large.

Test for imagenet:

Download the pretrained model 'imagenet_64bit_0.8734_resnet50.pkl' for imagenet, put it in 'data/imagenet/', then run:

python test.py --data_name imagenet --gpus 0,1  --R 1000  --model_name 'imagenet_64bit_0.8734_resnet50.pkl' 

Test for coco:

Download pre-trained model 'coco_64bit_0.8612_resnet50.pkl' for coco, put it in 'data/coco/', then run:

python test.py --data_name coco --gpus 0,1  --R 5000  --model_name 'coco_64bit_0.8612_resnet50.pkl' 

Test for nus_wide:

Download pre-trained model 'nus_wide_64bit_0.8391_resnet50.pkl' for nus_wide, put it in 'data/nus_wide/', then run:

python test.py --data_name nus_wide --gpus 0,1  --R 5000  --model_name 'nus_wide_64bit_0.8391_resnet50.pkl' 

The retrieval MAP on the three datasets is shown below:

| Dataset | MAP(16bit) | MAP(32bit) | MAP(64bit) |
|---|---|---|---|
| ImageNet | 0.851 | 0.865 | 0.873 |
| MS COCO | 0.796 | 0.838 | 0.861 |
| NUS WIDE | 0.810 | 0.825 | 0.839 |
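
As a rough illustration of how such numbers are typically computed, the sketch below evaluates MAP over the top R results of a Hamming ranking. It assumes that --R is the number of top-ranked database items considered per query and that a database item counts as relevant when it shares at least one label with the query. Both are assumptions, as is the function name; test.py may differ in details such as tie-breaking or normalization.

```python
import numpy as np


def mean_average_precision(query_codes, db_codes, query_labels, db_labels, R):
    """MAP@R for binary codes in {-1, +1} and multi-hot label matrices."""
    bits = query_codes.shape[1]
    average_precisions = []
    for q_code, q_label in zip(query_codes, query_labels):
        # For +/-1 codes, Hamming distance = (bits - inner product) / 2.
        hamming = 0.5 * (bits - db_codes @ q_code)
        top = np.argsort(hamming, kind="stable")[:R]
        # Relevant if the item shares at least one label with the query.
        relevant = (db_labels[top] @ q_label > 0).astype(np.float64)
        num_relevant = relevant.sum()
        if num_relevant == 0:
            average_precisions.append(0.0)
            continue
        precision_at_k = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
        average_precisions.append(
            float((precision_at_k * relevant).sum() / num_relevant)
        )
    return float(np.mean(average_precisions))


if __name__ == "__main__":
    db_codes = np.array([[1, 1, 1, 1], [1, 1, 1, -1], [-1, -1, -1, -1]])
    db_labels = np.array([[1, 0], [0, 1], [1, 0]])
    query_codes = np.array([[1, 1, 1, 1]])
    query_labels = np.array([[1, 0]])
    print(mean_average_precision(query_codes, db_codes, query_labels, db_labels, R=3))
```

In the toy example the ranked relevance is [1, 0, 1], so the average precision is (1 + 2/3) / 2. A smaller R truncates the ranking, which is why MAP values are only comparable at the same R.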

Train

Train on imagenet, hash bit: 64bit

Trained model will be saved in 'data/imagenet/models/'

python train.py --data_name imagenet --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05  --R 1000

Train on coco, hash bit: 64bit

Trained model will be saved in 'data/coco/models/'

python train.py --data_name coco --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05 --multi_lr 0.05  --R 5000

Train on nus_wide, hash bit: 64bit

Trained model will be saved in 'data/nus_wide/models/'

python train.py --data_name nus_wide --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05  --multi_lr 0.05 --R 5000

AlexNet as backbone.

Pretrained AlexNet models are available here. Pretrained models for COCO will be released in the future.

The retrieval MAP on ImageNet and NUS_WIDE is shown below:

| Dataset | MAP(16bit) | MAP(32bit) | MAP(64bit) |
|---|---|---|---|
| ImageNet | 0.601 | 0.653 | 0.695 |
| NUS_WIDE | 0.744 | 0.785 | 0.789 |

Train on ImageNet, 16bit

python train.py --data_name imagenet --hash_bit 16 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on ImageNet, 32bit

python train.py --data_name imagenet --hash_bit 32 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on ImageNet, 64bit

python train.py --data_name imagenet --hash_bit 64 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.0001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 16bit

python train.py --data_name nus_wide --hash_bit 16 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 32bit

python train.py --data_name nus_wide --hash_bit 32 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 64bit

python train.py --data_name nus_wide --hash_bit 64 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001
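
The six AlexNet commands above differ only in dataset, hash bit, --lambda2, and --R. As a convenience, the sketch below assembles them from a small lookup table. The helper name `alexnet_train_command` and the script itself are hypothetical and not part of this repo; all flag values are copied from the commands above.

```python
import shlex

# --lambda2 per (dataset, hash_bit), copied from the commands above.
# Note that ImageNet at 64 bits uses 0.0001 rather than 0.001.
ALEXNET_LAMBDA2 = {
    ("imagenet", 16): "0.001",
    ("imagenet", 32): "0.001",
    ("imagenet", 64): "0.0001",
    ("nus_wide", 16): "0.001",
    ("nus_wide", 32): "0.001",
    ("nus_wide", 64): "0.001",
}

# --R per dataset, copied from the commands above.
TOP_R = {"imagenet": "1000", "nus_wide": "5000"}


def alexnet_train_command(data_name, hash_bit, gpus="2"):
    """Return the train.py command line for one AlexNet configuration."""
    args = [
        "python", "train.py",
        "--data_name", data_name,
        "--hash_bit", str(hash_bit),
        "--gpus", gpus,
        "--model_type", "Alexnet",
        "--lambda1", "0",
        "--lambda2", ALEXNET_LAMBDA2[(data_name, hash_bit)],
        "--R", TOP_R[data_name],
        "--eval_frequency", "1",
        "--lr", "0.0001",
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    for (data_name, hash_bit) in ALEXNET_LAMBDA2:
        print(alexnet_train_command(data_name, hash_bit))
```

The script only prints the command lines; pipe them to a shell or paste them individually to start the runs.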

Reference

If you find this repo useful, please consider citing:

@inproceedings{yuan2020central,
  title={Central Similarity Quantization for Efficient Image and Video Retrieval},
  author={Yuan, Li and Wang, Tao and Zhang, Xiaopeng and Tay, Francis EH and Jie, Zequn and Liu, Wei and Feng, Jiashi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3083--3092},
  year={2020}
}