ltttpku / CMMP

[ECCV 2024] Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection

arXiv, Project Page

Dataset

Follow the dataset preparation steps of UPT.

The downloaded files should be placed as follows. If you store them elsewhere, replace the default paths with your custom locations.

|- CMMP
|   |- hicodet
|   |   |- hico_20160224_det
|   |       |- annotations
|   |       |- images
|   |- vcoco
|   |   |- mscoco2014
|   |       |- train2014
|   |       |- val2014
:   :      
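
To confirm the layout above before training, a small check such as the following can be run from the CMMP root. This is a minimal sketch, not part of the repository: the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical, and only the directories listed in the tree are checked.

```python
from pathlib import Path

# Directories taken from the tree above, relative to the CMMP root.
EXPECTED_DIRS = [
    "hicodet/hico_20160224_det/annotations",
    "hicodet/hico_20160224_det/images",
    "vcoco/mscoco2014/train2014",
    "vcoco/mscoco2014/val2014",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing dataset directories:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All listed dataset directories were found.")
```

If the datasets live elsewhere, pass that location as `root` instead of `"."`.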

Dependencies

  1. Follow the environment setup in UPT.

  2. Our code is built upon CLIP. Install the local package of CLIP:

    cd CLIP && python setup.py develop && cd ..
  3. Download the CLIP weights to checkpoints/pretrained_clip.

    |- CMMP
    |   |- checkpoints
    |   |   |- pretrained_clip
    |   |       |- ViT-B-16.pt
    |   |       |- ViT-L-14-336px.pt
    :   :      
  4. Download the weights of DETR and put them in checkpoints/.

| Dataset  | DETR weights |
|----------|--------------|
| HICO-DET | weights      |
| V-COCO   | weights      |

|- CMMP
|   |- checkpoints
|   |   |- detr-r50-hicodet.pth
|   |   |- detr-r50-vcoco.pth
:   :   :

Pre-extracted Features

Download the pre-extracted features from HERE and the pre-extracted bboxes from HERE. The downloaded files have to be placed as follows.

|- CMMP
|   |- hicodet_pkl_files
|   |   |- union_embeddings_cachemodel_crop_padding_zeros_vitb16.p
|   |   |- hicodet_union_embeddings_cachemodel_crop_padding_zeros_vit336.p
|   |- vcoco_pkl_files
|   |   |- vcoco_union_embeddings_cachemodel_crop_padding_zeros_vit16.p
|   |   |- vcoco_union_embeddings_cachemodel_crop_padding_zeros_vit336.p
:   :      
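
To sanity-check a downloaded feature file, it can help to look at its top-level structure. The sketch below assumes the `.p` files are standard Python pickles; their internal layout is not documented here, so the helper (hypothetical name `summarize_pickle`) only reports the object type, its length, and a few keys. If the files contain PyTorch tensors, `torch` must be importable when loading. Only unpickle files from a source you trust.

```python
import pickle


def summarize_pickle(path, max_items=3):
    """Load a pickle file and describe its top-level object."""
    with open(path, "rb") as f:
        obj = pickle.load(f)  # assumption: the file is a plain pickle

    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["len"] = len(obj)
    if isinstance(obj, dict):
        # Show only the first few keys; the full key set may be large.
        summary["sample_keys"] = list(obj)[:max_items]
    return summary


if __name__ == "__main__":
    path = "hicodet_pkl_files/union_embeddings_cachemodel_crop_padding_zeros_vitb16.p"
    print(summarize_pickle(path))
```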

Train/Test

Please follow the commands in ./scripts.

Model Zoo

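| Method       | Type  | Unseen↑ | Seen↑ | Full↑ | HM↑   |
|--------------|-------|---------|-------|-------|-------|
| CMMP (Ours)  | RF-UC | 29.45   | 32.87 | 32.18 | 31.07 |
| CMMP† (Ours) | RF-UC | 35.98   | 37.42 | 37.13 | 36.69 |
| CMMP (Ours)  | NF-UC | 32.09   | 29.71 | 30.18 | 30.85 |
| CMMP† (Ours) | NF-UC | 33.52   | 35.53 | 35.13 | 34.50 |
| CMMP (Ours)  | UO    | 33.76   | 31.15 | 31.59 | 32.40 |
| CMMP† (Ours) | UO    | 39.67   | 36.15 | 36.74 | 37.83 |
| CMMP (Ours)  | UV    | 26.23   | 32.75 | 31.84 | 29.13 |
| CMMP† (Ours) | UV    | 30.84   | 37.28 | 36.38 | 33.75 |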
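
The HM column is consistent with the harmonic mean of the Unseen and Seen scores, a common summary metric in zero-shot evaluation. The sketch below recomputes it from the rows above. The names `ROWS`, `harmonic_mean`, and `max_hm_deviation` are illustrative and not part of the repository. Because the table values are rounded to two decimals, a recomputed value can differ from the reported one in the last digit.

```python
# Rows copied from the Model Zoo table:
# (method, setting, unseen, seen, full, reported_hm)
ROWS = [
    ("CMMP", "RF-UC", 29.45, 32.87, 32.18, 31.07),
    ("CMMP†", "RF-UC", 35.98, 37.42, 37.13, 36.69),
    ("CMMP", "NF-UC", 32.09, 29.71, 30.18, 30.85),
    ("CMMP†", "NF-UC", 33.52, 35.53, 35.13, 34.50),
    ("CMMP", "UO", 33.76, 31.15, 31.59, 32.40),
    ("CMMP†", "UO", 39.67, 36.15, 36.74, 37.83),
    ("CMMP", "UV", 26.23, 32.75, 31.84, 29.13),
    ("CMMP†", "UV", 30.84, 37.28, 36.38, 33.75),
]


def harmonic_mean(unseen, seen):
    """Harmonic mean of the unseen and seen mAP values."""
    return 2 * unseen * seen / (unseen + seen)


def max_hm_deviation(rows=ROWS):
    """Largest absolute gap between recomputed and reported HM."""
    return max(abs(harmonic_mean(u, s) - hm) for _, _, u, s, _, hm in rows)


if __name__ == "__main__":
    for method, setting, u, s, _, hm in ROWS:
        print(f"{method:6s} {setting:6s} reported={hm:.2f} "
              f"recomputed={harmonic_mean(u, s):.2f}")
```

The harmonic mean penalizes a large gap between the two scores, so a model cannot reach a high HM by doing well on seen categories alone.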

Model Weights

Download the model weights from Baidu Netdisk:

Link: https://pan.baidu.com/s/1XyWG2qjEXWghEYcc4-PGFA?pwd=zkh5
Password: zkh5

Alternatively, download the CMMP weights from Hugging Face:

https://huggingface.co/lttt/CMMP/tree/main
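
The Hugging Face repository can also be fetched from a script. The sketch below uses `snapshot_download` from the `huggingface_hub` package (installed separately with `pip install huggingface_hub`). The repository id comes from the URL above. The target directory `checkpoints/cmmp` and the function name `download_cmmp_weights` are assumptions for illustration; check the commands in ./scripts for the path the code actually expects.

```python
from pathlib import Path

# Repository id taken from the Hugging Face URL above.
REPO_ID = "lttt/CMMP"


def download_cmmp_weights(local_dir="checkpoints/cmmp"):
    """Download all files of the CMMP repository into local_dir.

    local_dir is a hypothetical location, not one prescribed by the README.
    """
    # Imported here so this module can be loaded without huggingface_hub.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)


if __name__ == "__main__":
    print("Weights downloaded to", download_cmmp_weights())
```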

Citation

If you find our paper and/or code helpful, please consider citing:

@inproceedings{ting2024CMMP,
  title={Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection},
  author={Ting Lei and Shaofeng Yin and Yuxin Peng and Yang Liu},
  year={2024},
  booktitle={ECCV}
}