Update (11/04/2023): Released the evaluation code on the FreiHAND dataset.
:loudspeaker: Update (10/09/2023): Our paper "CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting" has been accepted at ACM MM 2023! Stay tuned for more updates. :tada:
CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting
Shaoxiang Guo, Qing Cai*, Lin Qi and Junyu Dong* (*Corresponding Authors)
School of Computer Science and Technology, Ocean University of China, 238 Songling Road, Qingdao, China.
In our paper, we introduce CLIP-Hand3D, a novel method for 3D hand pose estimation from monocular images built on Contrastive Language-Image Pre-training (CLIP). We bridge the gap between text prompts and the irregular distribution of hand joint positions in 3D space by encoding pose labels into text representations and the hand joints' spatial distribution into pose-aware features, and we maximize the semantic consistency between the pose and text features with a CLIP-style contrastive learning paradigm. Our method, which includes a coarse-to-fine mesh regressor, achieves performance comparable to the state of the art with significantly faster inference on several public hand benchmarks. In this GitHub repository, we release the corresponding code. As a first step, we provide a simple zero-shot demo that shows the semantic relations between hand images and pose text prompts.
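As a taste of that demo, here is a minimal zero-shot sketch using the off-the-shelf OpenAI CLIP package, not the trained model from this repository; the image path `hand.jpg`, the prompt wording, and the ViT-B/32 backbone are all illustrative assumptions:

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# Off-the-shelf CLIP backbone; the paper trains a CLIP-style contrastive model
# on pose text, while this sketch only illustrates zero-shot prompt scoring.
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical pose text prompts; in CLIP-Hand3D the text is generated from pose labels.
prompts = [
    "a photo of an open right hand",
    "a photo of a clenched fist",
    "a photo of a hand making an OK sign",
]

image = preprocess(Image.open("hand.jpg")).unsqueeze(0).to(device)
text = clip.tokenize(prompts).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    # Cosine similarity between the hand image and each pose prompt.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    similarity = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for prompt, score in zip(prompts, similarity[0].tolist()):
    print(f"{score:.3f}  {prompt}")
```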
Environment
```
conda create -n cliphand python=3.9
conda activate cliphand
```
Package Requirements
```
pip install -r requirements.txt
```
Install CLIP. Please follow the instructions in the official CLIP repository to install the CLIP module.
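For reference, the install procedure documented in the OpenAI CLIP README is the following (verify against the upstream README, as it may change):

```
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
```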
Download the pre-trained weights: Google Drive
FreiHAND Dataset
Run the following command:
```
python eval/eval_frei.py
```
Remember to replace the file paths in the `evaluation(...)` call below with your own:
```python
evaluation(
    model_weight_path='/your/path/to/CLIP_Hand_3D_PE_0604_44.pth.tar',
    Vertx_dict_path='/your/path/to/vertices.npy',
    face_path='/your/path/to/right_faces.npy',
)
```
Put the pre-trained weights anywhere on disk and record their location, e.g.
```
/home/anonymous/CLIP_HAND_3D_0402.pth.tar
```
Extract the FreiHAND dataset to disk and record its location, e.g.
```
/home/anonymous/FreiHAND
```
Modify the parameters in demo/main.py:
```python
WEIGHT_PATH = YOUR_WEIGHT_PATH     # path to the pre-trained weights (.pth.tar)
DATASET_PATH = YOUR_FreiHAND_PATH  # path to the extracted FreiHAND dataset
BATCH_SIZE = YOUR_BS               # batch size that fits your GPU memory
```
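For example, using the locations recorded above (the batch size is an illustrative choice, not a value recommended by the paper):

```python
WEIGHT_PATH = "/home/anonymous/CLIP_HAND_3D_0402.pth.tar"
DATASET_PATH = "/home/anonymous/FreiHAND"
BATCH_SIZE = 32
```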
Run and Output
```
python main.py
```
If you are interested in our work or find it helpful, please consider citing our paper:
```bibtex
@inproceedings{guo_clip_hand,
  title={CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting},
  author={Guo, Shaoxiang and Cai, Qing and Qi, Lin and Dong, Junyu},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  year={2023},
  organization={ACM}
}
```
We also referenced the following codebases, whose outstanding work inspired us. We would like to thank their authors.