IDEA-Research / Click-Pose

[ICCV 2023] Official implementation of the paper "Neural Interactive Keypoint Detection"

Neural Interactive Keypoint Detection

This is the official PyTorch implementation of our ICCV 2023 paper "Neural Interactive Keypoint Detection."

Jie Yang, Ailing Zeng, Feng Li, Shilong Liu, Ruimao Zhang, Lei Zhang

Keywords: 👯 Multi-person 2D pose estimation, 💃 Human-in-the-loop, 🤝 Interactive model

❤️ Highlights

💙 Click-Pose

🚀 Model Zoo

1. Model-Only Results

COCO val2017 set

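| Model | Backbone | Lr schd | mAP | AP50 | AP75 | APM | APL | Time (ms) | Download |
|---|---|---|---|---|---|---|---|---|---|
| ED-Pose | ResNet-50 | 60e | 71.7 | 89.7 | 78.8 | 66.2 | 79.7 | 51 | GitHub, Model |
| Click-Pose | ResNet-50 | 40e | 73.0 | 90.4 | 80.0 | 68.1 | 80.5 | 48 | Google Drive |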

Human-Art val set

| Model | Backbone | mAP | APM | APL | Download |
|---|---|---|---|---|---|
| ED-Pose | ResNet-50 | 37.5 | 7.6 | 41.1 | GitHub, Model |
| Click-Pose | ResNet-50 | 40.5 | 8.3 | 44.2 | Google Drive |

OCHuman test set

| Model | Backbone | mAP | AP50 | AP75 | Download |
|---|---|---|---|---|---|
| ED-Pose | ResNet-50 | 31.4 | 39.5 | 35.1 | GitHub, Model |
| Click-Pose | ResNet-50 | 33.9 | 43.4 | 37.5 | Google Drive |

Note that the model is trained on the COCO train2017 set and evaluated on the COCO val2017, Human-Art val, and OCHuman test sets.

2. Neural Interactive Results

In-domain Annotation (COCO val2017)

| Model | Backbone | NoC@85 | NoC@90 | NoC@95 | Download |
|---|---|---|---|---|---|
| ViTPose | ViT-Huge | 1.46 | 2.15 | 2.87 | GitHub, Model |
| Click-Pose | ResNet-50 | 0.95 | 1.48 | 1.97 | Google Drive |

Out-of-domain Annotation (Human-Art val)

| Model | Backbone | NoC@85 | NoC@90 | NoC@95 | Download |
|---|---|---|---|---|---|
| ViTPose | ViT-Huge | 9.12 | 9.79 | 10.13 | GitHub, Model |
| Click-Pose | ResNet-50 | 4.82 | 5.81 | 6.45 | Google Drive |
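
The NoC@85, NoC@90, and NoC@95 columns are not defined in this README. The following is a minimal sketch, assuming NoC@X means the average number of corrective clicks needed before a sample's precision reaches X%, with samples that never reach the target charged a full click budget. The budget of 17 is an assumption (it matches the `{1..17}` loop in the evaluation commands), and the function names and toy score lists are hypothetical, not taken from the repository.

```python
from typing import Sequence

MAX_CLICKS = 17  # assumption: click budget, one click per COCO keypoint


def clicks_to_reach(scores: Sequence[float], threshold: float,
                    max_clicks: int = MAX_CLICKS) -> int:
    """Return the smallest click count whose score meets the threshold.

    scores[k] is the (hypothetical) precision after k clicks, so scores[0]
    is the model-only result. Samples that never reach the threshold are
    charged the full click budget.
    """
    for k, score in enumerate(scores[: max_clicks + 1]):
        if score >= threshold:
            return k
    return max_clicks


def noc(all_scores: Sequence[Sequence[float]], threshold: float) -> float:
    """Average clicks-to-threshold over all samples."""
    counts = [clicks_to_reach(s, threshold) for s in all_scores]
    return sum(counts) / len(counts)


# Toy data: three samples, precision after 0, 1, 2, ... clicks.
toy = [
    [0.96, 0.97],        # already above the target: 0 clicks
    [0.80, 0.90, 0.95],  # reaches the target after 2 clicks
    [0.50, 0.60, 0.70],  # never reaches 0.95: charged the full budget
]
print(noc(toy, 0.95))
```

Under this reading, a lower NoC means less annotation effort, which is how the tables above should be compared.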

🔨 Environment Setup

Installation

We use [ED-Pose](https://github.com/IDEA-Research/ED-Pose) as our codebase. We tested our models with `python=3.7.3, pytorch=1.9.0, cuda=11.1`. Other versions may work as well.

1. Clone this repo

```sh
git clone https://github.com/IDEA-Research/Click-Pose.git
cd Click-Pose
```

2. Install PyTorch and torchvision

Follow the instructions at https://pytorch.org/get-started/locally/.

```sh
# an example:
conda install -c pytorch pytorch torchvision
```

3. Install the other required packages

```sh
pip install -r requirements.txt
```

4. Compile the CUDA operators

```sh
cd models/clickpose/ops
python setup.py build install
# unit test (all checks should print True)
python test.py
cd ../../..
```
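
Before compiling the CUDA operators, it can help to confirm which versions are actually installed. The following is a small sketch for that purpose; the function name `describe_environment` is hypothetical and not part of this repository. It only reports versions and does not enforce the tested configuration.

```python
import importlib.util
import sys


def describe_environment() -> dict:
    """Collect the versions relevant to the tested configuration."""
    info = {
        "python": ".".join(map(str, sys.version_info[:3])),
        "torch": None,
        "cuda": None,
        "cuda_available": False,
    }
    # Import torch only if it is installed, so this also runs before step 2.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None for CPU-only builds
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cuda` is reported as `None` or `cuda_available` is `False`, the operator build in step 4 is unlikely to succeed.
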

Data Preparation

**For COCO data**, please download from [COCO download](http://cocodataset.org/#download). The coco_dir should look like this:

```
|-- Click-Pose
`-- |-- coco_dir
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- 000000000030.jpg
            |   |-- ...
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- 000000000632.jpg
                |-- ...
```

**For Human-Art data**, please download from [Human-Art download](https://github.com/IDEA-Research/HumanArt). The humanart_dir should look like this:

```
|-- Click-Pose
`-- |-- humanart_dir
    `-- |-- annotations
        |   |-- training_humanart.json
        |   |-- validation_humanart.json
        `-- images
            |-- 2D_virtual_human
                |-- ...
            |-- 3D_virtual_human
                |-- ...
            |-- real_human
                |-- ...
```

**For CrowdPose data**, please download from [CrowdPose download](https://github.com/Jeff-sjtu/CrowdPose#dataset). The crowdpose_dir should look like this:

```
|-- Click-Pose
`-- |-- crowdpose_dir
    `-- |-- json
        |   |-- crowdpose_train.json
        |   |-- crowdpose_val.json
        |   |-- crowdpose_trainval.json (generated by util/crowdpose_concat_train_val.py)
        |   `-- crowdpose_test.json
        `-- images
            |-- 100000.jpg
            |-- 100001.jpg
            |-- 100002.jpg
            |-- 100003.jpg
            |-- 100004.jpg
            |-- 100005.jpg
            |-- ...
```

**For OCHuman data**, please download from [OCHuman download](https://github.com/liruilong940607/OCHumanApi). The ochuman_dir should look like this:

```
|-- Click-Pose
`-- |-- ochuman_dir
    `-- |-- annotations
        `-- images
```
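
A misplaced annotation file is easier to catch before launching a run. The following sketch checks the layouts described above; the `EXPECTED` table and the helper `missing_entries` are hypothetical names, and only the paths explicitly listed in the directory trees are checked (individual image files are not).

```python
from pathlib import Path
from typing import Dict, List

# Relative paths taken from the directory trees above.
EXPECTED: Dict[str, List[str]] = {
    "coco_dir": [
        "annotations/person_keypoints_train2017.json",
        "annotations/person_keypoints_val2017.json",
        "images/train2017",
        "images/val2017",
    ],
    "humanart_dir": [
        "annotations/training_humanart.json",
        "annotations/validation_humanart.json",
        "images",
    ],
    "crowdpose_dir": [
        "json/crowdpose_train.json",
        "json/crowdpose_val.json",
        "json/crowdpose_trainval.json",
        "json/crowdpose_test.json",
        "images",
    ],
    "ochuman_dir": ["annotations", "images"],
}


def missing_entries(root: Path, dataset_dir: str) -> List[str]:
    """Return the expected paths that do not exist under root/dataset_dir."""
    base = Path(root) / dataset_dir
    return [rel for rel in EXPECTED[dataset_dir] if not (base / rel).exists()]


if __name__ == "__main__":
    # Run from the Click-Pose checkout; datasets that were not downloaded
    # are simply reported as missing.
    for name in EXPECTED:
        gaps = missing_entries(Path("."), name)
        print(name, "OK" if not gaps else f"missing: {gaps}")
```
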

🥳 Run

Train on COCO:

Model-Only

```
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
  --output_dir "logs/ClickPose_Model-Only" \
  -c config/clickpose.cfg.py \
  --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
  --dataset_file="coco"
```

Neural Interactive

```
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
  --output_dir "logs/ClickPose_Neural_Interactive" \
  -c config/clickpose.cfg.py \
  --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
  --dataset_file="coco"
```

Evaluation on COCO:

Model-Only

```
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
  --output_dir "logs/ClickPose_Model-Only_eval" \
  -c config/clickpose.cfg.py \
  --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
  --dataset_file="coco" \
  --pretrain_model_path "./models/ClickPose_model_only_R50.pth" \
  --eval
```

Neural Interactive-NoC metric

```
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
export CLICKPOSE_NoC_Test="TRUE"
export CLICKPOSE_SAVE_PATH="./NoC_95_coco.json"
export NoC_thr=0.95
python -m torch.distributed.launch --nproc_per_node=1 --master_port 3458 main.py \
  --output_dir "logs/ClickPose_Neural_Interactive_eval" \
  -c config/clickpose.cfg.py \
  --options batch_size=1 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=TRUE feedback_inference=TRUE only_correction=FALSE num_select=20 \
  --dataset_file="coco" \
  --pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
  --eval
```

Neural Interactive-AP metric

```
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
export CLICKPOSE_NoC_Test="TRUE"
for CLICKPOSE_Click_Number in {1..17}
do
  python -m torch.distributed.launch --nproc_per_node=4 --master_port 3458 main.py \
    --output_dir "logs/ClickPose_Neural_Interactive_eval" \
    -c config/clickpose.cfg.py \
    --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=TRUE only_correction=FALSE num_select=20 \
    --dataset_file="coco" \
    --pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
    --eval
done
```
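
The AP-metric loop above sweeps `CLICKPOSE_Click_Number` from 1 to 17. A shell loop variable is only visible to a child process if it is exported, so the sketch below sets it explicitly in each run's environment. It assumes `main.py` reads the click count from an environment variable of that name, which this README does not state explicitly. The helper `build_runs` is hypothetical, and the sketch only prints the commands rather than launching them.

```python
import os
import shlex
from typing import Dict, List, Tuple

# Option string copied from the COCO AP-metric command above.
OPTIONS = (
    "batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE "
    "feedback_loop_NOC_test=FALSE feedback_inference=TRUE "
    "only_correction=FALSE num_select=20"
).split()


def build_runs(coco_dir: str,
               max_clicks: int = 17) -> List[Tuple[Dict[str, str], List[str]]]:
    """Return one (environment, argv) pair per click count."""
    runs = []
    for clicks in range(1, max_clicks + 1):
        env = dict(os.environ)
        env["CLICKPOSE_COCO_PATH"] = coco_dir
        env["CLICKPOSE_NoC_Test"] = "TRUE"
        # Assumption: main.py picks the click count up from the environment.
        env["CLICKPOSE_Click_Number"] = str(clicks)
        argv = [
            "python", "-m", "torch.distributed.launch",
            "--nproc_per_node=4", "--master_port", "3458", "main.py",
            "--output_dir", "logs/ClickPose_Neural_Interactive_eval",
            "-c", "config/clickpose.cfg.py",
            "--options", *OPTIONS,
            "--dataset_file=coco",
            "--pretrain_model_path", "./models/ClickPose_interactive_R50.pth",
            "--eval",
        ]
        runs.append((env, argv))
    return runs


# Dry run: show what would be launched for each click count.
for env, argv in build_runs("/path/to/your/coco_dir"):
    command = " ".join(shlex.quote(arg) for arg in argv)
    print(f"CLICKPOSE_Click_Number={env['CLICKPOSE_Click_Number']} {command}")
```

To execute a run instead of printing it, each pair can be passed to `subprocess.run(argv, env=env, check=True)`.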

Evaluation on Human-Art:

Model-Only

```
export CLICKPOSE_HumanArt_PATH=/path/to/your/humanart_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
  --output_dir "logs/ClickPose_Model-Only_eval" \
  -c config/clickpose.cfg.py \
  --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
  --dataset_file="humanart" \
  --pretrain_model_path "./models/ClickPose_model_only_R50.pth" \
  --eval
```

Neural Interactive-NoC metric

```
export CLICKPOSE_HumanArt_PATH=/path/to/your/humanart_dir
export CLICKPOSE_NoC_Test="TRUE"
export CLICKPOSE_SAVE_PATH="./NoC_95_humanart.json"
export NoC_thr=0.95
python -m torch.distributed.launch --nproc_per_node=1 --master_port 3458 main.py \
  --output_dir "logs/ClickPose_Neural_Interactive_eval" \
  -c config/clickpose.cfg.py \
  --options batch_size=1 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=TRUE feedback_inference=TRUE only_correction=FALSE num_select=20 \
  --dataset_file="humanart" \
  --pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
  --eval
```

Neural Interactive-AP metric

```
export CLICKPOSE_HumanArt_PATH=/path/to/your/humanart_dir
export CLICKPOSE_NoC_Test="TRUE"
for CLICKPOSE_Click_Number in {1..17}
do
  python -m torch.distributed.launch --nproc_per_node=4 --master_port 3458 main.py \
    --output_dir "logs/ClickPose_Neural_Interactive_eval" \
    -c config/clickpose.cfg.py \
    --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=TRUE only_correction=FALSE num_select=20 \
    --dataset_file="humanart" \
    --pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
    --eval
done
```

Evaluation on OCHuman:

Model-Only

```
export CLICKPOSE_OCHuman_PATH=/path/to/your/ochuman_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
  --output_dir "logs/ClickPose_Model-Only_eval" \
  -c config/clickpose.cfg.py \
  --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
  --dataset_file="ochuman" \
  --pretrain_model_path "./models/ClickPose_model_only_R50.pth" \
  --eval
```

Neural Interactive-NoC metric

```
export CLICKPOSE_OCHuman_PATH=/path/to/your/ochuman_dir
export CLICKPOSE_NoC_Test="TRUE"
export CLICKPOSE_SAVE_PATH="./NoC_95_ochuman.json"
export NoC_thr=0.95
python -m torch.distributed.launch --nproc_per_node=1 --master_port 3458 main.py \
  --output_dir "logs/ClickPose_Neural_Interactive_eval" \
  -c config/clickpose.cfg.py \
  --options batch_size=1 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=TRUE feedback_inference=TRUE only_correction=FALSE num_select=20 \
  --dataset_file="ochuman" \
  --pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
  --eval
```

Neural Interactive-AP metric

```
export CLICKPOSE_OCHuman_PATH=/path/to/your/ochuman_dir
export CLICKPOSE_NoC_Test="TRUE"
for CLICKPOSE_Click_Number in {1..17}
do
  python -m torch.distributed.launch --nproc_per_node=4 --master_port 3458 main.py \
    --output_dir "logs/ClickPose_Neural_Interactive_eval" \
    -c config/clickpose.cfg.py \
    --options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=TRUE only_correction=FALSE num_select=20 \
    --dataset_file="ochuman" \
    --pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
    --eval
done
```
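
The Model-Only evaluation commands for COCO, Human-Art, and OCHuman differ only in the dataset path variable and the `--dataset_file` value. The sketch below captures that mapping in one place. The `DATASETS` table and the helper `model_only_eval` are hypothetical names introduced for illustration; the variable names, flags, and checkpoint path are copied from the commands above.

```python
from typing import Dict, List, Tuple

# --dataset_file value -> environment variable holding the dataset root.
DATASETS: Dict[str, str] = {
    "coco": "CLICKPOSE_COCO_PATH",
    "humanart": "CLICKPOSE_HumanArt_PATH",
    "ochuman": "CLICKPOSE_OCHuman_PATH",
}

OPTIONS = (
    "batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE "
    "feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE"
).split()


def model_only_eval(dataset: str,
                    data_root: str) -> Tuple[Dict[str, str], List[str]]:
    """Return (environment additions, argv) for a Model-Only evaluation run."""
    if dataset not in DATASETS:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(DATASETS)}"
        )
    env_update = {DATASETS[dataset]: data_root}
    argv = [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node=4", "main.py",
        "--output_dir", "logs/ClickPose_Model-Only_eval",
        "-c", "config/clickpose.cfg.py",
        "--options", *OPTIONS,
        f"--dataset_file={dataset}",
        "--pretrain_model_path", "./models/ClickPose_model_only_R50.pth",
        "--eval",
    ]
    return env_update, argv


env_update, argv = model_only_eval("humanart", "/path/to/your/humanart_dir")
print(env_update)
print(" ".join(argv))
```

Keeping the mapping in a single table makes it harder to pair a dataset name with the wrong path variable when switching between benchmarks.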

Cite Click-Pose

If you find this repository useful for your work, please consider citing it as follows:

@inproceedings{yang2023neural,
  title={Neural Interactive Keypoint Detection},
  author={Yang, Jie and Zeng, Ailing and Li, Feng and Liu, Shilong and Zhang, Ruimao and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={15122--15132},
  year={2023}
}
@inproceedings{yang2022explicit,
  title={Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation},
  author={Yang, Jie and Zeng, Ailing and Liu, Shilong and Li, Feng and Zhang, Ruimao and Zhang, Lei},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023}
}