
HO-Cap Toolkit


The HO-Cap Toolkit is a Python package that provides evaluation and visualization tools for the HO-Cap dataset.

HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction

Jikai Wang, Qifan Zhang, Yu-Wei Chao, Bowen Wen, Xiaohu Guo, Yu Xiang

[ arXiv ](https://arxiv.org/abs/2406.06843) [ Project page ](https://irvlutd.github.io/HOCap)

hocap-demo-video


Contents

- News
- BibTeX Citation
- License
- Installation
- Download the HO-Cap Dataset
- Loading Dataset and Visualizing Samples
- Evaluation

News

BibTeX Citation

If HO-Cap helps your research, please consider citing the following:

@misc{wang2024hocap,
      title={HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction}, 
      author={Jikai Wang and Qifan Zhang and Yu-Wei Chao and Bowen Wen and Xiaohu Guo and Yu Xiang},
      year={2024},
      eprint={2406.06843},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

The HO-Cap Toolkit is released under the GNU General Public License v3.0.

Installation

This code is tested with Python 3.10 and CUDA 11.8 on Ubuntu 20.04. Make sure CUDA 11.8 is installed on your system before running the code.
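Once PyTorch is installed (it is pulled in during step 5 below), a quick way to confirm that your build matches the required CUDA version is the minimal sketch below; the exact version string depends on your PyTorch build:

    # cuda_check.py -- sanity-check the CUDA setup (assumes PyTorch is installed).
    import torch

    print("PyTorch CUDA build:", torch.version.cuda)        # expected: "11.8"
    print("CUDA device available:", torch.cuda.is_available())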

  1. Clone the HO-Cap repository from GitHub.

    git clone --recursive git@github.com:IRVLUTD/HO-Cap.git
  2. Change the current directory to the cloned repository.

    cd HO-Cap
  3. Create conda environment

    conda create -n hocap-toolkit python=3.10
  4. Activate conda environment

    conda activate hocap-toolkit
  5. Install hocap-toolkit package and dependencies

    # Install dependencies
    python -m pip install --no-cache-dir -r requirements.txt
    
    # Build meshsdf_loss
    bash build.sh
    
    # Install hocap-toolkit
    python -m pip install -e .
  6. Download models for external libraries

    bash download_models.sh
  7. Download the MANO models and code (mano_v1_2.zip) from the MANO website and place the extracted .pkl files under the config/ManoModels directory (a quick file check follows these steps). The directory should look like this:

    ./config/ManoModels
    ├── MANO_LEFT.pkl
    └── MANO_RIGHT.pkl
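To confirm the files are in place, a minimal check such as the one below can be run from the repository root (standard library only; the paths are the ones from step 7):

    # check_mano.py -- verify the MANO model files are where the toolkit expects them.
    from pathlib import Path

    mano_dir = Path("config/ManoModels")
    for name in ("MANO_LEFT.pkl", "MANO_RIGHT.pkl"):
        path = mano_dir / name
        print(f"{path}: {'found' if path.exists() else 'MISSING'}")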

Download the HO-Cap Dataset

  1. Run the command below to download the whole dataset:

    python dataset_downloader.py --subject_id all
  2. Or you can download the dataset for a specific subject:

    python dataset_downloader.py --subject_id 1
  3. The downloaded .zip files will be extracted to the ./data directory, which should then look like this (a small enumeration sketch follows the tree):

    ./data
    ├── calibration
    ├── models
    ├── subject_1
    │   ├── 20231025_165502
    │   ├── ...
    ├── ...
    └── subject_9
        ├── 20231027_123403
        ├── ...
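As a sketch of how the extracted layout can be enumerated (standard library only; assumes the ./data tree shown above):

    # list_dataset.py -- list subjects and their recording sequences under ./data.
    from pathlib import Path

    data_root = Path("data")
    for subject in sorted(data_root.glob("subject_*")):
        sequences = sorted(p.name for p in subject.iterdir() if p.is_dir())
        print(f"{subject.name}: {len(sequences)} sequences, first: {sequences[:1]}")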

Loading Dataset and Visualizing Samples

  1. The example below shows how to visualize the pose annotations of one frame:

    python examples/sequence_pose_viewer.py

    sequence_pose_viewer

  2. The example below shows how to visualize a sequence in the interactive 3D viewer:

    python examples/sequence_3d_viewer.py

    sequence_3d_viewer

  3. The example below shows how to render a sequence offline:

    python examples/sequence_renderer.py

    This renders the color image and the segmentation map for every frame in the sequence. The rendered images are saved in the <sequence_folder>/renders/ directory; a sketch for loading them back follows this list.

    sequence_renderer_color sequence_renderer_mask
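The rendered outputs are ordinary image files, so they can be read back with any image library. A minimal sketch, assuming Pillow and NumPy are available; the frame file names here are hypothetical, so check your renders/ folder for the actual naming:

    # load_render.py -- read one rendered color image and segmentation map back in.
    # NOTE: the file names below are hypothetical; check renders/ for the real naming.
    import numpy as np
    from PIL import Image

    renders_dir = "data/subject_1/20231025_165502/renders"  # replace with your sequence's renders/
    color = np.asarray(Image.open(f"{renders_dir}/color_000000.png"))
    mask = np.asarray(Image.open(f"{renders_dir}/mask_000000.png"))
    print("color:", color.shape, "mask labels:", np.unique(mask))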

Evaluation

HO-Cap provides benchmark evaluations for three tasks: hand pose estimation, novel object pose estimation, and novel object detection.

Run the command below to download the example evaluation results:

python config/benchmarks/benchmark_downloader.py

If the evaluation results are saved in the same format, the evaluation code below can be used to score them.

Hand Pose Estimation Evaluation[^1][^2]
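For reference, hand pose results of this kind are typically scored with mean per-joint position error (MPJPE). The sketch below is a generic illustration, not the toolkit's own implementation, and the array shapes are assumptions:

    # mpjpe_sketch.py -- generic MPJPE illustration; NOT the toolkit's implementation.
    import numpy as np

    def mpjpe_mm(pred, gt):
        """pred, gt: (N, 21, 3) arrays of 3D hand joints in meters (assumed layout)."""
        return float(np.linalg.norm(pred - gt, axis=-1).mean() * 1000.0)  # meters -> mm

    # Toy usage with synthetic data:
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(10, 21, 3))
    pred = gt + rng.normal(scale=0.005, size=gt.shape)  # ~5 mm of noise
    print(f"MPJPE: {mpjpe_mm(pred, gt):.2f} mm")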

Novel Object Pose Estimation Evaluation[^3][^4]
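For object pose, a common metric is ADD, the average distance between model points transformed by the estimated and the ground-truth poses; again a generic sketch, not the toolkit's code:

    # add_sketch.py -- generic ADD metric illustration; NOT the toolkit's implementation.
    import numpy as np

    def add_metric(points, R_pred, t_pred, R_gt, t_gt):
        """points: (M, 3) model points; R_*: (3, 3) rotations; t_*: (3,) translations."""
        p_pred = points @ R_pred.T + t_pred
        p_gt = points @ R_gt.T + t_gt
        return float(np.linalg.norm(p_pred - p_gt, axis=-1).mean())

    # Toy usage: a pure 1 cm translation error offsets every point by exactly 1 cm.
    pts = np.random.default_rng(1).normal(size=(500, 3))
    print(add_metric(pts, np.eye(3), np.array([0.0, 0.0, 0.01]), np.eye(3), np.zeros(3)))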

Novel Object Detection Evaluation[^5][^6]
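Detection results are commonly scored with intersection-over-union (IoU) between predicted and ground-truth boxes; one more generic sketch, not the toolkit's code (the [x1, y1, x2, y2] box format is an assumption):

    # iou_sketch.py -- generic box IoU illustration; NOT the toolkit's implementation.
    def box_iou(a, b):
        """a, b: (4,) boxes as [x1, y1, x2, y2] (assumed format)."""
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    print(box_iou([0, 0, 10, 10], [5, 5, 15, 15]))  # -> 25 / 175 ~ 0.143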

[^1]: A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image
[^2]: Reconstructing Hands in 3D with Transformers
[^3]: MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare
[^4]: FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
[^5]: CNOS: A Strong Baseline for CAD-based Novel Object Segmentation
[^6]: Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection