This repository contains the implementation of the following paper:
Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset
Jing Lin\*¹², Ailing Zeng\*†¹, Shunlin Lu\*¹³, Yuanhao Cai², Ruimao Zhang³, Haoqian Wang², Lei Zhang¹
\*Equal contribution. †Corresponding author. ¹International Digital Economy Academy ²Tsinghua University ³The Chinese University of Hong Kong, Shenzhen
representation) and split files. Stay tuned!
We propose a high-accuracy and efficient annotation pipeline for whole-body motions and the corresponding text labels. Based on it, we build a large-scale 3D expressive whole-body human motion dataset from massive online videos and eight existing motion datasets. We unify them into the same formats, providing whole-body motion (i.e., SMPL-X) and corresponding text labels.
Labels from Motion-X:

- Motion labels: 15.6M whole-body poses and 81.1K motion clip annotations, represented as SMPL-X parameters. All motions are unified at 30 fps.
- Text labels: (1) 15.6M frame-level whole-body pose descriptions and (2) 81.1K sequence-level semantic labels.

Supported Tasks:
| Dataset | Clip Number | Frame Number | Website | License | Downloading Link |
|---|---|---|---|---|---|
| AMASS | 26K | 5.4M | AMASS Website | AMASS License | AMASS Data |
| EgoBody | 1.0K | 0.4M | EgoBody Website | EgoBody License | EgoBody Data |
| GRAB | 1.3K | 0.4M | GRAB Website | GRAB License | GRAB Data |
| IDEA400 | 12.5K | 2.6M | IDEA400 Website | IDEA400 License | IDEA400 Data |
| AIST++ | 1.4K | 0.3M | AIST++ Website | AIST++ License | AIST++ Data |
| HAA500 | 5.2K | 0.3M | HAA500 Website | HAA500 License | HAA500 Data |
| HuMMan | 0.7K | 0.1M | HuMMan Website | HuMMan License | HuMMan Data |
| BAUM | 1.4K | 0.2M | BAUM Website | BAUM License | BAUM Data |
| Online Videos | 32.5K | 6.0M | --- | --- | Online Data |
| Motion-X (Ours) | 81.1K | 15.6M | Motion-X Website | Motion-X License | Motion-X Data |
We disseminate Motion-X in a manner that aligns with the original data sources. Here are the instructions:
Please fill out this form to request authorization to use Motion-X for non-commercial purposes. You will then receive an email; please download the motion and text labels from the provided download links. The pose texts can be downloaded from here; please extract the body_texts and hand_texts folders from the downloaded motionx_pose_text.zip. (Note: we updated the Baidu Disk links for motionx_seq_face_text.zip and motionx_face_motion.zip on October 29, 2023. If you downloaded these zips via Baidu Disk before October 29, please fill out the form and download them again.)
```
../datasets
├── motion_data
│   └── smplx_322
│       ├── idea400
│       └── ...
├── face_motion_data
│   └── smplx_322
│       ├── humanml
│       ├── EgoBody
│       └── GRAB
└── texts
    ├── semantic_labels
    │   ├── idea400
    │   └── ...
    ├── face_texts
    │   ├── humanml
    │   ├── EgoBody
    │   ├── GRAB
    │   ├── idea400
    │   └── ...
    ├── body_texts
    │   ├── humanml
    │   ├── EgoBody
    │   ├── GRAB
    │   ├── idea400
    │   └── ...
    └── hand_texts
        ├── humanml
        ├── EgoBody
        ├── GRAB
        ├── idea400
        └── ...
```
For the non-mocap subsets, please refer to this link for detailed instructions, notably:
For the mocap datasets (i.e., AMASS, GRAB, EgoBody), please refer to this link for detailed instructions, notably:
The AMASS and GRAB datasets are released for academic research under custom licenses by Max Planck Institute for Intelligent Systems. To download AMASS and GRAB, you must register as a user on the dataset websites and agree to the terms and conditions of each license:
https://amass.is.tue.mpg.de/license.html
https://grab.is.tuebingen.mpg.de/license.html
```
../datasets
├── motion_data
│   └── smplx_322
│       ├── humanml
│       ├── EgoBody
│       ├── GRAB
│       ├── idea400
│       └── ...
└── texts
    ├── semantic_labels
    │   ├── idea400
    │   └── ...
    ├── face_texts
    │   ├── humanml
    │   ├── EgoBody
    │   ├── GRAB
    │   ├── idea400
    │   └── ...
    ├── body_texts
    │   ├── humanml
    │   ├── EgoBody
    │   ├── GRAB
    │   ├── idea400
    │   └── ...
    └── hand_texts
        ├── humanml
        ├── EgoBody
        ├── GRAB
        ├── idea400
        └── ...
```
To load the motion and text labels, you can simply do:

```python
import numpy as np
import torch

# read a motion clip and split its 322-dim SMPL-X representation
motion = np.load('motion_data/smplx_322/000001.npy')
motion = torch.tensor(motion).float()
motion_parms = {
    'root_orient': motion[:, :3],          # controls the global root orientation
    'pose_body': motion[:, 3:3+63],        # controls the body
    'pose_hand': motion[:, 66:66+90],      # controls the finger articulation
    'pose_jaw': motion[:, 66+90:66+93],    # controls the jaw pose
    'face_expr': motion[:, 159:159+50],    # controls the facial expression
    'face_shape': motion[:, 209:209+100],  # controls the face shape
    'trans': motion[:, 309:309+3],         # controls the global body position
    'betas': motion[:, 312:],              # controls the body shape (static over the sequence)
}

# read the sequence-level semantic label (a plain-text file)
with open('semantic_labels/000001.txt') as f:
    semantic_text = f.read()
```
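As a cross-check on the slicing above, the widths of the eight parameter blocks must sum to 322. A minimal sketch (using a zero-filled stand-in array rather than a real clip) that rebuilds the same split from the block widths:

```python
import numpy as np

# Block widths of the 322-dim SMPL-X representation, in order:
# root_orient(3) + pose_body(63) + pose_hand(90) + pose_jaw(3)
# + face_expr(50) + face_shape(100) + trans(3) + betas(10) = 322
LAYOUT = {
    'root_orient': 3, 'pose_body': 63, 'pose_hand': 90, 'pose_jaw': 3,
    'face_expr': 50, 'face_shape': 100, 'trans': 3, 'betas': 10,
}

def split_smplx(motion):
    """Split a (num_frames, 322) array into named SMPL-X parameter blocks."""
    assert motion.shape[1] == sum(LAYOUT.values()) == 322
    parms, offset = {}, 0
    for name, width in LAYOUT.items():
        parms[name] = motion[:, offset:offset + width]
        offset += width
    return parms

# stand-in for a 1-second clip at 30 fps
motion = np.zeros((30, 322), dtype=np.float32)
parms = split_smplx(motion)
```

Running the same offsets by hand reproduces the slices in the README snippet (e.g. `face_expr` starts at 3+63+90+3 = 159, `trans` at 309).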
We support visualization in both camera space and world space; please refer to this guidance.
Our annotation pipeline significantly surpasses existing SOTA 2D whole-body models and mesh recovery methods.
If you find this repository useful for your work, please consider citing it as follows:
@article{lin2023motionx,
title={Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset},
author={Lin, Jing and Zeng, Ailing and Lu, Shunlin and Cai, Yuanhao and Zhang, Ruimao and Wang, Haoqian and Zhang, Lei},
journal={Advances in Neural Information Processing Systems},
year={2023}
}