Scaling Up Dynamic Human-Scene Interaction Modeling
This is the code repository for Scaling Up Dynamic Human-Scene Interaction Modeling, presented at CVPR 2024 (highlight).
News
https://github.com/user-attachments/assets/3a510469-0146-4e14-a259-54366168001a
- The training code has been updated.
- The action annotation is released! Download the full dataset from Google Drive; the action annotations are in the Actions folder, each with a start frame, an end frame, and a textual description.
- All interactable objects in the TRUMANS dataset are released.
Human Motion Synthesis in Editable Indoor Scenes
This project implements our method for synthesizing human motion in an indoor scene based on a user-defined trajectory. The furniture configuration of the scene can be freely edited by the user. A Flask application provides the interactive input and visualization.
Features
- Editable Indoor Scene: Users can modify the furniture configuration within the indoor scene.
- Trajectory Drawing: Users can draw a trajectory within the scene.
- Human Motion Synthesis: The application generates human motion based on the drawn trajectory.
Getting Started
Prerequisites
To run the application, you need to have the following installed:
- Python 3.x
- Flask
- Required Python packages (specified in requirements.txt)
Installation
- Clone the repository:
  git clone https://github.com/jnnan/trumans_utils.git
  cd trumans_utils
- Download checkpoints, data, and SMPL-X models:
  - Download the necessary files and folders from this link.
  - Extract trumans_demo.zip and place the four folders at the root of the project directory (./trumans_utils).
- Install Python packages:
  pip install -r requirements.txt
Running the Application
To start the Flask application:
python3 -m flask run --host=0.0.0.0
The application will be available at http://127.0.0.1:5000.
Usage
- Open your web browser and navigate to http://127.0.0.1:5000.
- You will see an interface where you can edit the indoor scene configuration.
- Draw a trajectory within the scene.
- The application will synthesize human motion based on the drawn trajectory and display it within the scene.
Training
Overview
This README provides instructions on setting up and training the TRUMANS model using the provided dataset.
Prerequisites
Before you begin, install the required Python packages:
pip install -r requirements.txt
Dataset Setup
- Download the TRUMANS dataset from the provided link.
- Place the dataset files in the following directory within your project:
./trumans/Data_release
Configuration
Set the ROOT_DIR environment variable to the absolute path of the ./trumans directory on your system. This can be done by adding the following line to your .bashrc or .bash_profile:
export ROOT_DIR='/absolute/path/to/trumans'
Make sure to replace /absolute/path/to/trumans with the actual path to the trumans folder on your machine.
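For reference, here is a minimal sketch of how a script can resolve this variable at startup (the Data_release subdirectory matches the dataset setup above; the error message is illustrative):

```python
import os

# Resolve the dataset root from the environment; fail early with a clear
# message if ROOT_DIR was not exported as described above.
ROOT_DIR = os.environ.get('ROOT_DIR')
if ROOT_DIR is None:
    raise RuntimeError(
        "Set the ROOT_DIR environment variable, e.g. "
        "export ROOT_DIR='/absolute/path/to/trumans'")

data_dir = os.path.join(ROOT_DIR, 'Data_release')
```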
Model Training
Navigate to the trumans directory:
cd trumans
To start training the model, run the training script from the command line:
python train_synhsi.py
The training script will automatically load the dataset from Data_release, set up the model, and begin training using the configurations in the ./trumans/config folder.
Annotation
The dataset includes an action_label.npy file containing frame-wise annotations for the motions. The labels correspond to the type of interaction and are indexed as follows:
| Interaction Type | Label |
|------------------|-------|
| Lie down         | 0     |
| Squat            | 1     |
| Mouse            | 2     |
| Keyboard         | 3     |
| Laptop           | 4     |
| Phone            | 5     |
| Book             | 6     |
| Bottle           | 7     |
| Pen              | 8     |
| Vase             | 9     |
These labels provide supervision during training. To visualize the generated motion for a predefined label, see the demo part and change line 81 of sample_hsi.py.
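As a quick sanity check of the annotation, here is a minimal loading sketch, assuming action_label.npy holds one integer label per MoCap frame (the file path is illustrative):

```python
import numpy as np

# Label names in index order, as listed in the table above.
LABEL_NAMES = ['Lie down', 'Squat', 'Mouse', 'Keyboard', 'Laptop',
               'Phone', 'Book', 'Bottle', 'Pen', 'Vase']

action_label = np.load('action_label.npy')   # assumed: one label per frame
print(action_label.shape)
print(LABEL_NAMES[int(action_label[0])])     # interaction type of frame 0
```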
TRUMANS Dataset
Please download the TRUMANS dataset from Google Drive. The content inside the download link will be continuously updated to ensure you have access to the most recent data.
Explanation of the files and folders of the TRUMANS dataset:
- Action (folder): Natural-language action annotations for each motion segment, indicated by the file names.
- smplx_result (folder): SMPL-X annotations for each motion segment in global coordinates.
- smplx_result_in_cam (folder): SMPL-X annotations for each motion segment in the camera coordinates of the corresponding frames.
- video_render (folder): RGB video renderings of each motion segment at 30 FPS.
- Scene (folder): Occupancy grids for the scenes in the TRUMANS dataset. The coverage ranges from -3 to 3 meters along the x-axis, 0 to 2 meters along the y-axis, and -4 to 4 meters along the z-axis. The voxel grid has shape (300, 100, 400) in x, y, z order, using a y-up coordinate system (a coordinate-conversion sketch follows this list).
- Scene_mesh (folder): This folder includes the OBJ mesh files of the scenes.
- Scene_occ_render (folder): This folder contains the top-down view renders of the scene occupancy grids.
- Object_chairs:
  - Object (folder): Occupancy grids for the sittable objects in the TRUMANS dataset, in the same format as the scenes.
  - Object_mesh (folder): OBJ mesh files of the objects.
- Object_all: All interactable objects.
  - Object_pose: Object rotation (xyz Euler angles) and location (y-up coordinate system) for each motion segment.
  - Object_mesh: OBJ meshes of the objects.
- human_pose.npy: This file contains a (3,792,068 x 63) array, where each row is the 63-dimensional SMPL-X body_pose parameter for one frame of MoCap data; it concatenates all motion segments (see the loading sketch after this list).
- human_orient.npy: This file contains a (3,792,068 x 3) array corresponding to the global_orient parameter of SMPL-X.
- human_transl.npy: This file contains a (3,792,068 x 3) array corresponding to the transl parameter of SMPL-X.
- left_hand_pose.npy: Left hand pose in SMPL-X.
- right_hand_pose.npy: Right hand pose in SMPL-X.
- seg_name.npy: This file contains a (3,792,068,) string array, where each string represents the motion segment name of a particular frame.
- frame_id.npy: This file contains a (3,792,068,) integer array, with each value representing the frame number of the original motion segment to which the current index belongs.
- scene_list.npy: This is a list of scene names corresponding to the files in the "Scene" folder.
- scene_flag.npy: This file contains a (3,792,068,) integer array. Each value indicates the scene of the current frame by indexing into scene_list.npy.
- object_list.npy: This is a list of object names corresponding to the files in the "Object" folder.
- object_flag.npy: This file contains a (3,792,068 x 35) integer array. Each row has 35 values indicating the presence of 35 objects at that frame index. A value of -1 means the corresponding object does not apply, while a value equal to or greater than 1 indicates the presence of the object in that frame. The value links into the object_mat.npy file.
- object_mat.npy: This file contains a (3,792,068 x 5 x 4 x 4) array. The second dimension (of size 5) corresponds to the 5 object types whose presence is indicated in object_flag.npy. The last two dimensions hold each object's rotation and translation as a 4x4 matrix.
- bad_frames.npy: This is a list of frames that contain erroneous MoCap data.
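As referenced in the Scene entry above, here is a minimal sketch of mapping a world-space point to a voxel index, assuming the stated extents and a uniform grid (the function name and scene file name are illustrative):

```python
import numpy as np

# Scene grid extents from the README: x in [-3, 3] m, y in [0, 2] m,
# z in [-4, 4] m; voxel shape (300, 100, 400) in x, y, z order (y-up).
GRID_MIN = np.array([-3.0, 0.0, -4.0])
GRID_MAX = np.array([3.0, 2.0, 4.0])
GRID_SHAPE = np.array([300, 100, 400])

def world_to_voxel(point_xyz):
    """Map a world-space (x, y, z) point to integer voxel indices."""
    rel = (np.asarray(point_xyz, dtype=float) - GRID_MIN) / (GRID_MAX - GRID_MIN)
    idx = np.floor(rel * GRID_SHAPE).astype(int)
    return np.clip(idx, 0, GRID_SHAPE - 1)   # clamp boundary points

occ = np.load('Scene/some_scene.npy')        # illustrative file name
print(occ[tuple(world_to_voxel([0.5, 1.0, -1.2]))])  # occupancy at that point
```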
Note: The data associated with action labels and 2D rendering will be uploaded soon.
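As referenced in the human_pose.npy entry, here is a hedged sketch of loading the per-frame arrays and assembling the SMPL-X parameters plus scene name for a single frame (file paths and the chosen frame index are illustrative; the keyword names follow the SMPL-X parameter names given above):

```python
import numpy as np

# Per-frame arrays described above; all share the same first dimension N.
body_pose  = np.load('human_pose.npy')       # (N, 63) SMPL-X body_pose
orient     = np.load('human_orient.npy')     # (N, 3)  global_orient
transl     = np.load('human_transl.npy')     # (N, 3)  transl
seg_name   = np.load('seg_name.npy')         # (N,)    segment name per frame
frame_id   = np.load('frame_id.npy')         # (N,)    frame number within segment
scene_flag = np.load('scene_flag.npy')       # (N,)    index into scene_list.npy
scene_list = np.load('scene_list.npy')

i = 0  # illustrative frame index
print(seg_name[i], frame_id[i], scene_list[scene_flag[i]])
smplx_params = dict(body_pose=body_pose[i],
                    global_orient=orient[i],
                    transl=transl[i])
```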
Citation
@inproceedings{jiang2024scaling,
  title={Scaling up dynamic human-scene interaction modeling},
  author={Jiang, Nan and Zhang, Zhiyuan and Li, Hongjie and Ma, Xiaoxuan and Wang, Zan and Chen, Yixin and Liu, Tengyu and Zhu, Yixin and Huang, Siyuan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1737--1747},
  year={2024}
}