Sirui-Xu / InterDiff

[ICCV 2023] Official PyTorch implementation of the paper "InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion"
https://sirui-xu.github.io/InterDiff
MIT License

How to get the extrinsics of the camera of BEHAVE dataset? #5

Closed Steve-Tod closed 11 months ago

Steve-Tod commented 11 months ago

Hi, thanks for making this great work public!

I'm checking the BEHAVE dataset and found that its world coordinate frame is the frame of camera 1, so the human pose is tilted. In your visualization, I can see that the floor is properly positioned and the human poses are upright. Could you let me know how you transform the poses of the BEHAVE dataset?

In your code here, it seems you use the pose of the first frame to compute the global orientation. But how do you make sure that pose is upright?

Any help would be appreciated. Thanks in advance!

Sirui-Xu commented 11 months ago

You're welcome! Great questions.

You may notice that in our data processing, we did not use the camera parameters they provided, but directly used the SMPL parameters (the version at 30 fps) they processed. The SMPL parameters they provide do not have the pose tilt problem, as I recall.

For our processing in data_smpl.py, we canonicalize the global rotation around the z-axis so that the first frame of the human body fed to the network always faces the same horizontal direction. However, we have since found that this is invalid for a very small number of subsequences, such as lying flat on a table, where the horizontal orientation cannot be normalized.
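To illustrate the idea, here is a minimal NumPy sketch of heading canonicalization about the z-axis. It is not the code in data_smpl.py: the function names, the choice of +x as the target heading, and the default `forward_local` axis are all assumptions made for this example. It also shows why a body whose facing direction is vertical (for example, lying flat) cannot be normalized this way.

```python
import numpy as np


def yaw_about_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def canonicalize_heading(points, root_rot0, forward_local=(0.0, 0.0, 1.0), eps=1e-6):
    """Rotate a sequence about z so the first frame faces +x horizontally.

    points:        (T, N, 3) world-space joints or vertices, z assumed up.
    root_rot0:     (3, 3) global root rotation of the first frame.
    forward_local: the body's facing axis in its own frame (an assumption;
                   adjust to whatever convention your body model uses).
    Returns the rotated points and the 3x3 correction matrix.
    """
    points = np.asarray(points, dtype=float)
    forward = np.asarray(root_rot0, dtype=float) @ np.asarray(forward_local, dtype=float)
    horizontal = forward[:2]  # projection of the facing direction onto the ground plane
    if np.linalg.norm(horizontal) < eps:
        # Facing straight up or down: the horizontal heading is undefined.
        raise ValueError("heading undefined: facing direction is (nearly) vertical")
    yaw = np.arctan2(horizontal[1], horizontal[0])
    fix = yaw_about_z(-yaw)  # undo the first frame's yaw
    # Row-vector form of fix @ p, broadcast over all frames and points.
    return points @ fix.T, fix
```

Because the correction is a pure rotation about z, heights are unchanged; only the horizontal heading is normalized. The same matrix would also need to be applied to the object poses so that the interaction stays consistent.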

As for the floor, in our visualization we use the lowest height among the vertices of the human and objects in each subsequence as the ground height. This is a compromise, given that the dataset doesn't have ground information and that the mocap data is noisy.
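The ground-height heuristic can be sketched as follows. This is an illustrative NumPy version, not the repository's visualization code: the function name and the array layout are assumptions, and z is taken as the up axis.

```python
import numpy as np


def estimate_ground_height(human_verts, object_verts, up_axis=2):
    """Return the lowest height reached by any human or object vertex.

    human_verts:  (T, Nh, 3) human mesh vertices over a subsequence.
    object_verts: (T, No, 3) object mesh vertices over the same subsequence.
    up_axis:      index of the vertical axis (2 assumes z is up).
    """
    lowest_human = np.asarray(human_verts, dtype=float)[..., up_axis].min()
    lowest_object = np.asarray(object_verts, dtype=float)[..., up_axis].min()
    return float(min(lowest_human, lowest_object))
```

Since a single noisy vertex can pull the minimum down, the floor estimated this way may sit slightly below the true ground in some subsequences.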

Hope this answers your questions!

Steve-Tod commented 11 months ago

Thank you for the quick response! With your code, I found the pose is indeed upright. I'm trying to align BEHAVE with AMASS, and the joint positions from the SMPL model of AMASS make it tilted. I'll dig more into this.