apple / ml-hugs

Official repository of HUGS: Human Gaussian Splats (CVPR 2024)
https://machinelearning.apple.com/research/hugs

Inquiry About "4d_humans" Folder in NeuMan Dataset #6

Open TCN1214 opened 2 months ago

TCN1214 commented 2 months ago

Hi, thanks for releasing the code. When I downloaded your NeuMan dataset, I noticed an additional folder called "4d_humans," which isn't present in the original NeuMan dataset. Could you let me know where you obtained the files in the "4d_humans" folder, such as smpl_optimized_aligned_scale.npz, smpl_aligned_scale.npz, poses_optimized.npz, and poses.npz?

chuanHN commented 1 month ago

Have you made any progress? I have the same question here. @TCN1214

chuanHN commented 1 month ago

Anyone interested in motion capture can join a WeChat group through me. My WeChat ID is Denny0805789.

ZCWzy commented 1 month ago

Edit (July 9): I found a workable way to generate smpl_optimized_aligned_scale.npz; you need to write code yourself to do the processing. But my training didn't turn out so well.

Using NeuMan's preprocessing pipeline, you should get the files other than smpl_optimized_aligned_scale.npz (densepose, depthmaps, images, keypoints, monodepth, segments, smpl_pred, sparse, etc.). For users in China, this Docker setup is particularly hard to get working; refer to NeuMan's issues and be mentally prepared.

In the preprocessing flow, change the solve_transformation function in export_alignment.py to:

def solve_transformation(verts, j3d, j2d, plane_model, colmap_cap, smpl_cap):
    np.set_printoptions(suppress=True)
    # Solve for the translation that projects the SMPL joints onto the 2D keypoints.
    mvp = np.matmul(smpl_cap.intrinsic_matrix, smpl_cap.extrinsic_matrix)
    trans = solve_translation(j3d, j2d, mvp)
    smpl_cap.cam_pose.camera_center_in_world -= trans[0]
    # Move the joints into the COLMAP world frame, then solve for the scale.
    joints_world = (ray_utils.to_homogeneous(
        j3d) @ smpl_cap.cam_pose.world_to_camera.T @ colmap_cap.cam_pose.camera_to_world.T)[:, :3]
    scale = solve_scale(joints_world, colmap_cap, plane_model)

    # Full similarity transform: SMPL camera frame -> scaled COLMAP world frame.
    transf = smpl_cap.cam_pose.world_to_camera.T * scale
    transf[3, 3] = 1
    transf = transf @ colmap_cap.cam_pose.camera_to_world_3x4.T

    # Rotation-only part, returned separately so global_orient can be corrected later.
    rotation = smpl_cap.cam_pose.world_to_camera.T
    rotation[3, 3] = 1
    rotation = rotation @ colmap_cap.cam_pose.camera_to_world.T
    rotation = rotation[:3, :3]

    verts_world = ray_utils.to_homogeneous(verts) @ transf
    return transf, verts_world, rotation

Also remember to save the result of scale = solve_scale(joints_world, colmap_cap, plane_model).
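For instance, a minimal way to collect and persist the per-frame scales (the helper and file names are my own, not from NeuMan):

import numpy as np

# Collect the per-frame scale returned by solve_scale() inside
# export_alignment.py and dump the list next to alignments.npy.
collected_scales = []

def record_scale(scale):
    collected_scales.append(float(scale))

def save_scales(path="scales.npy"):
    np.save(path, np.asarray(collected_scales, dtype=np.float32))

Call record_scale(scale) right after the solve_scale call and save_scales() once the loop over frames finishes.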

Human pose prediction uses PHALP, as mentioned by 4D-Humans. You can use the Colab provided by 4D-Humans: !python track.py video.source=(your images path). Note that the input is JPG images, taken from the images folder obtained in the previous step (you need to convert PNG to JPG). As described in the PHALP repository, this step produces xxx.pkl. From the pkl you get bbox, body_pose, betas, and global_orient (all of which need further processing), as sketched below.
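For orientation, unpacking the tracking pkl might look roughly like this. This is only a sketch: the exact key names ('smpl', 'bbox', ...) and nesting depend on your PHALP version, so inspect the loaded dict before relying on them.

import joblib
import numpy as np

results = joblib.load("demo_images.pkl")  # output of track.py; file name assumed

bboxes, body_poses, betas_list, global_orients = [], [], [], []
for frame_name in sorted(results.keys()):
    frame = results[frame_name]
    # Each frame can hold several tracked people; this takes the first track.
    smpl = frame["smpl"][0]                                  # assumed layout
    bboxes.append(np.asarray(frame["bbox"][0]))
    body_poses.append(np.asarray(smpl["body_pose"]))         # (23, 3, 3) rotation matrices
    betas_list.append(np.asarray(smpl["betas"]))             # (10,)
    global_orients.append(np.asarray(smpl["global_orient"]))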

Take the last row of alignments.npy as transl. Take the scale saved above as scale. The bbox needs simple processing. body_pose needs to be converted from rotation matrices to axis-angle. For betas, I simply took the average. For global_orient, use the rotation saved above (here ROMP's predictions are actually used for the alignment):

import joblib
import numpy as np
import torch
# angle_axis_to_quaternion / quaternion_to_rotation_matrix /
# rotation_matrix_to_angle_axis are the usual rotation converters
# (e.g. from kornia or torchgeometry).

pose = []
file = joblib.load("smpl_output_romp.pkl")
file = file[1]['pose']
for it in file:
    go = [it[0], it[1], it[2]]  # first three pose values = global orientation
    pose.append(go)

global_orient = []
for go, rotation in zip(pose, rotations):
    go = angle_axis_to_quaternion(torch.tensor(go))  # axis-angle -> quaternion
    go = quaternion_to_rotation_matrix(go)  # quaternion -> rotation matrix
    go = np.array(go)

    tmp = rotation.T @ go  # undo the alignment rotation
    tmp = torch.tensor(tmp)
    tmp = rotation_matrix_to_angle_axis(tmp)
    tmp = np.array(tmp)

    global_orient.append(tmp)

Then you will get smpl_optimized_aligned_scale.npz.
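A minimal sketch of the final save step. The key names here are my guess from this thread; print the keys of a reference smpl_optimized_aligned_scale.npz from the released dataset to confirm them.

import numpy as np

def save_smpl_npz(path, global_orient, body_pose, betas, transl, scale):
    # All parameters are per-frame arrays over N frames.
    np.savez(path,
             global_orient=np.asarray(global_orient),  # (N, 3) axis-angle
             body_pose=np.asarray(body_pose),          # (N, 69) axis-angle
             betas=np.asarray(betas),                  # (N, 10)
             transl=np.asarray(transl),                # (N, 3)
             scale=np.asarray(scale))                  # (N,)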

yashgarg98 commented 3 weeks ago

@ZCWzy were you able to resolve this issue and train humans on your own data?

ZCWzy commented 3 weeks ago

Were you able to resolve this issue and train humans on your own data?

@yashgarg98 No, there is still a problem with the global_orient parameter.

TCN1214 commented 3 weeks ago

Hi @ZCWzy, you mentioned that the body_pose in smpl_optimized_aligned_scale.npz comes from the HMR results (4D-Humans), right? But when I print them out, they appear in different formats. For example, for one frame:

The body_pose in 4D-Humans is a rotation matrix (23×3×3): [[[[ 9.9234e-01, 8.2207e-02, -9.2175e-02], [-7.3918e-02, 9.9319e-01, 9.0004e-02], [ 9.8947e-02, -8.2502e-02, 9.9167e-01]], ... [[ 9.9955e-01, 9.1900e-03, 2.8637e-02], [-1.2969e-02, 9.9080e-01, 1.3470e-01], [-2.7136e-02, -1.3501e-01, 9.9047e-01]]]]

The body_pose in smpl_optimized_aligned_scale.npz : [[-1.47791445e-01 4.16232273e-02 5.25643043e-02 -1.22563608e-01 2.61688922e-02 6.45624648e-04 2.20337659e-01 2.87827626e-02 2.13257540e-02 8.55209112e-01 6.79941997e-02 -4.13512923e-02 8.58351827e-01 1.72289640e-01 5.71232103e-02 6.53865710e-02 -3.40895471e-03 1.18748546e-02 -4.41254586e-01 3.55301857e-01 7.91222006e-02 -3.93874198e-01 -1.79850206e-01 6.99728727e-02 8.66328552e-02 1.39882900e-02 -1.46647738e-02 -1.45833284e-01 1.92387864e-01 2.00581640e-01 -1.08115859e-01 1.29345153e-02 -1.83795810e-01 5.12605272e-02 5.69591857e-03 1.39512479e-01 3.74578275e-02 -4.10997570e-02 -2.82684803e-01 -1.65722191e-01 2.81276673e-01 7.73936436e-02 2.02245563e-01 -2.85247006e-02 -1.13176286e-01 2.31526673e-01 -9.19183530e-03 -1.00814617e+00 -9.39681306e-02 6.85916603e-01 5.57925045e-01 3.69742483e-01 -6.59028530e-01 -3.24325003e-02 -1.59564856e-02 1.36007953e+00 -4.22677845e-01 -1.38628498e-01 -2.11106837e-02 4.08339612e-02 5.21880051e-04 8.00568163e-02 -2.06193715e-01 -3.05936605e-01 -3.30521166e-02 -1.99945956e-01 -1.81736186e-01 1.07510857e-01 1.99526295e-01]]

So do you use any method to convert between them?

ZCWzy commented 3 weeks ago

@TCN1214 The body_pose in 4D-Humans is a rotation-matrix representation, while the body_pose in smpl_optimized_aligned_scale.npz is an axis-angle representation (axis direction with angle magnitude). A similar question was asked in the 4D-Humans issues: conversion from rotation matrix to axis-angle. The converted body_pose may look "inaccurate", but I checked it with a visualization tool and I think it is correct.
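For example, a minimal conversion using SciPy (kornia and roma ship equivalent converters):

import numpy as np
from scipy.spatial.transform import Rotation

def rotmats_to_axis_angle(body_pose_mats):
    # (23, 3, 3) rotation matrices -> (69,) flat axis-angle vector.
    rotvecs = Rotation.from_matrix(np.asarray(body_pose_mats)).as_rotvec()
    return rotvecs.reshape(-1).astype(np.float32)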

TCN1214 commented 3 weeks ago

@ZCWzy, got it, thanks! And may I know whether you also get the global_orient and betas from 4D-Humans? I noticed that in smpl_optimized_aligned_scale.npz the betas are the same for all frames. Do you have any idea why?

ZCWzy commented 2 weeks ago

@ZCWzy, got it, thanks! And may I know whether you also get the global_orient and betas from 4D-Humans? I noticed that in smpl_optimized_aligned_scale.npz the betas are the same for all frames. Do you have any idea why?

4D-Humans demonstrates tracking with the help of PHALP. The bbox, global_orient, body_pose, and betas can be obtained directly using PHALP. But there's a problem with global_orient in this case; it doesn't work directly.

betas describe the shape of the human body, and the authors may have assumed that the subject's body shape would not change.

TCN1214 commented 2 weeks ago

4D-Humans demonstrates tracking with the help of PHALP. The bbox, global_orient, body_pose, and betas can be obtained directly using PHALP. But there's a problem with global_orient in this case; it doesn't work directly.

betas describe the shape of the human body, and the authors may have assumed that the subject's body shape would not change.

The betas produced from the 4D-Humans data are different for each frame. May I know how I can get the same betas for all frames?

ZCWzy commented 2 weeks ago

4D-Humans demonstrates tracking with the help of PHALP. The bbox, global_orient, body_pose, and betas can be obtained directly using PHALP. But there's a problem with global_orient in this case; it doesn't work directly. betas describe the shape of the human body, and the authors may have assumed that the subject's body shape would not change.

The betas produced from the 4D-Humans data are different for each frame. May I know how I can get the same betas for all frames?

I simply averaged them.
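In code, the heuristic amounts to something like this (a sketch, not the authors' per-sequence shape optimization):

import numpy as np

def average_betas(betas):
    # betas: (N, 10) per-frame shape parameters from PHALP.
    mean = np.asarray(betas).mean(axis=0)
    return np.tile(mean, (len(betas), 1))  # (N, 10), identical rows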

TCN1214 commented 2 weeks ago

I simply averaged them.

Alright, I got it. Thanks for your answer!

TCN1214 commented 2 weeks ago

Hi @ZCWzy, you mentioned that scale and translation come from alignments.npy, right? But when I compare them, the translation values in smpl_optimized_aligned_scale.npz differ from those in alignments.npy. The translation I extracted from smpl_optimized_aligned_scale.npz is quite large, [[-10.122043 1.2542624 7.5895085 ]], whereas the translation I obtained from alignments.npy is much smaller, [[-0.0011, 0.0050, -0.0378]].

The scale I obtained from alignments.npy is also different from smpl_optimized_aligned_scale.npz. For example, the scale I directly extracted from smpl_optimized_aligned_scale.npz (parkinglot from the NeuMan dataset): [3.965441 3.9393816 3.935972 3.932178 3.9514518 4.013123 3.6961203 3.6949155 3.6817148 3.69526 3.67739 3.6596239 3.602954 3.596763 3.601293 3.538034 3.6467068 3.653771 3.6413672 3.6631591 3.6483827 3.6039793 3.7009943 3.6978316 3.6661446 3.7132351 3.7439525 3.7223043 3.7446861 3.7167091 3.718121 3.7254384 3.7595584 3.698107 3.6785161 3.7461946 3.6378756 3.7673268 3.9743676 3.9388347 3.9378428 3.840855 ]

The scale I obtained from alignments.npy (parkinglot from the NeuMan dataset): [3.53917746 3.47164602 3.43697076 3.45280097 3.46665239 3.54570767 3.5541983 3.49975832 3.46148575 3.50300569 3.53204423 3.53689076 3.61704864 3.57576852 3.55589865 3.5947892 3.70504546 3.57626152 3.58610565 3.57432691 3.59437816 3.56360463 3.63913091 3.60963319 3.60541642 3.6163094 3.55629385 3.60355709 3.62477594 3.63914295 3.58286411 3.63471325 3.62164655 3.63289107 3.59772363 3.6078752 3.567438 3.61177402 3.72219769 3.58786174 3.69255453 3.65207427]

So could you tell me more about how you obtain the scales and translation? Do you just run the alignment export script, or do you add some modifications?

ZCWzy commented 2 weeks ago

@TCN1214 [screenshots] I think transl is obtained from here. ~~I verified it using Blender (transforming the original vertices with the scale and transl I calculated leaves them only a rotation away from the vertices in the world coordinate system).~~ This transl is not quite right; I should separate rotation, translation, and scale from that transf matrix. [screenshot]

There is a more accurate way to obtain the scale (from neuman/export_alignment.py). Numerical differences are quite normal; every time you run export_alignment.py the scale comes out slightly different. [screenshot]

TCN1214 commented 2 weeks ago

Alright, thank you for your help!

TCN1214 commented 2 weeks ago

I think transl is obtained from here. ~~I verified it using Blender (transforming the original vertices with the scale and transl I calculated leaves them only a rotation away from the vertices in the world coordinate system).~~ This transl is not quite right; I should separate rotation, translation, and scale from that transf matrix.

@ZCWzy May I know how you separate them?

ZCWzy commented 2 weeks ago

I think transl is obtained from here. ~~I verified it using Blender (transforming the original vertices with the scale and transl I calculated leaves them only a rotation away from the vertices in the world coordinate system).~~ This transl is not quite right; I should separate rotation, translation, and scale from that transf matrix.

@ZCWzy May I know how you separate them?

I'm not sure, because I don't know multi-view geometry. You should refer to neuman/export_alignment.py, as well as the relevant papers, to be sure.

TCN1214 commented 2 weeks ago

I'm not sure, because I don't know multi-view geometry. You should refer to neuman/export_alignment.py, as well as the relevant papers, to be sure.

Alright, I'll look into them. Thank you for your answer!

TCN1214 commented 2 weeks ago

for vert, vertInWorld, scale, translation, go in zip(verts, vertsInWorld, scales, translations, pose):
    vertWithoutRotation = vert * scale + translation

    # Register the scaled/translated verts onto the world-frame verts to
    # recover the remaining rotation.
    rotation = open3d_registration(vertWithoutRotation, vertInWorld)
    rotation = rotation[:3, :3]

    go = angle_axis_to_quaternion(torch.tensor(go))  # axis-angle -> quaternion
    go = quaternion_to_rotation_matrix(go)  # quaternion -> rotation matrix
    go = np.array(go)
    tmp = rotation @ go
    tmp = torch.tensor(tmp)
    tmp = rotation_matrix_to_angle_axis(tmp)
    tmp = np.array(tmp)

    global_orient.append(tmp)

@ZCWzy may I know the structure of your open3d_registration function, and what it is used for?

ZCWzy commented 2 weeks ago

for vert, vertInWorld, scale, translation, go in zip(verts, vertsInWorld, scales, translations, pose):
    vertWithoutRotation = vert * scale + translation

    # Register the scaled/translated verts onto the world-frame verts to
    # recover the remaining rotation.
    rotation = open3d_registration(vertWithoutRotation, vertInWorld)
    rotation = rotation[:3, :3]

    go = angle_axis_to_quaternion(torch.tensor(go))  # axis-angle -> quaternion
    go = quaternion_to_rotation_matrix(go)  # quaternion -> rotation matrix
    go = np.array(go)
    tmp = rotation @ go
    tmp = torch.tensor(tmp)
    tmp = rotation_matrix_to_angle_axis(tmp)
    tmp = np.array(tmp)

    global_orient.append(tmp)

@ZCWzy may I know the structure of your open3d_registration function, and what it is used for?

open3d_registration is based on the open-source library Open3D and is used for point-cloud registration. Here I want to move the original verts (a point cloud of 6,890 points) onto verts_world by scaling and translating, so that the two point clouds differ only by a rotation; in this way global_orient is corrected to the right position. But there seems to be a problem with this line of thinking. ~~My translations are not accurate enough.~~ I entered it wrong; the transl is quite accurate. Why is the rotation wrong?
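For reference, a minimal point-to-point ICP helper built on Open3D might look like this (a sketch; the correspondence threshold and identity initialization are arbitrary choices):

import numpy as np
import open3d as o3d

def open3d_registration(src_pts, dst_pts, threshold=0.1):
    # Estimate the 4x4 transform that aligns src_pts onto dst_pts with ICP.
    src = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(src_pts))
    dst = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(dst_pts))
    result = o3d.pipelines.registration.registration_icp(
        src, dst, threshold, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation

Since the two vertex sets here are in one-to-one correspondence (the same 6,890 SMPL vertices), a closed-form Kabsch/Procrustes fit would also work and avoids ICP's sensitivity to initialization.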

TCN1214 commented 1 week ago

open3d_registration is based on the open-source library Open3D and is used for point-cloud registration. Here I want to move the original verts (a point cloud of 6,890 points) onto verts_world by scaling and translating, so that the two point clouds differ only by a rotation; in this way global_orient is corrected to the right position. But there seems to be a problem with this line of thinking. ~~My translations are not accurate enough.~~ I entered it wrong; the transl is quite accurate. Why is the rotation wrong?

@ZCWzy Sorry, I'm not sure yet. I'm now trying your method to generate the global_orient. May I know where I can find the code for the open3d_registration function?

ZCWzy commented 1 week ago

@TCN1214 Do not use that approach to create global_orient; it may not be accurate.

use this:

def solve_transformation(verts, j3d, j2d, plane_model, colmap_cap, smpl_cap):
    np.set_printoptions(suppress=True)
    # Solve for the translation that projects the SMPL joints onto the 2D keypoints.
    mvp = np.matmul(smpl_cap.intrinsic_matrix, smpl_cap.extrinsic_matrix)
    trans = solve_translation(j3d, j2d, mvp)
    smpl_cap.cam_pose.camera_center_in_world -= trans[0]
    # Move the joints into the COLMAP world frame, then solve for the scale.
    joints_world = (ray_utils.to_homogeneous(
        j3d) @ smpl_cap.cam_pose.world_to_camera.T @ colmap_cap.cam_pose.camera_to_world.T)[:, :3]
    scale = solve_scale(joints_world, colmap_cap, plane_model)

    # Full similarity transform: SMPL camera frame -> scaled COLMAP world frame.
    transf = smpl_cap.cam_pose.world_to_camera.T * scale
    transf[3, 3] = 1
    transf = transf @ colmap_cap.cam_pose.camera_to_world_3x4.T

    # Rotation-only part, returned separately so global_orient can be corrected later.
    rotation = smpl_cap.cam_pose.world_to_camera.T
    rotation[3, 3] = 1
    rotation = rotation @ colmap_cap.cam_pose.camera_to_world.T
    rotation = rotation[:3, :3]

    verts_world = ray_utils.to_homogeneous(verts) @ transf
    return transf, verts_world, rotation

then

pose = []
file = joblib.load("smpl_output_romp.pkl")
file = file[1]['pose']
for it in file:
    go = [it[0], it[1], it[2]]
    pose.append(go)

global_orient = []
for go, rotation in zip(pose, rotations):
    go = angle_axis_to_quaternion(torch.tensor(go))  # axis-angle -> quaternion
    go = quaternion_to_rotation_matrix(go)  # quaternion -> rotation matrix
    go = np.array(go)

    tmp = rotation.T @ go  # undo the alignment rotation
    tmp = torch.tensor(tmp)
    tmp = rotation_matrix_to_angle_axis(tmp)
    tmp = np.array(tmp)

    global_orient.append(tmp)

That seems about right. [screenshot] But it makes the assertion assert x.max() <= 1 + EPS and x.min() >= -EPS, f"x must be in [0, 1], got {x.min()} and {x.max()}" (from /hugs/models/models/triplane.py) fail. Why?

TCN1214 commented 1 week ago

@ZCWzy You mean you are getting an error, right? Can I have a look at it?

TCN1214 commented 1 week ago

But it makes the assertion assert x.max() <= 1 + EPS and x.min() >= -EPS, f"x must be in [0, 1], got {x.min()} and {x.max()}" (from /hugs/models/models/triplane.py) fail. Why?

@ZCWzy I have encountered this error before. The reason I didn't satisfy the condition assert x.max() <= 1 + EPS and x.min() >= -EPS was that I used the wrong scale and translation. When I used the translation and scale values from alignments.npy, the problem was solved. However, I don't think you're getting this error for the same reason, because you used the translation and scale values from alignments.npy, right?
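As a rough diagnostic (my own sketch, assuming the triplane normalizes query points into a unit cube defined by some axis-aligned bounding box), you can check where the aligned vertices land before training:

import numpy as np

def check_unit_cube(verts_world, aabb_min, aabb_max, eps=1e-3):
    # Wrong scale/transl pushes vertices outside the assumed bounding box,
    # so the normalized coordinates leave [0, 1] and the assert fires.
    x = (verts_world - aabb_min) / (aabb_max - aabb_min)
    print("normalized min/max:", x.min(), x.max())
    return x.min() >= -eps and x.max() <= 1 + eps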

ZCWzy commented 1 week ago

But it makes the assertion assert x.max() <= 1 + EPS and x.min() >= -EPS, f"x must be in [0, 1], got {x.min()} and {x.max()}" (from /hugs/models/models/triplane.py) fail. Why?

@ZCWzy I have encountered this error before. The reason I didn't satisfy the condition assert x.max() <= 1 + EPS and x.min() >= -EPS was that I used the wrong scale and translation. When I used the translation and scale values from alignments.npy, the problem was solved. However, I don't think you're getting this error for the same reason, because you used the translation and scale values from alignments.npy, right?

I removed this assert and then trained. It looks like there is no problem.

TCN1214 commented 1 week ago

You can use the Colab provided by 4D-Humans: !python track.py video.source=(your images path)

@ZCWzy When I run the command !python track.py video.source=(your images path) in the Colab provided by 4D-Humans, I get this log message:

[12/03 09:12:56] INFO No OpenGL_accelerate module loaded: No module

After I installed OpenGL_accelerate, I get this message instead:

[07/09 10:29:53] INFO OpenGL_accelerate module loaded

Why is this?

ZCWzy commented 1 week ago

You can use the Colab provided by 4D-Humans: !python track.py video.source=(your images path)

@ZCWzy When I run the command !python track.py video.source=(your images path) in the Colab provided by 4D-Humans, I get this log message:

[12/03 09:12:56] INFO No OpenGL_accelerate module loaded: No module

After I installed OpenGL_accelerate, I get this message instead:

[07/09 10:29:53] INFO OpenGL_accelerate module loaded

Why is this?

Put basicModel_neutral_lbs_10_207_0_v1.0.0.pkl in /4D-humans/data and /4D-humans, as described in this issue.

TCN1214 commented 1 week ago

Put basicModel_neutral_lbs_10_207_0_v1.0.0.pkl in /4D-humans/data and /4D-humans, as described in this issue.

@ZCWzy Yeah, I already put the model there, but it still shows this message (INFO No OpenGL_accelerate module loaded: No module), so I think the problem is not the SMPL model. Were you able to run the Colab provided by 4D-Humans?

ZCWzy commented 1 week ago

Put basicModel_neutral_lbs_10_207_0_v1.0.0.pkl in /4D-humans/data and /4D-humans, as described in this issue.

@ZCWzy Yeah, I already put the model there, but it still shows this message (INFO No OpenGL_accelerate module loaded: No module), so I think the problem is not the SMPL model. Were you able to run the Colab provided by 4D-Humans?

https://colab.research.google.com/drive/11XZBoaMfa28y874r4B7VSVokNFlXiqqz?usp=sharing After running track.py, I see "INFO OpenGL_accelerate module loaded" and then need to wait about 6-10 minutes to get the pkl. You can tell whether it's running by looking at the Colab resource usage.

yashgarg98 commented 4 days ago

Hey @ZCWzy, after the alignment, were you able to fix the global coordinate issue? Also, were you able to generate the human Gaussian splat using HUGS?

Daydreamer-f commented 4 days ago

Hi @ZCWzy, I am currently encountering some difficulties while preprocessing my dataset, specifically with the 4dhuman folder. I noticed that you successfully processed your data. Could you please share your preprocessing code? I would also appreciate the opportunity to discuss further with you privately, as I believe we might have similar research interests. Here is my email (ysfang0306@gmail.com), feel free to reach out! Thank you very much!

ZCWzy commented 3 days ago

Hey @ZCWzy, after the alignment, were you able to fix the global coordinate issue? Also, were you able to generate the human Gaussian splat using HUGS?

Yes, but my reconstruction didn't turn out so well.

ZCWzy commented 3 days ago

Hi @ZCWzy, I am currently encountering some difficulties while preprocessing my dataset, specifically with the 4dhuman folder. I noticed that you successfully processed your data. Could you please share your preprocessing code? I would also appreciate the opportunity to discuss further with you privately, as I believe we might have similar research interests. Here is my email (ysfang0306@gmail.com), feel free to reach out! Thank you very much!

Hello to this PKU expert ✌. Please refer to my earlier replies to generate the custom data. bbox, body_pose, and betas are all easy to generate with 4D-Humans; global_orient, scale, and transl require slightly modifying NeuMan's preprocessing code. The data preprocessing and conversion are not hard to write. In fact, I wanted to write my own preprocessing pipeline that uses 4D-Humans in place of ROMP, but I could not project the j2d produced by 4D-Humans or PHALP onto the input images, so I could not accurately produce the last three parameters.