Jeff-sjtu / HybrIK

Official code of "HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation", CVPR 2021
MIT License

Problem occurs when running the evaluation process on 3DPW #197

Closed: avegetablechicken closed this issue 1 year ago

avegetablechicken commented 1 year ago

I ran the evaluation as instructed, with ./scripts/validate_smpl_cam.sh ./configs/256x192_adam_lr1e-3-hrw48_cam_2x_w_pw3d_3dhp.yaml ./pretrained_hrnet.pth. As "pretrained_hrnet.pth" I used the pretrained model "hybrik_hrnet48_w3dpw.pth", the same one I used for the demo, which ran successfully. The evaluation script works on Human3.6M but fails on 3DPW. Here is the log:

Namespace(cfg='./configs/256x192_adam_lr1e-3-hrw48_cam_2x_w_pw3d_3dhp.yaml', checkpoint='./pretrained_models/hybrik_hrnet.pth', gpus='0', batch=32, flip_test=True, flip_shift=False, rank=0, dist_url='tcp://127.0.1.1:23457', dist_backend='nccl', launcher='pytorch', world_size=1)
tcp://127.0.1.1:23457, ws:1, rank:0
Loading model from ./pretrained_models/hybrik_hrnet.pth...
##### Testing on 3DPW #####
  0%|          | 0/1110 [00:46<?, ?it/s]
Traceback (most recent call last):
  File "/home/xuyiwen/HybrIK/./scripts/validate_smpl_cam.py", line 229, in <module>
    main()
  File "/home/xuyiwen/HybrIK/./scripts/validate_smpl_cam.py", line 168, in main
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(opt, cfg))
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/xuyiwen/HybrIK/scripts/validate_smpl_cam.py", line 216, in main_worker
    gt_tot_err = validate_gt(m, opt, cfg, gt_val_dataset_3dpw, heatmap_to_coord, opt.batch, test_vertice=True)
  File "/home/xuyiwen/HybrIK/scripts/validate_smpl_cam.py", line 94, in validate_gt
    gt_output = m.module.forward_gt_theta(gt_thetas, gt_betas)
  File "/home/xuyiwen/HybrIK/hybrik/models/HRNetWithCam.py", line 475, in forward_gt_theta
    output = self.smpl(
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xuyiwen/HybrIK/hybrik/models/layers/smpl/SMPL.py", line 202, in forward
    vertices, joints, rot_mats, joints_from_verts_h36m = lbs(betas, full_pose, self.v_template,
  File "/home/xuyiwen/HybrIK/hybrik/models/layers/smpl/lbs.py", line 258, in lbs
    pose_offsets = torch.matmul(pose_feature, posedirs) \
RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x639 and 207x20670)

@Jeff-sjtu @biansy000 Could you please help me out?

biansy000 commented 1 year ago

It is quite strange; I have never encountered this problem. I think it is caused by an incorrect dimension of gt_thetas at https://github.com/Jeff-sjtu/HybrIK/blob/9b8681dcf3c902dd5dacc01520ba04982990e1e2/scripts/validate_smpl_cam.py#L94

Can you print the dimensions of gt_thetas when validating on Human3.6M and on 3DPW? As an easy alternative, you can set test_vertice=False to disable the calculation of PVE.

avegetablechicken commented 1 year ago

Hey, thanks for your reply. The shape is (32, 216). Evaluation works when the calculation of PVE is disabled.
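
The reported shape is consistent with the error in the traceback. The following is a minimal sketch of the arithmetic, assuming (as in the standard SMPL formulation) that the pose input is read as 3-value axis-angle vectors, that the root joint is excluded, and that each remaining joint contributes a flattened 3x3 matrix to pose_feature. The function name `pose_feature_width` is hypothetical and is not part of HybrIK.

```python
def pose_feature_width(theta_dim, values_per_joint=3):
    """Width of pose_feature when theta_dim values are split into joints.

    Hypothetical helper for illustration only; not part of HybrIK.
    """
    num_joints = theta_dim // values_per_joint
    # The root joint is excluded; every other joint contributes a 3x3 matrix.
    return (num_joints - 1) * 9


# posedirs has 23 * 9 = 207 rows, matching "207x20670" in the traceback.
POSEDIRS_ROWS = 23 * 9

print(pose_feature_width(72))                        # 207: 24 axis-angle joints
print(pose_feature_width(216))                       # 639: 216 values misread as 72 joints
print(pose_feature_width(216, values_per_joint=9))   # 207: 24 rotation matrices
```

Under these assumptions, a 216-value input read as axis-angle yields 71 * 9 = 639 columns, which is the "32x639" operand in the error message, whereas reading the same 216 values as 24 flattened rotation matrices yields the expected 207.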

biansy000 commented 1 year ago

You could try reshaping gt_thetas to (32, 24, 9).

avegetablechicken commented 1 year ago

It makes no difference. In "lbs.py", the pose is ultimately reshaped by rot_mats = batch_rodrigues(pose.view(-1, 3), dtype=dtype).view([batch_size, -1, 3, 3]), regardless of the shape of the input.

avegetablechicken commented 1 year ago

Inspired by your advice, I assumed that gt_thetas is already in rotation-matrix form. I commented out the conversion code and simply viewed gt_thetas as rotation matrices with rot_mats = pose.view(batch_size, -1, 3, 3). This works and gives plausible evaluation results on 3DPW, close to the numbers in the paper.
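
As an illustration of the workaround described above, here is a minimal sketch that chooses the conversion based on the last dimension of the pose instead of commenting code out. It uses NumPy as a stand-in for the PyTorch code in "lbs.py"; the names `rodrigues`, `to_rot_mats`, and `pose_feature` are hypothetical, and the 24-joint layout is an assumption based on the shapes reported in this thread.

```python
import numpy as np


def rodrigues(axis_angle):
    """Convert (N, 3) axis-angle vectors to (N, 3, 3) rotation matrices."""
    angle = np.linalg.norm(axis_angle, axis=1, keepdims=True)   # (N, 1)
    axis = axis_angle / np.maximum(angle, 1e-8)                 # unit axes
    x, y, z = axis[:, 0], axis[:, 1], axis[:, 2]
    zeros = np.zeros_like(x)
    # Skew-symmetric cross-product matrix K for each axis.
    K = np.stack([zeros, -z, y, z, zeros, -x, -y, x, zeros], axis=1)
    K = K.reshape(-1, 3, 3)
    sin = np.sin(angle)[:, :, None]
    cos = np.cos(angle)[:, :, None]
    return np.eye(3) + sin * K + (1.0 - cos) * (K @ K)


def to_rot_mats(pose, num_joints=24):
    """Return (B, num_joints, 3, 3) rotation matrices from a flat pose."""
    batch_size, dim = pose.shape
    if dim == num_joints * 9:
        # Already rotation matrices: only a reshape is needed.
        return pose.reshape(batch_size, num_joints, 3, 3)
    if dim == num_joints * 3:
        # Axis-angle: convert each 3-vector with the Rodrigues formula.
        mats = rodrigues(pose.reshape(-1, 3))
        return mats.reshape(batch_size, num_joints, 3, 3)
    raise ValueError(f"unexpected pose dimension: {dim}")


def pose_feature(rot_mats):
    """Flatten (R - I) for all non-root joints, as in the SMPL pose blend."""
    batch_size = rot_mats.shape[0]
    return (rot_mats[:, 1:] - np.eye(3)).reshape(batch_size, -1)


# A batch of 32 identity rotations stored as flattened 3x3 matrices.
gt_thetas = np.tile(np.eye(3).reshape(-1), (32, 24))
print(gt_thetas.shape)                              # (32, 216)
print(pose_feature(to_rot_mats(gt_thetas)).shape)   # (32, 207)
```

With this dispatch, both a (32, 72) axis-angle input and a (32, 216) rotation-matrix input produce a (32, 207) pose feature that is compatible with the 207-row posedirs in the traceback.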

biansy000 commented 1 year ago

Yes, that is the correct solution.

avegetablechicken commented 1 year ago

BTW, do you plan to share the pre-processed datasets or the pre-processing code for the D&D project?

avegetablechicken commented 1 year ago

Or could you please tell me the meaning of the dict keys in the pre-processed data, such as the difference between pose and thetas, or between shape and betas? I am very interested in "D&D".

biansy000 commented 10 months ago

@avegetablechicken Hi, sorry for the late reply. pose and thetas have exactly the same meaning, as do shape and betas.

OliverSTH commented 3 months ago

@avegetablechicken Hello. Are you using a single GPU for evaluation? If so, may I ask whether anything in the script needs to be modified? Thanks for your reply.