facebookresearch / VideoPose3D

Efficient 3D human pose estimation in video using 2D keypoint trajectories

Training pretrained_h36m_detectron_coco.bin from scratch does not work on custom video. #121

Open hundanLi opened 4 years ago

hundanLi commented 4 years ago

When I finish training the inference model pretrained_h36m_detectron_coco.bin from scratch using:

python run.py -e 40 -k detectron_pt_coco -arc 3,3,3

and then test it on a custom .npz file predicted by detectron2, following https://github.com/darkAlert/VideoPose3d_with_Detectron2, using:

python run.py -d custom -k myvideos -arc 3,3,3 -c checkpoint --evaluate epoch_40.bin --render --viz-subject detectron2 --viz-action custom --viz-camera 0 --viz-video ./data/pose2d-detectron2/input_video.mp4 --viz-output myvideos_output.mp4 --viz-size 6 --viz-no-ground-truth

a size mismatch error occurs while loading the model:

Traceback (most recent call last):
  File "run.py", line 209, in <module>
    model_pos_train.load_state_dict(checkpoint['model_pos'])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 830, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TemporalModelOptimized1f:
    size mismatch for expand_conv.weight: copying a param with shape torch.Size([1024, 51, 3]) from checkpoint, the shape in current model is torch.Size([1024, 34, 3]).

How can I address this problem, please?
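
Note for readers hitting the same error: the two shapes in the traceback can be decoded with a little arithmetic. The sketch below assumes the first layer's input channels equal 17 COCO joints multiplied by the number of values stored per joint; `features_per_joint` is a hypothetical helper written for illustration, not part of VideoPose3D.

```python
def features_per_joint(weight_shape, num_joints=17):
    """Infer the number of input values per joint from a first-layer conv weight shape.

    weight_shape is (out_channels, in_channels, kernel_width). We assume
    in_channels == num_joints * values_per_joint (an assumption, not a repo API).
    """
    _, in_channels, _ = weight_shape
    if in_channels % num_joints != 0:
        raise ValueError(
            f"{in_channels} input channels is not a multiple of {num_joints} joints"
        )
    return in_channels // num_joints


# Shapes copied from the traceback above.
checkpoint_features = features_per_joint((1024, 51, 3))  # stored in the checkpoint
model_features = features_per_joint((1024, 34, 3))       # built at evaluation time
print(checkpoint_features, model_features)  # 3 2
```

Under that assumption, the checkpoint was trained on three values per joint while the evaluation run built a model for two, which would explain why `load_state_dict` rejects the weights.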

hundanLi commented 4 years ago

I retrained it after replacing the keypoint normalization line with

kps = normalize_screen_coordinates(kps[..., :2], w=cam['res_w'], h=cam['res_h'])

and it works well. Thank you for your work!
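
For illustration, here is a minimal, self-contained sketch of what that replacement does to the array shape. The body of `normalize_screen_coordinates` is written from my understanding of the repository's function of the same name and should be treated as an assumption; the keypoint array is synthetic, and the third column is assumed to be a per-keypoint score.

```python
import numpy as np


def normalize_screen_coordinates(X, w, h):
    """Map x from [0, w] to [-1, 1] and scale y by the same factor (keeps aspect ratio)."""
    assert X.shape[-1] == 2, "expects (x, y) only"
    return X / w * 2 - np.array([1, h / w])


# Synthetic stand-in for detector output: (frames, joints, [x, y, score]).
kps = np.zeros((2, 17, 3), dtype=np.float32)
kps[..., 0] = 960.0  # x at the horizontal centre of a 1920x1080 frame
kps[..., 1] = 540.0  # y at the vertical centre
kps[..., 2] = 0.9    # assumed per-keypoint score

# Slicing with [..., :2] drops the third column before normalization,
# so the result has two values per joint.
kps_2d = normalize_screen_coordinates(kps[..., :2], w=1920, h=1080)
print(kps_2d.shape)  # (2, 17, 2)
```

With two values per joint at training time, the first layer would have 17 * 2 = 34 input channels, matching the shape the evaluation run expects in the traceback.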

Alex-JYJ commented 4 years ago

In the line https://github.com/facebookresearch/VideoPose3D/blob/bc7db2c378cac881d324bc2b3d4b2b404b0dc619/run.py#L178 , poses_valid_2d[0].shape[-1] = 2; however, the shape of the keypoints I get from Detectron is [frames, 17, 3]. I followed the author's suggestion to uncomment https://github.com/facebookresearch/VideoPose3D/blob/cf94b42e1edf144f8eff2da0f43be737225e43bd/data/data_utils.py#L81 , which is why my keypoints have three values per joint. Can I just use kps[..., :2] to get it to work?