Shimingyi / MotioNet

A deep neural network that directly reconstructs the motion of a 3D human skeleton from monocular video [ToG 2020]
https://rubbly.cn/publications/motioNet/
BSD 2-Clause "Simplified" License

kernel size #36

Closed yul85 closed 3 years ago

yul85 commented 3 years ago

Dear authors,

Hello! Thank you for providing this awesome code.

I tested your demo code on the videos provided by another work (SFV: Reinforcement Learning of Physical Skills from Videos [Peng et al. 2018]). Each video contains a human motion such as a jump, a cartwheel, or a dance.

I ran your code on them, but it failed to reconstruct the motion and instead raised the RuntimeError below:

```
Building the network
x: tensor([[[-5.2118e-03, -1.9778e-03,  3.7148e-02,  ...,  4.0354e-01,  7.9354e-01,  1.3558e+00],
         [ 5.9371e-01,  2.1293e+00,  3.0198e+00,  ...,  1.3015e+00, -3.0760e-03,  5.9612e-02],
         [ 2.6908e+00,  2.8417e+00,  3.1971e+00,  ...,  2.9587e+00, -8.6406e-03, -2.6097e-02],
         ...,
         [ 2.9316e+00,  4.8310e+00,  5.9596e+00,  ...,  3.5114e+00,  6.2439e-01,  2.0394e+00],
         [-1.5403e-02, -6.0191e-02, -8.2277e-02,  ..., -4.6780e-02, -2.5275e-02, -4.6015e-02],
         [-8.5922e-03, -2.5728e-03,  5.3182e-02,  ..., -2.7516e-03, -9.3090e-03, -4.1515e-03]]],
       device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 10])
x: tensor([[[-0.0239, -0.0236, -0.0197,  ..., -0.0534, -0.0357, -0.0212],
         [-0.0038, -0.0016, -0.0057,  ..., -0.0095, -0.0091, -0.0155],
         [-0.0023, -0.0016, -0.0032,  ...,  0.6250,  0.5801, -0.0042],
         ...,
         [-0.0215, -0.0185, -0.0183,  ..., -0.0356, -0.0176, -0.0103],
         [-0.0991, -0.1097, -0.0967,  ..., -0.1064, -0.0717, -0.0246],
         [-0.0532, -0.0627, -0.0558,  ..., -0.0655, -0.0536, -0.0239]]],
       device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 8])
x: tensor([[[ 0.7402,  1.0614,  0.9979,  ..., -0.0066,  0.5572,  1.4654],
         [-0.0823, -0.1001, -0.0753,  ..., -0.0456, -0.0340, -0.0150],
         [-0.0556, -0.0696, -0.0560,  ..., -0.0411, -0.0323, -0.0139],
         ...,
         [ 1.7153,  2.3598,  1.4705,  ..., -0.0079,  0.0161,  0.3899],
         [-0.0161, -0.0133, -0.0115,  ..., -0.0120, -0.0171, -0.0142],
         [-0.0651, -0.0784, -0.0599,  ..., -0.0418, -0.0331, -0.0132]]],
       device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 8])
x: tensor([[[-0.0262, -0.0155],
         [ 1.2542,  0.8397],
         [ 0.4866,  0.3434],
         ...,
         [ 0.3525, -0.0054],
         [-0.0251, -0.0131],
         [-0.0137, -0.0037]]],
       device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 2])
Traceback (most recent call last):
  File "evaluate.py", line 144, in <module>
    main(config, args, output_folder)
  File "evaluate.py", line 121, in main
    export(args.input)
  File "evaluate.py", line 101, in export
    pre_bones, pre_rotations, pre_rotations_full, pre_pose_3d, pre_c, pre_proj = model.forward_fk(poses_2d_root, parameters)
  File "/ssd_data/MotioNet/model/model.py", line 57, in forward_fk
    fake_bones = self.forward_S(_input)
  File "/ssd_data/MotioNet/model/model.py", line 40, in forward_S
    return self.branch_S(_input)
  File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/ssd_data/MotioNet/model/model_zoo.py", line 302, in forward
    x = self.drop(self.relu(layer(x)))
  File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 263, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 260, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (2). Kernel size: (3). Kernel size can't be greater than actual input size
```
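
In case it helps, the same RuntimeError message can be reproduced with a plain stack of strided temporal convolutions on a short input. This is only a minimal sketch with made-up channel sizes and strides, not the actual MotioNet layers, so I am not sure it is exactly what happens inside `branch_S`:

```python
# Minimal sketch (hypothetical layer shapes, NOT the MotioNet architecture):
# strided Conv1d layers without padding shrink the temporal dimension, and a
# very short clip ends up smaller than the kernel size.
import torch
import torch.nn as nn

conv_stack = nn.Sequential(
    nn.Conv1d(34, 64, kernel_size=3, stride=2),  # L -> floor((L - 3) / 2) + 1
    nn.Conv1d(64, 64, kernel_size=3, stride=2),
    nn.Conv1d(64, 64, kernel_size=3, stride=2),
)

long_clip = torch.randn(1, 34, 120)   # 120 frames -> 59 -> 29 -> 14: fine
short_clip = torch.randn(1, 34, 13)   # 13 frames  -> 6 -> 2: smaller than the kernel

print(conv_stack(long_clip).shape)    # torch.Size([1, 64, 14])
print(conv_stack(short_clip).shape)   # RuntimeError: Calculated padded input size per channel: (2). Kernel size: (3). ...
```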

Do you know what is causing this error?

Thanks!

Shimingyi commented 3 years ago

Hi @yul85 ,

What is the frame count of these videos? I suspect they may be too short; in that case you need to use the setting with the flag stage=1.
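
For reference, a quick way to check the frame count before running evaluate.py (a minimal sketch using OpenCV; the file name is just a placeholder, and the exact minimum length depends on the network's receptive field, which I am not quoting here):

```python
# Count the frames of an input video with OpenCV before running the demo.
import cv2

def frame_count(video_path):
    cap = cv2.VideoCapture(video_path)
    n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()
    return n

# 'jump.mp4' is a hypothetical file name for one of the SFV clips.
print(frame_count('jump.mp4'))
# If the clip is very short, the strided temporal convolutions can shrink the
# sequence below the kernel size; in that case try the stage=1 setting.
```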

Best, Mingyi

yul85 commented 3 years ago

Hi Mingyi,

The "short" video was the problem! Thank you for your help.