bmartacho / UniPose

We propose UniPose, a unified framework for human pose estimation, based on our "Waterfall" Atrous Spatial Pooling architecture, that achieves state-of-the-art results on several pose estimation metrics. Current pose estimation methods utilizing standard CNN architectures rely heavily on statistical postprocessing or predefined anchor poses for joint localization. UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage, with high accuracy, without relying on statistical postprocessing methods. The Waterfall module in UniPose leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method is extended to UniPose-LSTM for multi-frame processing and achieves state-of-the-art results for temporal pose estimation in video. Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation, obtaining state-of-the-art results in single-person pose detection for both single images and videos.
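For intuition, here is a minimal sketch of a waterfall-style atrous module in PyTorch. The branch count, channel widths, and dilation rates below are hypothetical placeholders, not the repository's exact WASP configuration; the point is the structure: each branch filters the previous branch's output (the cascade), and all branch outputs are concatenated to retain multi-scale fields-of-view.

```python
import torch
import torch.nn as nn

class WaterfallModule(nn.Module):
    """Illustrative waterfall-style atrous module (not the repo's exact WASP).
    Branches are chained like a cascade (progressive filtering), while the
    concatenation of all branch outputs keeps multi-scale fields-of-view,
    similar in spirit to a spatial pyramid."""

    def __init__(self, in_channels, branch_channels=256, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList()
        channels = in_channels
        for r in rates:
            # 3x3 atrous conv; padding == dilation keeps the spatial size
            self.branches.append(nn.Sequential(
                nn.Conv2d(channels, branch_channels, kernel_size=3,
                          padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(branch_channels),
                nn.ReLU(inplace=True),
            ))
            channels = branch_channels  # cascade: next branch reads this output

        # fuse the concatenated multi-scale features back down
        self.project = nn.Conv2d(branch_channels * len(rates),
                                 branch_channels, kernel_size=1)

    def forward(self, x):
        outputs = []
        for branch in self.branches:
            x = branch(x)       # waterfall: each branch refines the last
            outputs.append(x)   # keep every scale for the final fusion
        return self.project(torch.cat(outputs, dim=1))

# usage sketch: fuse multi-scale context on top of backbone features
features = torch.randn(1, 2048, 23, 23)
out = WaterfallModule(2048)(features)  # -> [1, 256, 23, 23]
```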

Running Error: uniposeLSTM on Penn_Action #9

Closed. smilejames closed this issue 3 years ago.

smilejames commented 3 years ago

The error occurs in the code below (model/uniposeLSTM.py):

```python
if iter == 0:
    x = torch.cat((input[:,iter,:,:,:], centermap[:,iter,:,:,:]), dim=1)
    x, low_level_feat = self.backbone(x)
```

Printed shapes:

```
input:     torch.Size([5, 5, 3, 368, 368])
centermap: torch.Size([5, 5, 1, 368, 368])
x:         torch.Size([5, 4, 368, 368])
```

```
Traceback (most recent call last):
  File "uniposeLSTM.py", line 301, in <module>
    trainer.training(epoch)
  File "uniposeLSTM.py", line 126, in training
    heat, cell, hide = self.model(input_var, centermap_var, j, heat, hide, cell)
  File "/Users/james/.pyenv/versions/3.8.3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/james/work/PySpace/UniPose/model/uniposeLSTM.py", line 109, in forward
    x, low_level_feat = self.backbone(x)
  File "/Users/james/.pyenv/versions/3.8.3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/james/work/PySpace/UniPose/model/modules/backbone/resnet.py", line 114, in forward
    x = self.conv1(input)
  File "/Users/james/.pyenv/versions/3.8.3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/james/.pyenv/versions/3.8.3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/Users/james/.pyenv/versions/3.8.3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[5, 4, 368, 368] to have 3 channels, but got 4 channels instead
```
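For context: the mismatch comes from concatenating the 3-channel RGB frame with the 1-channel centermap (4 channels total) and feeding the result to a ResNet stem built for 3-channel input (weight [64, 3, 7, 7]). Below is a minimal sketch reproducing the mismatch and one possible workaround, using torchvision's resnet50 as a stand-in for the repo's backbone. This is a hypothetical illustration, not the maintainer's actual fix, which lives in the updated repository code.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50()  # stand-in for the repo's ResNet backbone

rgb = torch.randn(5, 3, 368, 368)        # one frame of the 5-frame clip
centermap = torch.randn(5, 1, 368, 368)  # 1-channel Gaussian center map
x = torch.cat((rgb, centermap), dim=1)   # -> [5, 4, 368, 368]: 4 channels

# backbone.conv1 expects 3 input channels, so backbone(x) would raise the
# RuntimeError shown above. Widening the stem to 4 input channels is one
# way to make the shapes agree (pretrained weights for the extra channel
# would need to be initialized separately):
backbone.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2,
                           padding=3, bias=False)
features = backbone(x)  # now runs without the channel mismatch
```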

bmartacho commented 3 years ago

Thank you for pointing this out. The code has been updated to reflect the final version after development testing.