bmartacho / UniPose

We propose UniPose, a unified framework for human pose estimation, based on our “Waterfall” Atrous Spatial Pooling architecture, that achieves state-of-the-art results on several pose estimation metrics. Current pose estimation methods utilizing standard CNN architectures heavily rely on statistical postprocessing or predefined anchor poses for joint localization. UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage, with high accuracy, without relying on statistical postprocessing methods. The Waterfall module in UniPose leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method is extended to UniPose-LSTM for multi-frame processing and achieves state-of-the-art results for temporal pose estimation in video. Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation, obtaining state-of-the-art results in single-person pose detection for both single images and videos.

The results of validation are 0. #1

Closed YHDang closed 3 years ago

YHDang commented 3 years ago

Dear bmartacho: Hello, and thanks for your excellent research. I have some questions about the source code. I ran it successfully on the Penn_Action dataset, but the validation results are all 0, and I don't know why. Could you give me some suggestions, please? I changed the code as follows:

  1. When I ran on the Penn_Action dataset, I found that the heatmap size is 368×368, but the output of the model is 46×46, so I changed the heatmap size to 46. Is that correct?
  2. In penn_action_data.py, I deleted lines 83 to 93.
  3. When I loaded the pre-trained model (UniPose_LSTM_PennAction.tar), conv1.weights in resnet.py (in model/module/backbone) had the shape [3, 64, 7, 7], but the pre-trained model has the shape [4, 64, 7, 7], so I changed the shape of conv1.weights.

I look forward to your reply. Thanks a lot. The results are as follows:

[Image: WeChat Screenshot_20201016173457]

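
Regarding point 1, one thing that might be worth checking is whether predictions and ground-truth joints end up in the same coordinate space after the heatmap size is changed. The sketch below is not taken from the UniPose code; it only assumes a 368×368 input and a 46×46 output heatmap (a stride of 368 / 46 = 8), and the helper names `heatmap_peak` and `to_input_coords` are hypothetical.

```python
def heatmap_peak(heatmap):
    """Return (x, y) of the largest value in a 2D heatmap given as a list of rows."""
    best_x, best_y, best_val = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, val in enumerate(row):
            if val > best_val:
                best_x, best_y, best_val = x, y, val
    return best_x, best_y


def to_input_coords(x, y, heatmap_size=46, input_size=368):
    """Map a heatmap cell to the centre of the matching patch in the input image.

    Assumes a square input and a square heatmap (an assumption of this sketch).
    """
    stride = input_size / heatmap_size  # 368 / 46 = 8.0
    return (x + 0.5) * stride, (y + 0.5) * stride


# Hypothetical example: a 46x46 heatmap with a single peak at column 10, row 20.
heatmap = [[0.0] * 46 for _ in range(46)]
heatmap[20][10] = 1.0

px, py = heatmap_peak(heatmap)    # peak in heatmap coordinates
ix, iy = to_input_coords(px, py)  # same point in 368x368 input coordinates
```

If the ground-truth joints are kept at the 368 scale while the predicted peaks are read at the 46 scale (or the other way round), the two would be off by a factor of 8, and a distance-thresholded accuracy could plausibly come out as 0. This is only one possible cause and has not been confirmed against the repository.
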
minhhoangbui commented 3 years ago

@YHDang Have you managed to get this source code working? I am having trouble understanding the data loader and the training process. They seem inconsistent with the paper.