bmartacho / UniPose

We propose UniPose, a unified framework for human pose estimation, based on our “Waterfall” Atrous Spatial Pooling architecture, that achieves state-of-the-art results on several pose estimation metrics. Current pose estimation methods utilizing standard CNN architectures heavily rely on statistical postprocessing or predefined anchor poses for joint localization. UniPose incorporates contextual segmentation and joint localization to estimate the human pose in a single stage, with high accuracy, without relying on statistical postprocessing methods. The Waterfall module in UniPose leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method is extended to UniPose-LSTM for multi-frame processing and achieves state-of-the-art results for temporal pose estimation in video. Our results on multiple datasets demonstrate that UniPose, with a ResNet backbone and Waterfall module, is a robust and efficient architecture for pose estimation, obtaining state-of-the-art results in single person pose detection for both single images and videos.

The structure of wasp. #20

YuQi9797 closed this issue 3 years ago

YuQi9797 commented 3 years ago

image

Sir, in the picture the input is first fed to the AtrousModule with dilation_rate = 6, but in your code the input is first fed to the AtrousModule with dilation_rate = 24.

image

YuQi9797 commented 3 years ago

What difference does this ordering make in the module?

bmartacho commented 3 years ago

Please refer to the following regarding testing and experimentation of the dilation rates:

https://github.com/bmartacho/WASP/issues/1#issuecomment-563269567
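
For readers comparing the two orderings raised in this thread, the following is a minimal sketch, not the author's answer and not the repository's code. It computes the cumulative receptive field after each stage of a stride-1 cascade of dilated convolutions. Several things here are assumptions: the function name `cascade_receptive_fields` is hypothetical, every stage is assumed to use a 3x3 kernel (which may not match the actual module), and only the rates 6 and 24 appear in the thread, so 12 and 18 are placeholder intermediate rates.

```python
# Illustrative only: cumulative receptive field of a stride-1 cascade of
# dilated convolutions, used to compare the two orderings in this thread.


def cascade_receptive_fields(dilation_rates, kernel_size=3):
    """Return the per-axis receptive field (in input pixels) after each stage.

    Assumes each stage is a stride-1 convolution applied to the output of
    the previous stage (a cascade), with the same kernel size throughout.
    """
    receptive_field = 1
    fields = []
    for rate in dilation_rates:
        # A k x k kernel with dilation d spans d * (k - 1) + 1 pixels,
        # so it widens the receptive field by d * (k - 1).
        receptive_field += rate * (kernel_size - 1)
        fields.append(receptive_field)
    return fields


# Assumed rates: only 6 and 24 are named in the thread; 12 and 18 are placeholders.
figure_order = [6, 12, 18, 24]  # ordering described for the picture
code_order = [24, 18, 12, 6]    # ordering described for the code

print(cascade_receptive_fields(figure_order))  # [13, 37, 73, 121]
print(cascade_receptive_fields(code_order))    # [49, 85, 109, 121]
```

Under these assumptions the last stage sees the same 121-pixel extent in both orderings, while the intermediate stages differ. Whether that changes accuracy in practice is an empirical question, which is what the linked comment on dilation-rate experiments addresses.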