The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Apache License 2.0
Different Attention types and bottom-up configuration #105
Hello!
I'm working on a master's thesis about bottom-up pose estimation on high-resolution images. Your paper seems to address both topics successfully, but I can't find either a configuration for the bottom-up approach presented in the paper or the two modifications to standard attention (Shift Window and Pooling Window) for handling higher-resolution feature maps. Am I overlooking something in the repository? If these parts of the paper are not included in this implementation, are there any plans to release them later, or is there somewhere else I can find them?