ViTAE-Transformer / ViTPose

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Apache License 2.0

Different Attention types and bottom-up configuration #105

Open Legilas opened 1 year ago

Legilas commented 1 year ago

Hello! I'm working on a master's thesis about bottom-up pose estimation on high-resolution images. Your paper seems to address both of these topics successfully, yet I cannot find a configuration for the bottom-up approach presented in the paper, or the two modifications to standard attention (Shift Window and Pooling Window) that it uses to handle higher-resolution feature maps. Am I overlooking something in the repository? Or are these parts of the paper not included in this implementation? If so, are there any plans to release them later, or is there some other place to find them?

Ideal-H commented 10 months ago

Has there been any reply to this issue?