Closed wangdong0556 closed 3 years ago
Thanks for your interest!
This repo is based on DarkPose. For MPII, you only need to change some settings in the experiment YAML files; see DarkPose for reference.
I set the number of heads according to the dimension of the query and key vectors, so that the dimension of each head does not become too large.
For ResNet-S based, d=256, then n_heads = 8 = 256 // 32. For HRNet-S based, d=96, then n_heads = 1 = 96 // 96.
Also, TransPose-H uses fewer heads in order to consume less GPU memory, because we conduct self-attention at 1/4 of the input resolution.
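The head-count rule and the memory argument above can be sketched as follows. This is only an illustration, not code from the repository: `num_heads` and `attention_map_mib` are hypothetical helpers, the per-head dimensions 32 and 96 come from the examples above, and the 256x192 input size and float32 storage are assumptions made purely for the estimate.

```python
def num_heads(d_model: int, head_dim: int) -> int:
    """Number of heads such that each head gets `head_dim` channels."""
    if d_model % head_dim != 0:
        raise ValueError("d_model must be divisible by head_dim")
    return d_model // head_dim


def attention_map_mib(height: int, width: int, stride: int, heads: int,
                      bytes_per_value: int = 4) -> float:
    """Size in MiB of the attention weights for one layer and one sample.

    Self-attention over a feature map at 1/stride resolution produces one
    (tokens x tokens) matrix per head. Assumes float32 (4 bytes) by default.
    """
    tokens = (height // stride) * (width // stride)
    return heads * tokens * tokens * bytes_per_value / 2**20


print(num_heads(256, 32))                  # ResNet-S based: 8
print(num_heads(96, 96))                   # HRNet-S based: 1
print(attention_map_mib(256, 192, 4, 1))   # 36.0 (assumed 256x192 input)
print(attention_map_mib(256, 192, 4, 8))   # 288.0
```

Under these assumptions the attention weights grow linearly with the number of heads and quadratically with the number of tokens, which is why fewer heads help at 1/4 resolution.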
“For ResNet-S based, d=256, then n_heads = 8 = 256 // 32. For HRNet-S based, d=96, then n_heads = 1 = 96 // 96.” The divisors for ResNet and HRNet are 32 and 96, respectively. What do these values (32 and 96) mean? Is it 96 for both HRNet-S-W32 and W48?
Actually, they have no special meaning. 64 for HRNet-W32, 96 for HRNet-W48.
The output feature map of ResNet has 512 channels, so we set d_model to 256; the output feature map of HRNet has 32 or 48 channels, so we set d_model to 64 or 96.
Thanks! Why is d_model set to one-half of the number of output feature map channels? Is it a fixed setting, or is there some other reason?
There is no special reason for halving or doubling the channels. We just want the channel transformation to stay within the same order of magnitude.
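As a quick check of the "same order of magnitude" remark, the ratio between d_model and the backbone output channels can be computed from the numbers quoted in this thread. The `CONFIGS` table and `channel_ratio` helper below are hypothetical names used only for this sketch.

```python
# (backbone, output feature map channels, d_model) as quoted in the replies above
CONFIGS = [
    ("ResNet", 512, 256),
    ("HRNet-W32", 32, 64),
    ("HRNet-W48", 48, 96),
]


def channel_ratio(backbone_channels: int, d_model: int) -> float:
    """Factor by which the channel count changes going into the Transformer."""
    return d_model / backbone_channels


ratios = {name: channel_ratio(c, d) for name, c, d in CONFIGS}
for name, ratio in ratios.items():
    print(f"{name}: d_model / channels = {ratio}")
```

The ratio is 0.5 for ResNet and 2.0 for both HRNet variants, so the projection never changes the channel count by more than a factor of two.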
Hello, I am very happy to see such excellent work. How can I use this project to train and test on MPII? I adjusted some parameters, but the results are not very good. Have you done any work on this dataset?
In addition, what is the basis for determining the number of heads for different models?
Thank you!