Closed Berry-Wu closed 1 year ago
Hi, thanks for your interest in our work. Actually, we still keep the residual layer, as you can see here, so we did not need to discuss it separately in the paper.
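For context, here is a minimal sketch of a standard pre-norm Transformer encoder block in which the residual (skip) connection around attention is kept, which is the behavior being discussed. This is not the repository's actual tokenpose_base.py code; the class names `SelfAttention` and `TransformerBlock` and the token counts are placeholders for illustration only.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Thin wrapper so multi-head self-attention returns a single tensor."""
    def __init__(self, dim, heads, dropout=0.0):
        super().__init__()
        self.mha = nn.MultiheadAttention(dim, heads, dropout=dropout, batch_first=True)

    def forward(self, x):
        out, _ = self.mha(x, x, x, need_weights=False)
        return out

class TransformerBlock(nn.Module):
    """One encoder block: each sublayer keeps its own residual connection."""
    def __init__(self, dim, heads, mlp_dim, dropout=0.0):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = SelfAttention(dim, heads, dropout)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, mlp_dim), nn.GELU(), nn.Dropout(dropout),
            nn.Linear(mlp_dim, dim), nn.Dropout(dropout),
        )

    def forward(self, x):
        # First residual: the attention output is added back to its input.
        x = x + self.attn(self.norm1(x))
        # Second residual: the MLP output is added back as well.
        x = x + self.mlp(self.norm2(x))
        return x

# Quick shape check with illustrative sizes (e.g. keypoint tokens + visual tokens).
block = TransformerBlock(dim=192, heads=8, mlp_dim=768)
tokens = torch.randn(2, 17 + 64, 192)
print(block(tokens).shape)  # torch.Size([2, 81, 192])
```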
Sorry, I missed your reply earlier; that was my fault. Thank you for your reply! :)
Hi, I have a question about the Transformer code in tokenpose_base.py: why is the first residual connection of the Attention block removed? I see that the code in TokenPose doesn't remove it, and I can't find the reason in your paper.