yangsenius / TransPose

PyTorch Implementation for "TransPose: Keypoint localization via Transformer", ICCV 2021.
https://github.com/yangsenius/TransPose/releases/download/paper/transpose.pdf
MIT License

How to prepare target heatmaps? #29

Closed mukeshnarendran7 closed 2 years ago

mukeshnarendran7 commented 2 years ago

I want to use the pre-trained model and fine-tune it for another application, but I cannot find a reference for the heatmap preparation code. Is it similar to taking an image and converting the (x, y) coordinates to heatmaps, as in CNN-based pose estimation? The model output is (48, 64), but my input images are (256, 192). A reference would be helpful. Thanks.

yangsenius commented 2 years ago

Please refer to https://github.com/yangsenius/TransPose/blob/dab9007b6f61c9c8dce04d61669a04922bbcd148/lib/dataset/JointsDataset.py#L239
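
For readers who want a rough picture before opening the linked file, here is a minimal NumPy sketch of Gaussian target heatmaps for a (256, 192) input and a (64, 48) heatmap. It is not the repository's code: the function name `make_target_heatmaps`, the default `sigma=2.0`, and the choice to draw the Gaussian over the whole map at a sub-pixel center are assumptions made for illustration. Sizes are given as (width, height).

```python
import numpy as np

def make_target_heatmaps(joints, visible, image_size=(192, 256),
                         heatmap_size=(48, 64), sigma=2.0):
    """Hypothetical helper, not the repo's function.

    joints:  (K, 2) array of (x, y) keypoints in input-image pixels.
    visible: (K,) array of 0/1 visibility flags.
    Returns heatmaps of shape (K, H, W) and per-joint weights of shape (K,).
    """
    joints = np.asarray(joints, dtype=np.float64)
    w, h = heatmap_size
    # Downsampling factor from image to heatmap (4 for 192x256 -> 48x64).
    stride = np.array(image_size, dtype=np.float64) / np.array(heatmap_size)
    heatmaps = np.zeros((joints.shape[0], h, w), dtype=np.float32)
    weights = np.asarray(visible, dtype=np.float32).copy()

    xs = np.arange(w, dtype=np.float64)           # shape (W,)
    ys = np.arange(h, dtype=np.float64)[:, None]  # shape (H, 1)
    for k in range(joints.shape[0]):
        mu_x, mu_y = joints[k] / stride
        # Joints that fall outside the heatmap get zero weight (assumed policy).
        if not (0 <= mu_x < w and 0 <= mu_y < h):
            weights[k] = 0.0
        if weights[k] == 0:
            continue
        # Unnormalized 2D Gaussian with peak value 1 at the joint location.
        heatmaps[k] = np.exp(-((xs - mu_x) ** 2 + (ys - mu_y) ** 2)
                             / (2 * sigma ** 2))
    return heatmaps, weights
```

The returned weights can be passed to a loss that ignores invisible or out-of-frame joints.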

mukeshnarendran7 commented 2 years ago

Thanks for getting back

yangsenius commented 2 years ago

Hi,

  1. We use this code to transform coordinates from the (48, 64) heatmap into the original (256, 192) coordinate frame. This mapping unavoidably introduces quantization error, so we use DARK-based post-processing to reduce it.
  2. The final layer of the model is a 1x1 conv that converts the channel number from d to keypoint_number. This 1x1 conv is equivalent to a linear FC layer, because it is a position-wise linear transformation. So you can also use a 1x1 conv with (d, 16) channels; both output the heatmaps in the same way.
  3. We just use an MSE loss to compute the error against the GT heatmaps.
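
Points 1 and 2 can be illustrated with a small NumPy sketch. All function names here are hypothetical, and the decoding step is a plain argmax scaled back to the network input frame: it does not include DARK post-processing or any mapping back to an uncropped source image. The second half checks numerically that a 1x1 convolution and a linear layer applied at every position give the same output.

```python
import numpy as np

def heatmaps_to_image_coords(heatmaps, image_size=(192, 256)):
    """Argmax decoding: heatmaps (K, H, W) -> (K, 2) (x, y) in image pixels.

    Because the argmax is an integer cell index, the result is always a
    multiple of the stride; this is the quantization error mentioned above.
    """
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    scale = np.array([image_size[0] / w, image_size[1] / h])
    return np.stack([xs, ys], axis=1) * scale

def conv1x1(features, weight, bias):
    """1x1 convolution: features (d, H, W), weight (K, d), bias (K,)."""
    return np.einsum("kd,dhw->khw", weight, features) + bias[:, None, None]

def linear_per_position(features, weight, bias):
    """The same weights used as a linear layer on each position's d-vector."""
    d, h, w = features.shape
    tokens = features.reshape(d, h * w).T   # (H*W, d), one row per position
    out = tokens @ weight.T + bias          # (H*W, K)
    return out.T.reshape(-1, h, w)          # back to (K, H, W)

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 64, 48))     # d = 8 is an arbitrary choice
weight = rng.normal(size=(16, 8))           # 16 keypoints
bias = rng.normal(size=16)
same = np.allclose(conv1x1(features, weight, bias),
                   linear_per_position(features, weight, bias))
```

Here `same` evaluates to `True`, since both functions compute the same matrix product at each spatial location.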

mukeshnarendran7 commented 2 years ago

Hi, thanks once again for clarifying the issues. I have some more questions about processing.

yangsenius commented 2 years ago

  1. Yes, you're right.
  2. JointsMSELoss has a more detailed implementation that accounts for things such as keypoint visibility. Essentially, they are the same loss function, but I suggest using JointsMSELoss.
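
To make the difference concrete, here is a minimal NumPy sketch comparing a plain MSE with a visibility-weighted variant. It is an illustration under assumptions, not the repository's loss: the function names are hypothetical, the per-joint weight of shape (B, K) is assumed to be 0 for invisible joints and 1 otherwise, and any constant scaling factor used in the actual implementation is omitted.

```python
import numpy as np

def plain_mse(pred, target):
    """Mean squared error over all joints and pixels."""
    return float(np.mean((pred - target) ** 2))

def weighted_joints_mse(pred, target, target_weight):
    """MSE where each joint's error is multiplied by its weight first.

    pred, target:  (B, K, H, W) heatmaps.
    target_weight: (B, K), e.g. 0 for invisible joints and 1 for visible ones.
    """
    b, k = pred.shape[:2]
    w = np.asarray(target_weight, dtype=np.float64).reshape(b, k, 1)
    diff = (pred.reshape(b, k, -1) - target.reshape(b, k, -1)) * w
    return float(np.mean(diff ** 2))
```

With all weights equal to 1 the two functions agree; with a weight of 0, whatever the model predicts for that joint no longer contributes to the loss.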