JanMarcelKezmann / TensorFlow-Advanced-Segmentation-Models

A Python Library for High-Level Semantic Segmentation Models based on TensorFlow and Keras with pretrained backbones.

keypoint detection #18

Closed soans1994 closed 2 years ago

soans1994 commented 2 years ago

hello author,

Can the HRNet model be used to detect keypoints by heatmap regression?

JanMarcelKezmann commented 2 years ago

Hi @soans1994 ,

that is a very good question.

I have never actually worked on a keypoint detection model, so I can only guess at the answer. From my understanding of using HRNet for semantic segmentation and my limited understanding of creating heatmaps, I would say yes, you can create those with the HRNet model, but HRNet itself should only be used as a backbone for the heatmap generation.

I would recommend that you look at this repo and its associated paper. They explain how you can leverage the HRNet model structure as a backbone for creating high-resolution heatmaps.

I hope this answer helps.
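
For orientation, heatmap regression for keypoints commonly means training the network to predict one 2D map per keypoint, with a Gaussian peak centered on the keypoint location. The following is only a minimal sketch of such a target and of decoding it again. The function names, the map size and the `sigma` value are hypothetical and are not part of this library.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=2.0):
    """Build a height x width map with a 2D Gaussian peak at pixel (cx, cy)."""
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def decode_keypoint(heatmap):
    """Return the (x, y) position of the largest value in the map."""
    best_value, best_xy = float("-inf"), (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


# Hypothetical example: one keypoint at x=20, y=33 on a 64 x 48 map.
target = gaussian_heatmap(height=64, width=48, cx=20, cy=33)
print(decode_keypoint(target))
```

A model used this way would output one such map per keypoint, and the loss would typically be a pixel-wise regression loss (for example mean squared error) between predicted and target maps.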

soans1994 commented 2 years ago

@JanMarcelKezmann thank you for sending the repo link and paper. In your implementation, is the output upsampled back to the original input size during prediction?

JanMarcelKezmann commented 2 years ago

Yes, it is. That is because the "true" output (the ground truth that the prediction is compared against for the loss calculation) has the same size as the input.

In theory you could downsample the true output as well as the predicted output and calculate the losses on those. But this would need a manual adaptation of my code here (lines 67-70 in models/HRNetOCR.py):

     # Upsampling and Concatenation of stages
     self.upsample_x2 = tf.keras.layers.UpSampling2D(size=2, interpolation="bilinear")
     self.upsample_x4 = tf.keras.layers.UpSampling2D(size=4, interpolation="bilinear")
     self.upsample_x8 = tf.keras.layers.UpSampling2D(size=8, interpolation="bilinear")

What you can do easily (at least I think so) is to upsample by only half the factors I have used, i.e. (1, 2, 4) instead of (2, 4, 8). In the same way, you can upsample more strongly by using higher factors.
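
To make the effect of the factors concrete, here is a small arithmetic sketch. It assumes that each lower branch has half the spatial resolution of the branch above it, which is what the (2, 4, 8) factors suggest. The helper `upsampled_sizes` and the top-branch size of 64 are hypothetical.

```python
def upsampled_sizes(top_size, factors):
    """Spatial size of each lower branch after upsampling.

    Assumes lower branch i (i = 1, 2, 3) has 1 / 2**i the resolution
    of the highest-resolution branch, whose size is top_size.
    """
    return [(top_size // 2 ** i) * f for i, f in enumerate(factors, start=1)]


top = 64  # hypothetical size of the highest-resolution branch

print(upsampled_sizes(top, (2, 4, 8)))  # [64, 64, 64]
print(upsampled_sizes(top, (1, 2, 4)))  # [32, 32, 32]
```

Under that assumption, the halved factors bring the lower branches to half the size of the highest-resolution branch, so that branch would presumably also need to be reduced by a factor of 2 before the stages can be concatenated.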

Another option would be to simply up- or downsample the final output of the model using some kind of interpolation (e.g. cubic). This could be even easier and is probably more flexible if you want to output a specific size.
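
A minimal sketch of that last option, using `tf.image.resize` with bicubic interpolation. The tensors here are random stand-ins, and the shapes (batch of 2, 17 channels, 64 x 64 and 128 x 128 maps) are assumptions chosen only for illustration, not values taken from this library.

```python
import tensorflow as tf

# Stand-in for the model output: (batch, height, width, channels).
prediction = tf.random.uniform((2, 64, 64, 17))

# Stand-in for ground-truth maps at the size the loss should be computed on.
target = tf.random.uniform((2, 128, 128, 17))

# Resize the prediction to the target's spatial size with bicubic interpolation.
resized = tf.image.resize(prediction, size=(128, 128), method="bicubic")

# Example regression loss (mean squared error) at the resized resolution.
loss = tf.reduce_mean(tf.square(resized - target))

print(resized.shape)
```

The same call works in the other direction: passing a smaller `size` downsamples the map instead. Note that bicubic interpolation can slightly overshoot the original value range, so `method="bilinear"` may be the safer choice if the output must stay within fixed bounds.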

soans1994 commented 2 years ago

Thank you very much

JanMarcelKezmann commented 2 years ago

You're welcome.