HRNet / HRNet-Semantic-Segmentation

The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919

The effect of bilinear upsampling for the final segmentation stage #138

Open daifeng2016 opened 4 years ago

daifeng2016 commented 4 years ago

Hi, in your HRNet, the output segmentation map is predicted at 1/4 of the raw image size, and bilinear upsampling is then used to generate the final segmentation map. I am wondering why you do not generate the output map at the same size as the raw image, since the upsampling operation may introduce spatial errors. Is it a GPU memory issue?

sunke123 commented 4 years ago

You are right. If the convolutions operated on features at the original resolution, both the GPU memory cost and the computational complexity would be very high. We have not tried it.
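
For a rough sense of the scale involved, here is a back-of-the-envelope sketch of the cost of a single convolution at 1/4 resolution versus full resolution. The numbers are assumptions chosen for illustration, not measurements from this repository: a 1024x2048 input, 48 channels in and out, a 3x3 stride-1 kernel, and float32 activations. `conv_cost` is a hypothetical helper, not a function from the HRNet code.

```python
def conv_cost(height, width, in_ch, out_ch, kernel=3, bytes_per_value=4):
    """Estimate the cost of one stride-1, same-padded convolution.

    Returns (multiply-accumulates, bytes needed to store the output activations).
    """
    macs = height * width * in_ch * out_ch * kernel * kernel
    activation_bytes = height * width * out_ch * bytes_per_value
    return macs, activation_bytes


# Assumed values, for illustration only.
H, W = 1024, 2048   # input image size
C = 48              # channel width of the layer

quarter = conv_cost(H // 4, W // 4, C, C)  # features at 1/4 resolution
full = conv_cost(H, W, C, C)               # features at the original resolution

mac_ratio = full[0] / quarter[0]
mem_ratio = full[1] / quarter[1]

print(f"1/4 res:  {quarter[0] / 1e9:.2f} GMACs, {quarter[1] / 2**20:.0f} MiB of activations")
print(f"full res: {full[0] / 1e9:.2f} GMACs, {full[1] / 2**20:.0f} MiB of activations")
print(f"compute x{mac_ratio:.0f}, activation memory x{mem_ratio:.0f}")
```

Both compute and activation memory scale with the number of spatial positions, so moving from 1/4 resolution to full resolution multiplies each by 4 x 4 = 16 per layer. During training the activations are also kept for the backward pass, which is why the memory cost adds up quickly across layers.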