The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919
Why is the output downsampled by 4x compared with the input image? #231
If the original image is (H, W), the output of this segmentation network is (H/4, W/4). To obtain a segmentation result at (H, W), do I need to upsample the output? Is that right? I think upsampling would give a coarse result.
Why not make the network output (H, W) directly? For example, by adding a ConvTranspose layer as the last layer of the network.
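To make the two options concrete, here is a minimal PyTorch sketch of what I mean. It is not taken from this repository: the random `logits` tensor stands in for the network output, and `num_classes`, `H`, `W`, and the `deconv` layer are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes = 19   # assumption: Cityscapes-style number of classes
H, W = 64, 128     # assumption: small input size to keep the demo cheap

# Stand-in for the network output at 1/4 resolution: (N, C, H/4, W/4).
logits = torch.randn(1, num_classes, H // 4, W // 4)

# Option 1: bilinearly upsample the logits to (H, W), then take the
# per-pixel argmax to get the label map.
up_bilinear = F.interpolate(
    logits, size=(H, W), mode="bilinear", align_corners=False
)
pred = up_bilinear.argmax(dim=1)  # shape (N, H, W)

# Option 2: a learned 4x upsampling layer (hypothetical extra head).
# With padding=0: out = (in - 1) * stride + kernel_size = 4 * in.
deconv = nn.ConvTranspose2d(num_classes, num_classes, kernel_size=4, stride=4)
up_learned = deconv(logits)  # shape (N, C, H, W)
```

Both variants return tensors at the input resolution. Option 1 adds no parameters, while Option 2 adds trainable weights and would have to be trained together with the rest of the network.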