HRNet / HRNet-Semantic-Segmentation

The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919

Bug fix. Change softmax dim. #271

Open yannqi opened 1 year ago

yannqi commented 1 year ago

In line 64, change the softmax dim from 2 to 1.
The line in question is: probs = F.softmax(self.scale * probs, dim=2)  # batch x k x hw

In this code, the input has shape [batch_size, num_class, fh*fw]. With dim=2, the softmax is taken over the flattened spatial positions, so for each class the values sum to one across the fh*fw positions of the feature map.

However, I think the softmax dimension should be 1, so that for each spatial position the values sum to one across the num_class classes.

The corrected code is as follows: probs = F.softmax(self.scale * probs, dim=1)  # batch x num_class x hw
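
To make the difference between the two dims concrete, here is a minimal library-free sketch. It is not the repository's code: the helper names (softmax, softmax_dim2, softmax_dim1) and the toy values are made up for illustration, and nested lists stand in for a tensor of shape [batch_size, num_class, fh*fw]. It only shows which axis ends up summing to one in each case.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_dim2(probs):
    """Normalize over the last axis (pixels), like dim=2 on [batch, k, hw]."""
    return [[softmax(row) for row in sample] for sample in probs]

def softmax_dim1(probs):
    """Normalize over the middle axis (classes), like dim=1 on [batch, k, hw]."""
    out = []
    for sample in probs:
        num_classes = len(sample)
        num_pixels = len(sample[0])
        # columns[p][k]: softmax over the class scores at pixel p
        columns = [softmax([row[p] for row in sample]) for p in range(num_pixels)]
        # transpose back to the original [k][p] layout
        out.append([[columns[p][k] for p in range(num_pixels)]
                    for k in range(num_classes)])
    return out

# Hypothetical toy input: batch of 1, 2 classes, 3 flattened pixels.
probs = [[[1.0, 2.0, 3.0],
          [0.5, 0.5, 0.5]]]

over_pixels = softmax_dim2(probs)    # each class row sums to one
over_classes = softmax_dim1(probs)   # each pixel column sums to one

print("dim=2, sum over pixels per class:", [sum(row) for row in over_pixels[0]])
print("dim=1, sum over classes per pixel:",
      [sum(over_classes[0][k][p] for k in range(2)) for p in range(3)])
```

With dim=2 each class gets a weight distribution over pixels, while with dim=1 each pixel gets a distribution over classes; which of the two is intended depends on how probs is used in the following matmul.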

By the way, I reported this in an issue but got no answer. I also ran a simple comparative experiment: the results show that dim=1 converges faster and reaches a better mIoU.