XiaLiPKU / EMANet

The code for Expectation-Maximization Attention Networks for Semantic Segmentation (ICCV'2019 Oral)
https://xialipku.github.io/publication/expectation-maximization-attention-networks-for-semantic-segmentation/
GNU General Public License v3.0

The difference between the code and the original paper. #31

Closed zhouyuan888888 closed 4 years ago

zhouyuan888888 commented 4 years ago

Hi, thank you for releasing the code for EMANet. I found a difference between the code and the paper. The difference lies in the formulation of Equation 13 (in the paper). In the paper, the M step (bases reconstruction) is formulated as:

$$\mu_k = \frac{z_{nk}\, x_n}{\sum_{m=1}^{N} z_{mk}}$$

However, in the code, the M step is implemented as `mu = torch.bmm(x, z)`. Actually, `mu = torch.bmm(x, z)` is a weighted summation of X, whereas Equation 13 (in the paper) is not a weighted summation of X. Is anything wrong in the paper?
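
For reference, here is how I read the M step in the code, as a minimal sketch rather than a verbatim excerpt; the shapes and the normalization of z over the pixel dimension are my assumptions from the repo:

```python
import torch
import torch.nn.functional as F

def m_step(x, z):
    """One M step: reconstruct the bases mu from features and responsibilities.

    x: (B, C, N) pixel features; column n is the feature vector x_n
    z: (B, N, K) responsibilities from the E step; z[:, n, k] = z_nk

    Computes mu_k = sum_n z_nk x_n / sum_m z_mk, i.e. Equation 13
    read as a weighted summation over all N pixels.
    """
    # Normalize each basis' responsibilities over the pixel dimension,
    # so the matrix product below becomes a weighted average of X.
    z = z / (1e-6 + z.sum(dim=1, keepdim=True))
    # Weighted summation of X: column k of mu is sum_n z_nk * x_n.
    mu = torch.bmm(x, z)  # (B, C, K)
    return mu

# Toy example: batch 2, 64 channels, 16x16 = 256 pixels, 8 bases.
x = torch.randn(2, 64, 256)
z = F.softmax(torch.randn(2, 256, 8), dim=2)
print(m_step(x, z).shape)  # torch.Size([2, 64, 8])
```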

zhouyuan888888 commented 4 years ago

In the paper, the M step is formulated as:

$$\mu_k = \frac{z_{nk}\, x_n}{\sum_{m=1}^{N} z_{mk}}$$

in which $z_{nk}$ denotes the value at the n-th pixel in the k-th channel of Z (the attention map), and $x_n$ denotes the feature vector of the n-th pixel. Why is $\mu_k$ obtained from the multiplication between $z_{nk}$ (a scalar) and $x_n$? Looking forward to your explanation. Thank you so much.

XiaLiPKU commented 4 years ago

> Hi, thank you for releasing the code for EMANet. I found a difference between the code and the paper. The difference lies in the formulation of Equation 13 (in the paper). In the paper, the M step (bases reconstruction) is formulated as $\mu_k = \frac{z_{nk}\, x_n}{\sum_{m=1}^{N} z_{mk}}$. However, in the code, the M step is implemented as `mu = torch.bmm(x, z)`. Actually, `mu = torch.bmm(x, z)` is a weighted summation of X, whereas Equation 13 (in the paper) is not a weighted summation of X. Is anything wrong in the paper?

To be honest, I forgot the \sum symbol in the numerator... By the time I noticed it, the deadline for the camera-ready version had already passed...
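
Spelled out, Equation 13 should have been the weighted summation that `mu = torch.bmm(x, z)` computes (after z is normalized over the pixel dimension):

```latex
% Eq. 13 as printed, with the \sum over n missing in the numerator:
\mu_k = \frac{z_{nk}\, x_n}{\sum_{m=1}^{N} z_{mk}}

% Eq. 13 as intended, matching mu = torch.bmm(x, z) in the code:
\mu_k = \frac{\sum_{n=1}^{N} z_{nk}\, x_n}{\sum_{m=1}^{N} z_{mk}}
```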

zhouyuan888888 commented 4 years ago

@XiaLiPKU hahaha... thank you for your explanation.