chenqi1126 / SIPE

[CVPR 2022] Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation
MIT License

About the implementation of mask average pool. #5

Closed by lartpang 2 years ago

lartpang commented 2 years ago

https://github.com/chenqi1126/SIPE/blob/e0c6a3bc578a6bade2cf85fe34a1793dce71170d/network/resnet50_SIPE.py#L58

[Image: the mask average pooling formula from the paper]

The denominators in the code and in the formula are not the same. The code divides by the area of the entire feature map (h*w), while the formula divides by the area of the foreground.

chenqi1126 commented 2 years ago

Hi pang, `crop_feature` contains only the foreground features, since the unrelated parts have been eliminated by the seeds (see lines 56-57).

lartpang commented 2 years ago

@chenqi1126

Thank you for your reply.

https://github.com/chenqi1126/SIPE/blob/e0c6a3bc578a6bade2cf85fe34a1793dce71170d/network/resnet50_SIPE.py#L56-L58

Eliminating the background with the seeds only accounts for the numerator of the formula. The denominator is still computed differently. According to the formula, mask average pooling averages only over a specific region (R^i_k=1), whereas the current implementation averages over the entire feature map, which gives a larger denominator.
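To make the difference concrete, here is a minimal NumPy sketch (not the repository's PyTorch code) that computes a prototype both ways. The function names, tensor shapes, and the toy mask are assumptions chosen for illustration only.

```python
import numpy as np

def masked_avg_pool_formula(feature, mask):
    """Average over the foreground only (denominator = number of mask pixels).

    feature: (C, H, W) array, mask: (H, W) binary array with at least one 1.
    """
    masked = feature * mask                      # zero out non-seed positions
    return masked.sum(axis=(1, 2)) / mask.sum()  # divide by foreground area

def masked_avg_pool_full_map(feature, mask):
    """Average over the whole map (denominator = H * W)."""
    masked = feature * mask
    return masked.mean(axis=(1, 2))              # divide by H * W

rng = np.random.default_rng(0)
feature = rng.random((4, 6, 6))                  # hypothetical C=4, H=W=6
mask = np.zeros((6, 6))
mask[1:4, 2:5] = 1.0                             # 9 foreground pixels out of 36

p_formula = masked_avg_pool_formula(feature, mask)
p_full = masked_avg_pool_full_map(feature, mask)
ratio = mask.sum() / mask.size                   # foreground fraction, 9 / 36

# The two prototypes point in the same direction and differ only by this factor.
print(np.allclose(p_full, p_formula * ratio))
```

In this sketch the full-map average equals the foreground average multiplied by the foreground fraction, so the two results differ by one scalar per seed region.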

chenqi1126 commented 2 years ago

@lartpang

Thanks for your attention.

Yes, the formula and the implementation differ slightly. As you mentioned, the implementation uses a larger denominator and so produces a scaled value compared to the formula. However, this scaling does not affect the final result, because the value is normalized in the similarity calculation that follows (see the line linked below and the official PyTorch documentation). The current implementation also simplifies the calculation and avoids the division by zero that unstable seeds can cause in the early training stage.

https://github.com/chenqi1126/SIPE/blob/e0c6a3bc578a6bade2cf85fe34a1793dce71170d/network/resnet50_SIPE.py#L62
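The scale-invariance argument can be checked with a small NumPy sketch. It assumes the similarity in question is a cosine similarity with an `eps` clamp on the norm product; the helper below is written for illustration and is not the repository's code.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity with the norm product clamped to at least eps."""
    denom = max(np.linalg.norm(a) * np.linalg.norm(b), eps)
    return float(np.dot(a, b) / denom)

rng = np.random.default_rng(0)
pixel_feature = rng.random(8)    # hypothetical feature vector at one pixel
prototype = rng.random(8)        # prototype averaged over the foreground
scale = 0.25                     # e.g. foreground covers a quarter of the map

sim_formula = cosine_similarity(pixel_feature, prototype)
sim_scaled = cosine_similarity(pixel_feature, scale * prototype)

# A positive scalar on the prototype cancels out in the normalization.
print(abs(sim_formula - sim_scaled) < 1e-9)

# With an empty seed the full-map average is a zero vector rather than 0 / 0,
# and the clamped similarity is 0.0 instead of NaN.
empty_prototype = np.zeros(8)
sim_empty = cosine_similarity(pixel_feature, empty_prototype)
print(sim_empty)
```

Under these assumptions any positive scaling of the prototype leaves the similarity unchanged, and an empty seed yields a zero prototype rather than an undefined one.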

lartpang commented 2 years ago

Ok, got it, thanks!