speedinghzl / CCNet

CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).
MIT License

A Question about FLOPs #2

Closed eugenelawrence closed 5 years ago

eugenelawrence commented 5 years ago

Nice work! Can you provide the code or details on how to calculate the FLOPs of CCNet?

speedinghzl commented 5 years ago

@eugenelawrence Thanks for your attention. In the paper, we calculate the floating-point operations (FLOPs) of the RCCA module rather than the whole network. We follow this paper to calculate FLOPs.

[screenshot: FLOPs formula from the referenced paper]

The input size of CCNet is 1x3x769x769, and the input feature map of RCCA has shape 1x512x97x97 (after channel reduction).

- Three 1x1 convolutions: 2x97x97x(512+1)x64 + 2x97x97x(512+1)x64 + 2x97x97x(512+1)x512
- Affinity: 2x97x97x(97+97)x64
- Aggregation: 2x97x97x(97+97)x512

Besides these, the softmax and element-wise add are lightweight operations. In total, the FLOPs come to about 8 x 10^9.
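The arithmetic above can be reproduced with a short script. This is a reader's sketch, not the authors' code; it assumes the convention that one multiply-add counts as 2 FLOPs and that a 1x1 convolution with bias over an HxW map costs 2 x H x W x (C_in + 1) x C_out, which matches the numbers in the comment.

```python
# Sketch reproducing the RCCA FLOP count from the comment above
# (a reader's reconstruction, not the authors' script).
H = W = 97                        # spatial size of the RCCA input feature map
C_in, C_mid, C_out = 512, 64, 512 # channels: input, query/key, value

def conv1x1_flops(h, w, c_in, c_out):
    """FLOPs of a 1x1 convolution with bias; one multiply-add = 2 FLOPs."""
    return 2 * h * w * (c_in + 1) * c_out

# Three 1x1 convolutions producing query, key, and value maps.
convs = (conv1x1_flops(H, W, C_in, C_mid)    # query
         + conv1x1_flops(H, W, C_in, C_mid)  # key
         + conv1x1_flops(H, W, C_in, C_out)) # value

# Affinity/aggregation: each position attends to positions on its
# criss-cross path; the comment approximates the path length as H + W.
affinity    = 2 * H * W * (H + W) * C_mid
aggregation = 2 * H * W * (H + W) * C_out

total = convs + affinity + aggregation
print(f"RCCA FLOPs ~= {total / 1e9:.2f} GFLOPs")  # ~8.28 GFLOPs
```

Summing the three terms gives roughly 8.28 x 10^9, consistent with the "about 8 x 10^9" figure stated above.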

zhangpj commented 3 years ago

@speedinghzl Hi, thank you for sharing this project. In your paper, you also compare the FLOPs of CCNet with the non-local block and report 108G FLOPs for non-local. Can you tell me what input size you used for non-local in your experiment? Did you reduce the number of channels with the conv layers g, theta, or phi, or use subsampling tricks?

speedinghzl commented 3 years ago

@zhangpj The input size of non-local is also 1x512x97x97. To make a fair comparison, I did reduce the number of channels and did not use any subsampling trick, i.e., the same settings as RCCA.
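With those settings, the 108G figure can be checked by the same counting scheme as for RCCA: the non-local block computes a dense (HW) x (HW) affinity instead of a criss-cross one. The sketch below is a reader's back-of-the-envelope reconstruction; the channel assignments (64 channels for theta and phi, 512 for g) are an assumption inferred from the RCCA settings described in this thread, not confirmed by the authors' code.

```python
# Back-of-the-envelope check of the ~108G non-local FLOPs figure
# (a reader's reconstruction; channel choices below are assumptions).
H = W = 97
HW = H * W                         # 9409 positions
C_in, C_mid, C_out = 512, 64, 512  # assumed: theta/phi -> 64, g -> 512

def conv1x1_flops(h, w, c_in, c_out):
    """FLOPs of a 1x1 convolution with bias; one multiply-add = 2 FLOPs."""
    return 2 * h * w * (c_in + 1) * c_out

convs = (conv1x1_flops(H, W, C_in, C_mid) * 2  # theta and phi
         + conv1x1_flops(H, W, C_in, C_out))   # g

# Dense attention: every position attends to all HW positions.
affinity    = 2 * HW * HW * C_mid
aggregation = 2 * HW * HW * C_out

total = convs + affinity + aggregation
print(f"non-local FLOPs ~= {total / 1e9:.1f} GFLOPs")  # ~108.2 GFLOPs
```

Under these assumptions the total lands at about 108.2 x 10^9, matching the 108G reported in the paper; the gap versus RCCA's ~8G comes almost entirely from the dense (HW) x (HW) affinity and aggregation terms.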

zhangpj commented 3 years ago

@speedinghzl All right, thank you.