zhengchen1999 / CAT

PyTorch code for our NeurIPS 2022 paper "Cross Aggregation Transformer for Image Restoration"
Apache License 2.0

GMACs with the 256*256*3 input #5

Closed joshyZhou closed 1 year ago

joshyZhou commented 1 year ago

Hi, thanks for your great work! Can you share the GMACs with a 256x256x3 input? We would like to make a fair comparison.

zhengchen1999 commented 1 year ago

Hi.

For image SR (x4) with a 256x256x3 input, the FLOPs are:
CAT-A: 1768.82 G
CAT-R: 1170.86 G

If you have any other problem, please let us know. Thanks.
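For context on where such numbers come from: FLOPs counters typically accumulate per-layer costs, with convolutions dominating. A minimal sketch of the standard per-convolution count (not the authors' actual measurement script), assuming the common convention that one multiply-accumulate (MAC) counts as 2 FLOPs:

```python
def conv2d_macs(kernel_size, in_ch, out_ch, out_h, out_w):
    """Multiply-accumulate ops for one Conv2d layer (bias and stride-1 assumed, padding ignored)."""
    return kernel_size * kernel_size * in_ch * out_ch * out_h * out_w

def conv2d_flops(kernel_size, in_ch, out_ch, out_h, out_w):
    """FLOPs, counting each MAC as 2 floating-point operations."""
    return 2 * conv2d_macs(kernel_size, in_ch, out_ch, out_h, out_w)

# Example: a 3x3 conv lifting a 256x256 RGB input to 64 channels
macs = conv2d_macs(3, 3, 64, 256, 256)
flops = conv2d_flops(3, 3, 64, 256, 256)
print(f"{macs / 1e9:.3f} GMACs, {flops / 1e9:.3f} GFLOPs")
# → 0.113 GMACs, 0.226 GFLOPs
```

Summing this over every layer in the network (plus attention and elementwise ops) yields totals like the ones above.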

joshyZhou commented 1 year ago

Thanks for your reply!

joshyZhou commented 1 year ago

Sorry to bother you again. We are still wondering whether the FLOPs are the same with a 256x256x3 input for the image denoising task (e.g., SIDD). By the way, does 1170.86 G FLOPs correspond to 585 G MACs here?

zhengchen1999 commented 1 year ago

For the image denoising task, we adopt the Restormer architecture (for more details, please refer to our paper), and the FLOPs are 289.88G. If you instead use the CAT architecture proposed in our paper (which we did not experiment with for denoising), the FLOPs are 1712.22G.

Also, 2 FLOPs = 1 MAC.
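Applying this 2-FLOPs-per-MAC convention to the numbers quoted in this thread (a quick sanity check, not from the repo itself):

```python
def flops_to_macs(gflops):
    """Convert GFLOPs to GMACs under the convention 2 FLOPs = 1 MAC."""
    return gflops / 2.0

print(flops_to_macs(1170.86))  # CAT-R (SR x4): 585.43 GMACs
print(flops_to_macs(1768.82))  # CAT-A (SR x4): 884.41 GMACs
print(flops_to_macs(289.88))   # denoising network: 144.94 GMACs
```

So 1170.86 GFLOPs is about 585 GMACs, confirming the questioner's arithmetic.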

joshyZhou commented 1 year ago

Thanks for your reply! Yet, I'm somewhat confused by "we adopt the Restormer architecture, and the FLOPs are 289.88G." We know that your work is built on Restormer, but does that mean your work shares the same FLOPs as Restormer?

zhengchen1999 commented 1 year ago

By "we adopt the Restormer architecture" we mean that we employ a 4-level encoder-decoder, following Restormer. Specifically, we replace all Transformer blocks in Restormer with our proposed CATB and build a new network for the image denoising task; the implementation details are provided in our main paper. "FLOPs are 289.88G" refers to the computational complexity of this new network, not Restormer's. We adopt the encoder-decoder architecture used by Restormer to make a fair comparison with Restormer on the real denoising task, and we train our model with exactly the same training settings as Restormer. Moreover, the core of our work is Rwin-SA and CATB, not the RCAN backbone.

joshyZhou commented 1 year ago

Thanks! It's now clear to me.