BangguWu / ECANet

Code for ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
MIT License

Applying ECA on 3D inputs? #30

Open shakjm opened 4 years ago

shakjm commented 4 years ago

Hi, I was wondering if this could be applied to models dealing with 3D inputs. Would the code written below be correct? I'm not sure why the original code squeezes the width dimension out. For a 3D input with X, Y, and Z dimensions, which ones should be squeezed out? Should it be both X and Z?

 def forward(self, x):
        # x: input features with shape [b, c, z, h, w]
        b, c, z, h, w = x.size()

        # feature descriptor on the global spatial information
        y = self.avg_pool(x)

        # Two different branches of ECA module
        y = self.conv(y.squeeze(-1).squeeze(-2).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-2).unsqueeze(-1)

        # Multi-scale information fusion
        y = self.sigmoid(y)

        return x * y.expand_as(x)
slighting666 commented 4 years ago

@shakjm I've met the same problem. Did you manage to solve it?

shakjm commented 4 years ago

> @shakjm I've met the same problem. Did you manage to solve it?

Hello, I managed to run it with the code I've edited. It seems that the author doesn't monitor this page at all. I have tested this with my deep learning application, but my Squeeze-and-Excitation network still outperforms this method, regardless of the slight changes I've made to it.

slighting666 commented 4 years ago

> > @shakjm I've met the same problem. Did you manage to solve it?
>
> Hello, I managed to run it with the code I've edited. It seems that the author doesn't monitor this page at all. I have tested this with my deep learning application, but my Squeeze-and-Excitation network still outperforms this method, regardless of the slight changes I've made to it.

I understand that for a 2D input, the B×C×H×W tensor is pooled to B×C×1×1, squeezed to N×C×1, and then transposed to N×1×C for the 1D convolution. For a 3D input of shape B×C×D×H×W, is it likewise pooled to B×C×1×1×1, squeezed to N×C×1, and transposed to N×1×C for the 1D convolution?

shakjm commented 4 years ago

Yes, you are right, it will be B×C×1×1×1. The whole operation should focus on the channel dimension (that is why they did the swap). This is only based on my understanding, so what you have explained tallies with mine.

Also, you may refer to this closed issue to understand it better:

https://github.com/BangguWu/ECANet/issues/7
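
To make the shape flow concrete, here is a minimal walk-through with dummy tensors (a sketch of my understanding, not code from the repo):

import torch
from torch import nn

x = torch.randn(2, 16, 8, 8, 8)   # dummy 3D input: [b, c, z, h, w]
y = nn.AdaptiveAvgPool3d(1)(x)    # [2, 16, 1, 1, 1]: one global descriptor per channel
y = y.squeeze(-1).squeeze(-2)     # [2, 16, 1]: two singleton spatial dims removed
y = y.transpose(-1, -2)           # [2, 1, 16]: channels become the 1D sequence axis
y = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)(y)  # slides across channels
y = y.transpose(-1, -2).unsqueeze(-2).unsqueeze(-1)           # back to [2, 16, 1, 1, 1]
print(y.shape)                    # torch.Size([2, 16, 1, 1, 1])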

slighting666 commented 4 years ago

> Yes, you are right, it will be B×C×1×1×1. The whole operation should focus on the channel dimension (that is why they did the swap). This is only based on my understanding, so what you have explained tallies with mine.
>
> Also, you may refer to this closed issue to understand it better:
>
> #7

So do experiments based on this idea show any benefit, or is it not as effective as SENet?

shakjm commented 4 years ago

> > Yes, you are right, it will be B×C×1×1×1. The whole operation should focus on the channel dimension (that is why they did the swap). This is only based on my understanding, so what you have explained tallies with mine. Also, you may refer to this closed issue to understand it better:
> >
> > #7
>
> So do experiments based on this idea show any benefit, or is it not as effective as SENet?

My experiments use 10-fold cross-validation, and I've only tried it on one fold. My model uses only a small SE block, and ECA performs very closely to my modified SE block: it came in about 0.1% lower in sensitivity than the SE block. So the idea introduced here did not help my work; you'll have to try it with your application to see if it helps.

slighting666 commented 4 years ago

> > > Yes, you are right, it will be B×C×1×1×1. The whole operation should focus on the channel dimension (that is why they did the swap). This is only based on my understanding, so what you have explained tallies with mine. Also, you may refer to this closed issue to understand it better:
> > >
> > > #7
> >
> > So do experiments based on this idea show any benefit, or is it not as effective as SENet?
>
> My experiments use 10-fold cross-validation, and I've only tried it on one fold. My model uses only a small SE block, and ECA performs very closely to my modified SE block: it came in about 0.1% lower in sensitivity than the SE block. So the idea introduced here did not help my work; you'll have to try it with your application to see if it helps.

OK, thanks!

yjsb commented 2 years ago

My code:

    y = self.conv(y.squeeze(-1).squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1).unsqueeze(-1)

Does it differ from yours?
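
For what it's worth, a quick check with dummy shapes (a sketch, not from the repo) suggests the two orderings are equivalent, since every squeezed dimension has size 1; the same holds for the unsqueeze order on the way back:

import torch

y = torch.randn(2, 16, 1, 1, 1)                  # pooled descriptor: all spatial dims are 1
a = y.squeeze(-1).squeeze(-2).transpose(-1, -2)  # ordering used earlier in this thread -> [2, 1, 16]
b = y.squeeze(-1).squeeze(-1).transpose(-1, -2)  # the ordering above                   -> [2, 1, 16]
print(torch.equal(a, b))                         # True: squeeze order does not matter here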

burhr2 commented 1 year ago

Here is the implementation I am using for 3D inputs; kindly correct me if I'm wrong:

from torch import nn
import math

class ECABlock(nn.Module):
    """
    ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
    https://doi.org/10.48550/arXiv.1910.03151
    https://github.com/BangguWu/ECANet
    """

    def __init__(self, n_channels, k_size=3, gamma=2, b=1):
        super(ECABlock, self).__init__()
        self.global_avg_pool = nn.AdaptiveAvgPool3d(1)

        # Dynamically compute k_size from the channel count (overrides the
        # k_size argument); see https://github.com/BangguWu/ECANet/issues/243
        t = int(abs((math.log(n_channels, 2) + b) / gamma))
        k_size = t if t % 2 else t + 1

        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # x: input features with shape [b, c, d, h, w]
        # feature descriptor on the global spatial information
        y = self.global_avg_pool(x)  # [b, c, 1, 1, 1]

        # 1D convolution across the channel dimension
        # https://github.com/BangguWu/ECANet/issues/30
        # https://github.com/BangguWu/ECANet/issues/7
        # (note: transpose(-1, -2) and transpose(-2, -1) are the same operation)
        y = self.conv(y.squeeze(-1).squeeze(-2).transpose(-2, -1)).transpose(-2, -1).unsqueeze(-2).unsqueeze(-1)

        # Multi-scale information fusion
        y = self.sigmoid(y)

        return x * y.expand_as(x)
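
A quick usage example (my addition, assuming the ECABlock above) on a dummy 5D volume:

import torch

block = ECABlock(n_channels=64)    # t = int(|(log2(64) + 1) / 2|) = int(3.5) = 3, odd, so k_size = 3
x = torch.randn(2, 64, 8, 16, 16)  # [batch, channels, depth, height, width]
out = block(x)
print(out.shape)                   # torch.Size([2, 64, 8, 16, 16]); same shape, channels reweighted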