YubiaoYue / MedMamba

This is the official code repository for "MedMamba: Vision Mamba for Medical Image Classification"

Why use `channel_shuffle` instead of `self.finalconv11` when fusing the outputs of the two branches? #11

Open KyotoSakura opened 4 months ago

KyotoSakura commented 4 months ago

Hi, the Conv-SSM Block you draw in the paper includes a Conv 2D 1x1 block when merging the outputs of the two branches:

[figure: Conv-SSM Block diagram from the paper]

But in the code of MedMamba.py I see:

```python
def forward(self, input: torch.Tensor):
    input_left, input_right = input.chunk(2, dim=-1)
    x = self.drop_path(self.self_attention(self.ln_1(input_right)))
    input_left = input_left.permute(0, 3, 1, 2).contiguous()
    input_left = self.conv33conv33conv11(input_left)
    input_left = input_left.permute(0, 2, 3, 1).contiguous()
    output = torch.cat((input_left, x), dim=-1)
    output = channel_shuffle(output, groups=2)
    return output + input
```

You use `channel_shuffle` instead of

```python
self.finalconv11 = nn.Conv2d(in_channels=hidden_dim, out_channels=hidden_dim, kernel_size=1, stride=1)
```

Could you please explain that choice?
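For context, the fusion step as drawn in the paper's figure would presumably look like the following sketch: concatenate the two branch outputs along the channel dimension, then mix channels with a 1x1 convolution. The names `hidden_dim` and `finalconv11` follow the code quoted above; the tensor shapes here are illustrative assumptions, not taken from MedMamba.py.

```python
import torch
import torch.nn as nn

hidden_dim = 8  # illustrative channel count, not the paper's value
# 1x1 conv that mixes information across all channels (figure's fusion step)
finalconv11 = nn.Conv2d(in_channels=hidden_dim, out_channels=hidden_dim,
                        kernel_size=1, stride=1)

left = torch.randn(1, hidden_dim // 2, 16, 16)   # conv branch output (B, C/2, H, W)
right = torch.randn(1, hidden_dim // 2, 16, 16)  # SSM branch output (B, C/2, H, W)

# Concatenate the two halves back to C channels, then fuse with the 1x1 conv
fused = finalconv11(torch.cat((left, right), dim=1))  # shape (B, C, H, W)
```

Unlike `channel_shuffle`, this fusion has learnable weights and costs `hidden_dim * hidden_dim` multiply-accumulates per pixel.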

YubiaoYue commented 4 months ago

Hello, please review the latest version of the paper and network structure. MedMamba uses a shuffling operation for channel-information fusion.
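For reference, a channel shuffle over a channels-last tensor (matching the `dim=-1` concatenation in the `forward` quoted above) can be sketched as below. This is an illustrative, parameter-free reimplementation in the ShuffleNet style, not necessarily the exact code in MedMamba.py: it interleaves the channels of the two groups so that information from the conv branch and the SSM branch is mixed without any learnable weights.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # Assumes channels-last layout (B, H, W, C), since the two branches
    # were concatenated along dim=-1.
    b, h, w, c = x.shape
    assert c % groups == 0
    # Split the channel axis into (groups, channels_per_group) ...
    x = x.view(b, h, w, groups, c // groups)
    # ... swap the two sub-axes to interleave channels across groups ...
    x = x.transpose(3, 4).contiguous()
    # ... and flatten back to C channels.
    return x.view(b, h, w, c)

# With groups=2, channels [0, 1, 2, 3] become [0, 2, 1, 3]:
x = torch.arange(4).float().view(1, 1, 1, 4)
shuffled = channel_shuffle(x, groups=2)
print(shuffled.flatten().tolist())  # [0.0, 2.0, 1.0, 3.0]
```

The design trade-off in the question is then clear: the shuffle fuses the two branches with zero parameters and zero FLOPs beyond a memory reshuffle, whereas a 1x1 conv would add learnable cross-channel mixing at extra cost.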