JCruan519 / VM-UNet

(ARXIV24) This is the official code repository for "VM-UNet: Vision Mamba UNet for Medical Image Segmentation".
Apache License 2.0
393 stars, 12 forks

Confusion about the different architectures of VSSBlock in the code and in the paper's figure #56

Open Allenem opened 1 month ago

Allenem commented 1 month ago

https://github.com/JCruan519/VM-UNet/blob/acbafd5ad5b275a0ddee5a449e2adeeb2243d6a4/models/vmunet/vmamba.py#L492

Hi, thank you very much for your excellent work!

I would like to know why the forward function of class VSSBlock only includes

1 Layer Norm, SS2D, drop & addition;

rather than what is illustrated in Fig. 1(b) of the paper:

2 Layer Norms, 3 Linear layers, 1 DW-Conv, 2 activations, SS2D, addition & 1 element-wise product
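
To make the difference concrete, here is a minimal PyTorch sketch of the two layouts as I understand them. It is not the repository's code. The class names (`PlaceholderSS2D`, `CodeStyleBlock`, `FigureStyleBlock`), the `expand` factor, the 3x3 kernel, the choice of SiLU, the use of `nn.Dropout` for "drop", and the exact ordering of layers in the figure-style block are all my own assumptions for illustration, and the selective-scan core is replaced by an identity stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PlaceholderSS2D(nn.Module):
    """Hypothetical stand-in for the selective-scan core (identity, NOT the real SS2D)."""

    def forward(self, x):
        return x


class CodeStyleBlock(nn.Module):
    """Layout as the forward function reads: LayerNorm -> SS2D -> drop -> addition."""

    def __init__(self, dim, drop=0.0):
        super().__init__()
        self.ln_1 = nn.LayerNorm(dim)
        self.ss2d = PlaceholderSS2D()
        self.drop = nn.Dropout(drop)  # assumption: stands in for the "drop" step

    def forward(self, x):  # x: (B, H, W, C)
        return x + self.drop(self.ss2d(self.ln_1(x)))


class FigureStyleBlock(nn.Module):
    """Layout as counted from Fig. 1(b): 2 LayerNorm, 3 Linear, 1 DW-Conv,
    2 activations, SS2D, 1 element-wise product, addition.
    The ordering below is an assumed reading of the figure."""

    def __init__(self, dim, expand=2):
        super().__init__()
        inner = dim * expand  # assumed expansion factor
        self.norm_in = nn.LayerNorm(dim)
        self.linear_main = nn.Linear(dim, inner)
        self.linear_gate = nn.Linear(dim, inner)
        # groups=inner makes this a depthwise convolution
        self.dwconv = nn.Conv2d(inner, inner, kernel_size=3, padding=1, groups=inner)
        self.ss2d = PlaceholderSS2D()
        self.norm_out = nn.LayerNorm(inner)
        self.linear_out = nn.Linear(inner, dim)

    def forward(self, x):  # x: (B, H, W, C)
        h = self.norm_in(x)
        # main branch: Linear -> DW-Conv -> activation -> SS2D -> LayerNorm
        main = self.linear_main(h).permute(0, 3, 1, 2)  # to (B, C, H, W) for Conv2d
        main = F.silu(self.dwconv(main)).permute(0, 2, 3, 1)  # back to (B, H, W, C)
        main = self.norm_out(self.ss2d(main))
        # gate branch: Linear -> activation
        gate = F.silu(self.linear_gate(h))
        # element-wise product, output Linear, residual addition
        return x + self.linear_out(main * gate)
```

Both blocks map a `(B, H, W, C)` tensor to the same shape, so the question is only about where the extra layers of the figure-style layout are implemented.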