JCruan519 / VM-UNet

(ARXIV24) This is the official code repository for "VM-UNet: Vision Mamba UNet for Medical Image Segmentation".
Apache License 2.0

pre-train question #31

Open ChaohuanDeng123 opened 3 months ago

ChaohuanDeng123 commented 3 months ago

Hello, I would like to ask why the VSS blocks in the decoder stage also need pre-trained weights. As far as I know, pre-trained weights are typically used in the encoder stage to extract generic features. Additionally, have there been any attempts to train the entire model without pre-trained weights?

JCruan519 commented 3 months ago

@ChaohuanDeng123 Hello, our initialization method was inspired by Swin-UNet, which initializes both its encoder and decoder with pre-trained weights from the Swin Transformer. Additionally, Table 3 in our VM-UNet paper reports results without using pre-trained weights.

CYYJL commented 2 months ago

Hi, may I ask if there are any pre-trained weights for the decoder part of VSSM? I tried to find them in VMamba, but there the model is only used as the backbone, and a convolutional neural network is used for decoding. However, VMamba's VSSM source code does contain the Mamba decoding part, so I am not sure whether pre-trained weights exist for the decoder.

JCruan519 commented 1 month ago

@CYYJL Hello, VMamba provides pre-trained weights only for the encoder. However, for VM-UNet, we initialize both the encoder and decoder using VMamba's pre-trained weights, following an approach similar to Swin-UNet.
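
For reference, below is a minimal sketch (not the repository's actual loading code) of how encoder-only VMamba weights could be copied into both the encoder and decoder VSS layers of a VM-UNet-style model. The checkpoint path, the `layers.`/`layers_up.` key prefixes, and the exact key mapping are assumptions for illustration; the real names in the repo may differ.

```python
import torch


def init_from_vmamba(model, ckpt_path="vmamba_tiny.ckpt"):
    """Initialize encoder and decoder VSS blocks from a VMamba checkpoint (sketch)."""
    ckpt = torch.load(ckpt_path, map_location="cpu")
    pretrained = ckpt.get("state_dict", ckpt)  # some checkpoints wrap weights in "state_dict"
    model_dict = model.state_dict()

    matched = {}
    for name, weight in pretrained.items():
        # 1) Direct match: encoder layers that share names and shapes with VMamba's backbone.
        if name in model_dict and model_dict[name].shape == weight.shape:
            matched[name] = weight
        # 2) Mirrored match: reuse the same stage weights for the decoder
        #    (hypothetical "layers_up." naming; the actual mapping may differ).
        mirrored = name.replace("layers.", "layers_up.")
        if mirrored in model_dict and model_dict[mirrored].shape == weight.shape:
            matched[mirrored] = weight

    model_dict.update(matched)
    model.load_state_dict(model_dict)
    print(f"Initialized {len(matched)} tensors from the VMamba checkpoint.")
    return model
```

Any decoder tensors with no shape-compatible counterpart in the checkpoint (e.g. upsampling/expand layers) would simply keep their random initialization under this scheme.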