[Open] ChaohuanDeng123 opened this issue 3 months ago
@ChaohuanDeng123 Hello, our initialization method was inspired by Swin-UNet, which initializes both its encoder and decoder with pre-trained weights from Swin Transformer. Additionally, Table 3 in our VM-UNet paper reports results without pre-trained weights.
Hi, may I ask whether there are pre-trained weights for the decoder part of VSSM? I tried to find them in VMamba, but there VSSM is only used as the backbone, and a convolutional neural network is used for decoding. However, VMamba's VSSM source code does contain a Mamba-based decoder, and I do not know whether pre-trained weights exist for that part.
@CYYJL Hello, VMamba provides pre-trained weights only for the encoder. However, in VM-UNet we initialize both the encoder and the decoder from VMamba's pre-trained weights, following an approach similar to Swin-UNet.
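To make the idea concrete, here is a minimal sketch of how encoder-only pre-trained weights can be mirrored onto a symmetric decoder by remapping state-dict keys. The key names (`layers.*` for encoder stages, `layers_up.*` for decoder stages) and the stage-mirroring scheme are illustrative assumptions, not VM-UNet's actual layout:

```python
# Sketch (assumed key layout, not VM-UNet's real state_dict): copy encoder
# weights and mirror encoder stage i onto decoder stage (3 - i), the way a
# Swin-UNet-style init reuses one pretrained checkpoint for both halves.

def remap_pretrained(pretrained: dict) -> dict:
    """Build a model state dict with encoder weights duplicated to the decoder."""
    model_state = {}
    for key, value in pretrained.items():
        if key.startswith("layers."):
            model_state[key] = value  # encoder keeps its pretrained weight
            # hypothetical mirroring: encoder stage i -> decoder stage 3 - i
            parts = key.split(".")
            stage = int(parts[1])
            dec_key = ".".join(["layers_up", str(3 - stage)] + parts[2:])
            model_state[dec_key] = value
    return model_state

# toy checkpoint; lists stand in for real tensors
ckpt = {
    "layers.0.blocks.0.weight": [0.1],
    "layers.3.blocks.0.weight": [0.9],
}
state = remap_pretrained(ckpt)
```

In a real framework you would then load `state` non-strictly (e.g. PyTorch's `load_state_dict(state, strict=False)`) so that shape-mismatched or missing decoder parameters fall back to random initialization.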
Hello, I would like to ask why the VSS blocks in the decoder stage also need pre-trained weights. As far as I know, pre-trained weights are used in the encoder stage to extract generic features. Additionally, have there been attempts to train the entire model without pre-trained weights?