HuCaoFighting / Swin-Unet

[ECCVW 2022] The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"

Video prediction with Swin-Unet #104

Open 740402059 opened 7 months ago

740402059 commented 7 months ago

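Can the Swin-Unet model be modified at the output end and used for a video prediction task? I tested it on my own dataset, and the final results show the same problem reported in earlier issues: they are not smooth. I tried adding convolutions at the input and output ends, but the problem remains.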

HuCaoFighting commented 7 months ago

Yes, if you change the output end it can be used for video-related tasks. As for the smoothness problem, could you post a concrete image?

740402059 commented 7 months ago

patch_nonoverlap

Thank you very much for your response. Here is a highly representative prediction result: the first row shows the actual measurements and the second row shows the predictions. The predictions have pronounced blocking artifacts. The model I used is a modification of your Swin-Unet.

740402059 commented 7 months ago

Test_Pangu_0

I also adjusted the patch merging step so that it changes the dimensions using only linear transformations. In addition, I added 3x3 convolutions and deconvolutions at the input and output, which slightly alleviated the issue, but the blockiness persists.

740402059 commented 7 months ago

![Uploading Test_SimVP_Swin_0.png…]()

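Another point, and the most important one, is that I have very few data samples: after splitting the data into video clips, there are only a little over 1,400 of them. Could the small sample size be why the transformer predicts so poorly? The results I get from a CNN model are very good and very clear.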

HuCaoFighting commented 7 months ago

Did you initialize the model with pretrained weights?

740402059 commented 7 months ago

No, I trained everything from scratch on my own data.

HuCaoFighting commented 7 months ago

Pure transformer methods depend fairly heavily on pretrained weights. You could load them and then train again to see whether it helps.
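When the input and output channels differ from the checkpoint, one common way to load pretrained weights is to keep only the parameters whose names and shapes still match and leave the rest randomly initialized. The snippet below is a minimal sketch of that filtering step, not this repository's own loading code. The parameter names and shapes are hypothetical, and `SimpleNamespace` objects stand in for tensors so the example runs without any deep learning library.

```python
from types import SimpleNamespace


def select_compatible_weights(pretrained, model_state):
    """Split pretrained entries into those that fit the model and those that do not.

    Both arguments map parameter names to objects with a `.shape` attribute.
    An entry is kept only if the model has a parameter of the same name and shape.
    """
    compatible, skipped = {}, []
    for name, value in pretrained.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            compatible[name] = value
        else:
            skipped.append(name)
    return compatible, skipped


# Hypothetical names and shapes: a checkpoint trained on 3-channel images,
# and a modified model that takes 24 input channels and predicts 24 outputs.
pretrained = {
    "patch_embed.proj.weight": SimpleNamespace(shape=(96, 3, 4, 4)),
    "layers.0.blocks.0.attn.qkv.weight": SimpleNamespace(shape=(288, 96)),
    "output.weight": SimpleNamespace(shape=(9, 96, 1, 1)),
}
model_state = {
    "patch_embed.proj.weight": SimpleNamespace(shape=(96, 24, 4, 4)),
    "layers.0.blocks.0.attn.qkv.weight": SimpleNamespace(shape=(288, 96)),
    "output.weight": SimpleNamespace(shape=(24, 96, 1, 1)),
}

compatible, skipped = select_compatible_weights(pretrained, model_state)
print("loaded:", list(compatible))
print("skipped:", skipped)
```

With PyTorch, the same function should work on real state dicts, and the filtered result could be passed to `model.load_state_dict(compatible, strict=False)`. Printing the skipped names is a quick check that only the first and last layers were left out and the transformer blocks really did receive pretrained weights.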

740402059 commented 7 months ago

OK, I'll give it a try.

740402059 commented 7 months ago

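I merged the variable dimension and the time dimension into a single dimension, and likewise set the number of classes in the output layer to the merged variable-and-time dimension; the rest of the Swin-Unet structure is unchanged. The input is also cropped to 224, and apart from the input and output channels the parameters are kept the same. These are my results before and after loading the pretrained weights. Without pretraining: ![Uploading Test_ST-Unet_不加预训练0.png…]() With pretraining: ![Uploading Test_ST-Unet_增加预训练0.png…]() The unsmooth blocking artifacts are still there. My own training set is very small, only a little over 1,400 samples.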
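For readers following along, merging the time and variable dimensions into one channel dimension can be sketched as below. This is an illustration only, not the code used for the results above: the function names are hypothetical, the time-major channel order (channel = t * V + v) is an assumption, and plain nested lists stand in for tensors.

```python
def merge_time_and_variables(clip):
    """Flatten a clip indexed as clip[t][v] (each entry an H x W frame)
    into a flat list of T * V channels, ordered time-major: channel = t * V + v."""
    return [frame for step in clip for frame in step]


def split_time_and_variables(channels, num_variables):
    """Inverse of merge_time_and_variables: regroup flat channels into [t][v]."""
    if len(channels) % num_variables != 0:
        raise ValueError("channel count must be a multiple of num_variables")
    return [
        channels[start:start + num_variables]
        for start in range(0, len(channels), num_variables)
    ]


# Hypothetical toy clip: T = 2 time steps, V = 3 variables, 1 x 2 frames.
# Each frame is filled with the value 10 * t + v so the ordering is easy to check.
clip = [[[[10 * t + v, 10 * t + v]] for v in range(3)] for t in range(2)]

merged = merge_time_and_variables(clip)          # 6 channels for the model input
restored = split_time_and_variables(merged, 3)   # back to [t][v] after prediction
```

Whatever ordering is chosen, the output has to be split with the same ordering that was used to merge the input; otherwise variables and time steps get mixed up when the prediction is reshaped back into a video.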

fenghuohuo2001 commented 6 months ago

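I see the same unsmooth results. The problem appears both with and without pretrained weights, and the results are not as good as with a CNN.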