Video prediction with Swin-Unet

HuCaoFighting / Swin-Unet

[ECCVW 2022] The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"


Video prediction with Swin-Unet #104

Open 740402059 opened 7 months ago

740402059 commented 7 months ago

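Can the Swin-Unet model be modified at the output end for video prediction? I tested it on my own dataset, and the final results have the same problem as in the earlier issue: they are not smooth. I tried adding convolutions at the input and output ends, but the problem remains.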

HuCaoFighting commented 7 months ago

Yes, by changing the output end you can use it for video-related tasks. As for the smoothness problem, could you post some example images?

740402059 commented 7 months ago

patch_nonoverlap

Thank you very much for your response. Here is a highly representative prediction result: the first row shows the actual measurements and the second row shows the predictions. The predictions show pronounced blocking artifacts. The model I used is a modification of your Swin-Unet.

740402059 commented 7 months ago

Test_Pangu_0

In addition, I adjusted the patch merging step so that it only applies a linear transformation to the dimensions. I also added 3x3 convolutions and deconvolutions at the input and output, which slightly alleviated the issue, but the blockiness persists.
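One possible contributor to the blockiness can be sketched in a few lines, under the assumption that the decoder ends with a non-overlapping 4x patch expansion (a linear layer followed by a rearrangement of channels into a 4x4 pixel block). The function name `patch_expand_x4`, the shapes, and the random weights below are illustrative only, not code from the repository. The sketch shows that each 4x4 output block is produced from a single token, so nothing at this step ties neighboring blocks together.

```python
import numpy as np

def patch_expand_x4(tokens, weight, p=4):
    """Non-overlapping patch expansion (illustrative, not the repo's code).

    tokens: (h, w, d) grid of token features
    weight: (d, p*p*c) linear projection
    returns: (h*p, w*p, c) image-like output
    """
    h, w, d = tokens.shape
    c = weight.shape[1] // (p * p)
    x = tokens @ weight                 # (h, w, p*p*c): each token is projected on its own
    x = x.reshape(h, w, p, p, c)        # split channels into a p x p pixel block
    x = x.transpose(0, 2, 1, 3, 4)      # (h, p, w, p, c)
    return x.reshape(h * p, w * p, c)   # tile the blocks into the output grid

rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 3, 8))
weight = rng.normal(size=(8, 4 * 4 * 2))
base = patch_expand_x4(tokens, weight)

# Perturb a single token and see which output pixels change.
perturbed = tokens.copy()
perturbed[1, 2] += 1.0
diff = np.abs(patch_expand_x4(perturbed, weight) - base).sum(axis=-1)
changed = diff > 1e-12
print(np.argwhere(changed).min(axis=0), np.argwhere(changed).max(axis=0))
```

Only the 4x4 block belonging to the perturbed token changes. If this is what happens in the modified model, smoothness across block borders has to come from the earlier layers or from an extra overlapping operation, such as a convolution applied after the expansion.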

740402059 commented 7 months ago

![Uploading Test_SimVP_Swin_0.png…]()

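Another point, and the most important one: I have very little data. After splitting it into video clips there are only a little over 1,400 samples. Could the small sample size be why the transformer predicts so poorly? The predictions from my CNN model are good and sharp.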

HuCaoFighting commented 7 months ago

Did you initialize the model with pretrained weights?

740402059 commented 7 months ago

No, I trained everything from scratch on my own data.

HuCaoFighting commented 7 months ago

Pure transformer methods rely heavily on pretrained weights. Try loading them and then training again.
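When the input or output channels differ from those of the pretrained checkpoint, some tensors will no longer match in shape. A minimal sketch of one common workaround is to keep only the entries whose name and shape match the new model. The helper `select_compatible`, the key names, and the shapes below are hypothetical, and NumPy arrays stand in for real tensors; this is not the repository's loading code.

```python
import numpy as np

def select_compatible(pretrained, target):
    """Keep pretrained entries whose name and shape both match the target model.

    pretrained, target: dicts mapping parameter names to array-like objects
    that expose a `.shape` attribute.
    Returns (compatible_dict, skipped_names).
    """
    compatible, skipped = {}, []
    for name, tensor in pretrained.items():
        if name in target and tuple(tensor.shape) == tuple(target[name].shape):
            compatible[name] = tensor
        else:
            skipped.append(name)
    return compatible, skipped

# Illustrative stand-ins: a checkpoint trained on 3-channel images and a
# model whose first layer was changed to take 12 input channels.
pretrained = {
    "patch_embed.proj.weight": np.zeros((96, 3, 4, 4)),
    "layers.0.blocks.0.attn.qkv.weight": np.zeros((288, 96)),
    "head.weight": np.zeros((1000, 768)),
}
target = {
    "patch_embed.proj.weight": np.zeros((96, 12, 4, 4)),
    "layers.0.blocks.0.attn.qkv.weight": np.zeros((288, 96)),
    "output.weight": np.zeros((12, 96, 1, 1)),
}

compatible, skipped = select_compatible(pretrained, target)
print("loaded:", sorted(compatible))
print("skipped:", sorted(skipped))
```

With PyTorch, the filtered dictionary could then be passed to `model.load_state_dict(compatible, strict=False)`, which leaves the skipped layers at their fresh initialization. It is worth printing the skipped names to confirm that most of the backbone was actually loaded.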

740402059 commented 7 months ago

OK, I'll give it a try.

740402059 commented 7 months ago

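I merged the variable dimension and the time dimension into a single dimension, and changed the number of output classes to variables × time steps as well; the rest of the Swin-Unet structure is unchanged. The input is also cropped to 224, and apart from the input and output channels the parameters are kept the same. These are my results before and after loading the pretrained weights.

Without pretraining: ![Uploading Test_ST-Unet_不加预训练0.png…]()

With pretraining: ![Uploading Test_ST-Unet_增加预训练0.png…]()

The non-smooth blocking is still there. That may be because my training set is small: only a little over 1,400 samples.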
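The merge of the time and variable dimensions described above can be sketched as a plain reshape. This assumes a clip layout of (batch, time, variables, height, width), which the thread does not state; the helper names and sizes are hypothetical.

```python
import numpy as np

def merge_time_into_channels(clip):
    """(B, T, C, H, W) -> (B, T*C, H, W): frames are stacked along channels."""
    b, t, c, h, w = clip.shape
    return clip.reshape(b, t * c, h, w)

def split_channels_into_time(x, num_frames):
    """(B, T*C, H, W) -> (B, T, C, H, W): inverse of the merge above."""
    b, tc, h, w = x.shape
    if tc % num_frames != 0:
        raise ValueError("channel count must be divisible by num_frames")
    return x.reshape(b, num_frames, tc // num_frames, h, w)

rng = np.random.default_rng(0)
clip = rng.normal(size=(2, 4, 3, 224, 224)).astype(np.float32)  # 4 frames, 3 variables

merged = merge_time_into_channels(clip)         # input to a 2D model with 12 channels
restored = split_channels_into_time(merged, 4)  # e.g. applied to a 12-channel output

print(merged.shape, restored.shape)
```

With this layout, channel index `t * C + c` of the merged tensor holds variable `c` of frame `t`, so a 2D model sees all frames at once but has no built-in notion of temporal order beyond the channel position.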

fenghuohuo2001 commented 6 months ago

I see the same non-smooth results, both with and without pretrained weights, and the results are not as good as with a CNN.