Yanfeng-Zhou / XNet

[ICCV2023] XNet: Wavelet-Based Low and High Frequency Merging Networks for Semi- and Supervised Semantic Segmentation of Biomedical Images

About train GlaS Dataset #8

Open kirk0221 opened 8 months ago

kirk0221 commented 8 months ago

I ran into a problem while training XNet (great research, by the way). The GlaS dataset consists of 2D images; I preprocessed it with wavelet2D.py to obtain LF and HF images, each a tensor of shape [1,128,128]. After passing through the XNet model, do outputs_train1 and outputs_train2 each have shape [2,128,128]?

And when training on the GlaS dataset, is it correct for branch1 and branch2 of the XNet model to take 3 and 1 input channels, respectively?

# branch 1
self.b1_1_1 = nn.Sequential(
    conv3x3(in_channels, l1c),  # in_channels = 3
    conv3x3(l1c, l1c),
    BasicBlock(l1c, l1c)
)

# branch 2
self.b2_1_1 = nn.Sequential(
    conv3x3(1, l1c),
    conv3x3(l1c, l1c),
    BasicBlock(l1c, l1c)
)

Thank you for your answers!!

Yanfeng-Zhou commented 8 months ago

The XNet architecture was written after the other baseline models, so the in_channels parameter was kept out of habit, with a default value of 3. In actual use, you can adjust it to suit your own dataset.

For the GlaS dataset, the input is [Batch size, 1, 128, 128] and the output is [Batch size, 2, 128, 128].

By the way, the wavelet transform for three-channel RGB images is very flexible. There are two strategies: (1) You can first convert the three-channel image into a grayscale image and then perform the wavelet transform to generate L and H, so that L and H are single-channel images. (2) You can also perform the wavelet transform on the R, G, and B channels separately and concatenate the results, so that L and H are three-channel images.
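The two strategies can be sketched as follows. This is a minimal illustration, not the code in wavelet2D.py: it uses a hand-written single-level Haar transform in NumPy, and the helper names `haar_dwt2` and `low_high`, the grayscale weights, and the choice of H as the sum of the three detail bands are all assumptions made for the example. This decimated transform also halves the spatial size, which may differ from how the repository handles image size.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform of an (H, W) array with even H and W."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # difference between adjacent rows
    hl = (a - b + c - d) / 2.0  # difference between adjacent columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh

def low_high(x):
    # Assumption: L is the approximation band, H is the sum of the detail bands.
    ll, lh, hl, hh = haar_dwt2(x)
    return ll, lh + hl + hh

# Random stand-in for an RGB image of shape (H, W, 3).
rgb = np.random.default_rng(0).random((128, 128, 3))

# Strategy 1: convert to grayscale first, so L and H have one channel.
gray = rgb @ np.array([0.299, 0.587, 0.114])
L1, H1 = low_high(gray)
L1, H1 = L1[None], H1[None]           # (1, 64, 64)

# Strategy 2: transform R, G, B separately and stack, so L and H have three channels.
pairs = [low_high(rgb[..., ch]) for ch in range(3)]
L3 = np.stack([p[0] for p in pairs])  # (3, 64, 64)
H3 = np.stack([p[1] for p in pairs])  # (3, 64, 64)

print(L1.shape, H1.shape, L3.shape, H3.shape)
```

With strategy 1 both branches would take 1 input channel; with strategy 2 both would take 3, so in_channels has to match whichever strategy was used during preprocessing.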

kirk0221 commented 8 months ago

Thank you for your kind answer. Also, when I load a mask from the GlaS dataset, it comes out as a [128,128] tensor, so I can't compute the Dice loss against output_train of shape [Batch size, 2, 128, 128]. Does the GlaS dataset require additional preprocessing?

Yanfeng-Zhou commented 8 months ago

Can you share the error output?

kirk0221 commented 8 months ago

This is my tensor shape for training.
inputs_train_shape1 = torch.Size([2, 1, 128, 128])
inputs_train_shape2 = torch.Size([2, 1, 128, 128])
outputs_train_shape1 = torch.Size([2, 2, 128, 128])
outputs_train_shape2 = torch.Size([2, 2, 128, 128])
mask_train = torch.Size([128, 128])

And this is my error print.
Traceback (most recent call last):
  File "train_sup_XNet.py", line 235, in <module>
    loss_train_sup1 = criterion(outputs_train1, mask_train)
  File "/home/gpuadmin/anaconda3/envs/xnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/LAB/cha/XNet-main/loss/loss_function.py", line 140, in forward
    return self._base_forward(output, target_one_hot, valid_mask)
  File "/data/LAB/cha/XNet-main/loss/loss_function.py", line 107, in _base_forward
    dice_loss = dice(predict[:, i], target[..., i], valid_mask)
IndexError: index 2 is out of bounds for dimension 1 with size 2

Yanfeng-Zhou commented 8 months ago

mask_train = torch.Size([128, 128])? This seems to be because the shape of the mask is inconsistent with the shape of the image: the mask is missing the batch dimension.

kirk0221 commented 7 months ago

I checked it, but I still get the same error.

input_train_shape1 = torch.Size([2, 1, 128, 128])
input_train_shape2 = torch.Size([2, 1, 128, 128])
mask_train = torch.Size([2, 128, 128])
outputs_train_shape1 = torch.Size([2, 2, 128, 128])
outputs_train_shape2 = torch.Size([2, 2, 128, 128])

Traceback (most recent call last):
  File "train_sup_XNet.py", line 235, in <module>
    loss_train_sup1 = criterion(outputs_train1, mask_train)
  File "/home/gpuadmin/anaconda3/envs/xnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/LAB/cha/XNet-main/loss/loss_function.py", line 140, in forward
    return self._base_forward(output, target_one_hot, valid_mask)
  File "/data/LAB/cha/XNet-main/loss/loss_function.py", line 107, in _base_forward
    dice_loss = dice(predict[:, i], target[..., i], valid_mask)
IndexError: index 2 is out of bounds for dimension 1 with size 2

kirk0221 commented 7 months ago

I have one more question for the author. You mentioned last time that for the GlaS dataset the input is [Batch size, 1, 128, 128] and the output is [Batch size, 2, 128, 128]. Why does the output have 2 channels, following NUM_CLASSES? Also, the mask for each image has 1 channel, so how is it used to compute the loss against the 2-channel output?

Yanfeng-Zhou commented 7 months ago

Before calculating the loss, the code one-hot encodes the mask (loss/loss_function.py line 119), which means that the mask is converted from [Batch size,1,128,128] to [Batch size,NUM_CLASSES,128,128].
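A minimal sketch of what such a one-hot conversion looks like is given below. It is not the code at loss/loss_function.py line 119; the helper name `one_hot` and the explicit range check are assumptions added for illustration, and NumPy is used in place of PyTorch to keep the example small. The mask here has shape [Batch size, 128, 128]; a [Batch size, 1, 128, 128] mask could be squeezed along axis 1 first.

```python
import numpy as np

def one_hot(mask, num_classes):
    """Convert (B, H, W) integer labels to a (B, num_classes, H, W) float array."""
    if mask.min() < 0 or mask.max() >= num_classes:
        # Labels outside [0, num_classes) cannot be matched to an output channel.
        raise ValueError(
            f"mask labels must lie in [0, {num_classes - 1}], "
            f"got range [{mask.min()}, {mask.max()}]"
        )
    encoded = np.eye(num_classes, dtype=np.float32)[mask]  # (B, H, W, C)
    return np.moveaxis(encoded, -1, 1)                     # (B, C, H, W)

NUM_CLASSES = 2
mask = np.zeros((2, 128, 128), dtype=np.int64)
mask[:, 32:96, 32:96] = 1                                  # toy foreground square

target = one_hot(mask, NUM_CLASSES)
print(target.shape)  # (2, 2, 128, 128)
```

The one-hot target then has the same shape as the 2-channel network output, so a per-class Dice term can be computed channel by channel. A mask containing labels of 2 or more does not fit a 2-class output, which is consistent with the index error reported above.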

Yanfeng-Zhou commented 7 months ago

I checked the raw dataset again and I think I know where your error is. You did not preprocess the masks of the raw dataset. Please note that the raw mask is an instance segmentation annotation.

kirk0221 commented 7 months ago

So how do you do the additional preprocessing on the GlaS mask? The background of my GlaS mask is 0, and the label values count up in order from 1. Or can I get the dataset that the author used for the experiment?

Yanfeng-Zhou commented 7 months ago

Preprocessing is as simple as you said, foreground = 1, background = 0.
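As a rough sketch of that preprocessing step: every non-zero instance label is mapped to 1 and the background stays 0. The function name `binarize_instance_mask` and the toy array are hypothetical; loading and saving the actual mask files is left out.

```python
import numpy as np

def binarize_instance_mask(instance_mask):
    """Map an instance-labelled mask (0 = background, 1..N = instances) to {0, 1}."""
    return (instance_mask > 0).astype(np.uint8)

# Toy instance mask with three labelled objects.
raw = np.array([
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 0, 0],
])

binary = binarize_instance_mask(raw)
print(np.unique(binary))  # [0 1]
```

After this step the mask only contains the values 0 and 1, which matches NUM_CLASSES = 2.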

kirk0221 commented 7 months ago

Then shouldn't the mask tensor have shape [2,128,128], matching NUM_CLASSES? If it needs to be in that shape, please tell me how to preprocess it. Thank you.

kirk0221 commented 6 months ago

Author, the problem I asked about earlier has been solved. I have a new question. In wavelet2D, besides specifying --wavelet_type, is it possible to change the threshold?

yangz9527 commented 3 months ago

Hello, I would like to ask whether you sliced the dataset using a Python script. The dataset downloaded from the official website is in HDF format. Is it necessary to convert it into images with a Python script before processing it? Also, what step size did you use when slicing the dataset?

Yanfeng-Zhou commented 3 months ago

The GlaS dataset does not seem to require the kind of preprocessing you mentioned. The images I downloaded are in bmp format.

yangz9527 commented 3 months ago

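Hello author, thank you very much for such a timely reply. May I ask: when the CREMI dataset is cropped with a sliding window, is there a stride setting, or is the original dataset simply cropped into 256x256 patches?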

Yanfeng-Zhou commented 3 months ago

For the training set, a sliding window with an overlap rate of 0.25 is used for sampling, and pure negative examples are removed. For the test set, a sliding window without overlap is used for sampling.
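The sampling described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the author's script: the overlap rate is interpreted as stride = window * (1 - overlap), so a 256-pixel window with 0.25 overlap moves 192 pixels per step; a "pure negative" patch is taken to mean one whose mask has no foreground pixel; windows that would run past the image border are simply skipped. The function name `sliding_window_patches` is hypothetical.

```python
import numpy as np

def sliding_window_patches(image, mask, window=256, overlap=0.0, drop_empty=False):
    """Crop aligned (image, mask) patches with a sliding window.

    Assumes stride = window * (1 - overlap). Patches whose mask contains no
    foreground are skipped when drop_empty is True.
    """
    stride = int(window * (1.0 - overlap))
    height, width = mask.shape
    patches = []
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            mask_patch = mask[top:top + window, left:left + window]
            if drop_empty and not mask_patch.any():
                continue  # pure negative patch
            image_patch = image[top:top + window, left:left + window]
            patches.append((image_patch, mask_patch))
    return patches

# Toy 640x640 image with foreground only in the top-left corner.
image = np.zeros((640, 640), dtype=np.float32)
mask = np.zeros((640, 640), dtype=np.uint8)
mask[0:100, 0:100] = 1

train_patches = sliding_window_patches(image, mask, overlap=0.25, drop_empty=True)
test_patches = sliding_window_patches(image, mask, overlap=0.0)
print(len(train_patches), len(test_patches))
```

In the toy case the 0.25-overlap grid has 3 x 3 candidate windows, of which only the one touching the top-left corner contains foreground, while the non-overlapping grid yields 2 x 2 windows.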

yangz9527 commented 3 months ago

The official link for the GlaS dataset seems to be dead. Could the author provide a cloud drive link?

Ystartff commented 3 months ago

Before calculating loss, the code will perform one hot for the mask (loss/loss_function.py line 119 ), which means that the mask will be converted from [Batch size,1,128,128] to [Batch size,NUM_CLASSES,128,128].

Hello! I have also encountered this type of problem during my training. Can you please offer your help?