bigmb / Unet-Segmentation-Pytorch-Nest-of-Unets

Implementation of different kinds of Unet Models for Image Segmentation - Unet, RCNN-Unet, Attention Unet, RCNN-Attention Unet, Nested Unet
MIT License

Bad performance with single-channel input data #37

Closed SkeletonOne closed 4 years ago

SkeletonOne commented 4 years ago

Hi! First of all, I appreciate your great work. However, I ran into some problems while using this project.

My data is NCI-ISBI 2013, a medical segmentation dataset of 3D MRI volumes (in .dicom and .nrrd format). Following the instructions in the readme, I tried to split the 3D data into 2D PNG images, but every slice of a 3D volume is a 1-channel 2D grayscale image. I failed to transform the data into 3-channel images (for example with cv2.COLOR_GRAY2RGB, or by stacking the array to 3×H×W). So I changed the code from

```python
torchvision.transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
```

to

```python
torchvision.transforms.Normalize(mean=(0.5), std=(0.5))
```

and it finally works. But the loss is always around 0.45, and the final dice on the test data after 500 epochs is about 0.06, which is very low. Can you give me some help? I'd appreciate it a lot!
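For context, a minimal sketch of a single-channel transform pipeline around that Normalize call (the Compose wiring here is an assumption for illustration, not the repo's exact code):

```python
import torchvision.transforms as T

# Hypothetical grayscale pipeline; the repo's actual transform chain may differ.
transform_gray = T.Compose([
    T.ToTensor(),                          # (H, W) grayscale -> (1, H, W) float in [0, 1]
    T.Normalize(mean=(0.5,), std=(0.5,)),  # one mean/std entry per channel
])
```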

bigmb commented 4 years ago

You can convert it to 3 channels by stacking the 1-channel image on itself. It will then be a 3-channel image.

Also, what's the size of the dataset? The dice score shouldn't be that low on a good dataset. Can you also try the nested Unet and test it out?

SkeletonOne commented 4 years ago

@bigmb Thanks for your reply. Actually, I tried the method you suggested:

```python
img = np.stack((img,) * 3, axis=-1)
print(img.shape)  # (400, 400, 3)
cv2.imwrite('./imgs/train/' + str(count) + '.jpg', img)
```

But with the images generated by this code, I still get `RuntimeError: output with shape [1, h, w] doesn't match the broadcast shape [3, h, w]`. So I don't know why...
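A quick round-trip check (hypothetical path) can show where the channel count diverges: cv2.imread defaults to 3-channel BGR regardless of what is stored, while a PIL-based loader keeps the file's stored mode.

```python
import cv2
from PIL import Image

path = './imgs/train/1.jpg'  # hypothetical sample written by the snippet above

print(cv2.imread(path).shape)  # default IMREAD_COLOR flag always yields (H, W, 3)
print(Image.open(path).mode)   # PIL reports the stored mode, e.g. 'RGB' or 'L'
```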

Besides, the training set has 1544 images and the test set has 271 images.

Some outputs during training:

```
Epoch: 492/500  Training Loss: 0.465895  Validation Loss: 0.492391  0m 28s
Epoch: 493/500  Training Loss: 0.475212  Validation Loss: 0.463547  0m 28s
Epoch: 494/500  Training Loss: 0.469650  Validation Loss: 0.465150  0m 29s
Epoch: 495/500  Training Loss: 0.482380  Validation Loss: 0.454328  0m 29s
Epoch: 496/500  Training Loss: 0.472040  Validation Loss: 0.468342  0m 29s
Epoch: 497/500  Training Loss: 0.471428  Validation Loss: 0.468051  0m 29s
Epoch: 498/500  Training Loss: 0.458327  Validation Loss: 0.462537  0m 29s
Epoch: 499/500  Training Loss: 0.457634  Validation Loss: 0.463168  0m 29s
Epoch: 500/500  Training Loss: 0.469844  Validation Loss: 0.462943  0m 29s
```

Dice Score : 0.06005357188816441
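For reference, the Dice score being reported is conventionally computed like this (a generic sketch of the standard metric, not necessarily this repo's exact implementation):

```python
import torch

def dice_score(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) on binarized masks."""
    pred = (pred > 0.5).float()
    target = (target > 0.5).float()
    intersection = (pred * target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Sanity check: a perfect match scores ~1.0
mask = torch.ones(1, 400, 400)
print(dice_score(mask, mask))  # tensor(1.0000)
```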

bigmb commented 4 years ago

If your image is 3 channels, it shouldn't give this issue.

Try the one I was using:

```python
import cv2
import numpy as np

img = cv2.imread('10524.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img2 = np.zeros_like(img)
img2[:, :, 0] = gray
img2[:, :, 1] = gray
img2[:, :, 2] = gray
cv2.imwrite('10524.jpg', img2)
```

Also, the training dataset is not that big, but it's reasonable enough for a medical dataset. And what are the SOTA results on this dataset?

SkeletonOne commented 4 years ago

@bigmb Thanks for your reply. I wanted to try `gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)`, but my images were generated from an H×W×N 3D volume, so each image is an H×W numpy array and I cannot apply BGR2GRAY to it. Instead, I made an H×W×3 numpy array filled with zeros and used:

```python
print(img2.shape)  # (400, 400, 3)
img2[:, :, 0] = gray
img2[:, :, 1] = gray
img2[:, :, 2] = gray
cv2.imwrite('./imgs/train/' + str(count) + '.jpg', img2)
```

But with the saved images, I still get `RuntimeError: output with shape [1, 400, 400] doesn't match the broadcast shape [3, 400, 400]`.
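One possible cause, offered as an assumption rather than a confirmed diagnosis: if the training code loads images with PIL in the file's native mode, a grayscale file comes back as a 1-channel tensor and the 3-channel Normalize fails to broadcast. Forcing RGB on load sidesteps that. A minimal sketch, assuming a PIL-based loader:

```python
from PIL import Image
import torchvision.transforms as T

transform = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
])

img = Image.open('./imgs/train/1.jpg').convert('RGB')  # force 3 channels regardless of stored mode
tensor = transform(img)
print(tensor.shape)  # expected: torch.Size([3, 400, 400])
```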

I tried to look at the image shape and got this result:

```python
a = cv2.imread('./imgs/train/1.jpg')
print(a.shape)  # (400, 400, 3)
```

So I am very confused now... maybe the problem is in the code that reads the data? I have no idea.

The SOTA results are a dice score of about 0.8-0.9, so with single-channel input the network clearly wasn't working...

bigmb commented 4 years ago

If the image shape is (400, 400, 3), then it shouldn't be a problem. Which line is causing the issue? And if you run `torchsummary.summary(model_test, input_size=(3, 400, 400))`, what is the output?
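A minimal sketch of that check (the model import and constructor arguments are assumptions; substitute the model class actually being trained):

```python
from torchsummary import summary
from Models import U_Net  # hypothetical import; use whichever model you are training

model_test = U_Net(in_ch=3, out_ch=1)  # constructor arguments are an assumption
# device='cpu' lets the check run without a GPU; prints per-layer
# output shapes and parameter counts for a 3x400x400 input.
summary(model_test, input_size=(3, 400, 400), device='cpu')
```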

""" for i in range(len(readfolderT)): y_folder = readfolderT[i] yread = sitk.ReadImage(y_folder) yimage = sitk.GetArrayFromImage(yread) x_np = np.empty([184,232,3]) x = yimage[:184, :232, 110:140] x = scipy.rot90(x) x = scipy.rot90(x) for j in range(x.shape[2]): x_np[:, :, 0] = (x[:184, :232, j]) x_np[:, :, 1] = (x[:184, :232, j]) x_np[:, :, 2] = (x[:184, :232, j])

scipy.misc.imsave('conc.tif', x_np)

    #TrainingImagesList.append((x[:184, :224, j]))
    #xchangeL = x_np.tolist()
    xchangeL = cv2.resize(x_np, (128, 128))
    scipy.misc.imsave('/home/bat161/Desktop/Thesis/Image/test_3channel/' + str(i)+ '_' +str(j) + '.png', xchangeL)

"""

This is what I used at the time to convert 1 channel to 3 channels.

Also, did you check the prediction folder? How do the results there look? Are they segmenting the right location?