ellisdg / 3DUnetCNN

Pytorch 3D U-Net Convolution Neural Network (CNN) designed for medical image segmentation
MIT License

When I run UnetTraining, I run into many problems. #2

Closed mjiansun closed 7 years ago

mjiansun commented 7 years ago

In UnetTraining.py line 151, model_file = os.path.abspath("3d_unet_model.h5"): where is "3d_unet_model.h5"? In UnetTraining.py line 165, processed_list_file = os.path.abspath("processed_subjects.pkl"): where is "processed_subjects.pkl"?

So, when I run this .py file, I get the following problem:

/usr/bin/python2.7 /home/s/myproject/3DUnetCNN-master/UnetTraining.py
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:99] Couldn't open CUDA library libcudnn.so. LD_LIBRARY_PATH: /home/s/pycharm/pycharm-2016.3/bin:/home/s/git/torch/install/lib:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:1562] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
Traceback (most recent call last):
  File "/home/s/myproject/3DUnetCNN-master/UnetTraining.py", line 217, in <module>
    main(overwrite=False)
  File "/home/s/myproject/3DUnetCNN-master/UnetTraining.py", line 155, in main
    model = unet_model()
  File "/home/s/myproject/3DUnetCNN-master/UnetTraining.py", line 55, in unet_model
    conv1 = Conv3D(32, 3, 3, 3, activation='relu', border_mode='same')(inputs)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 517, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 571, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 155, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/convolutional.py", line 1219, in call
    filter_shape=self.W_shape)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1787, in conv3d
    x = tf.nn.conv3d(x, kernel, strides, padding)
AttributeError: 'module' object has no attribute 'conv3d'

I would appreciate your help, please!

ellisdg commented 7 years ago

@mjiansun my code just provides the framework for building the model. In order to make it work, you will have to adjust the code to fit your own data.

As far as the error goes, I am using Theano and not Tensorflow. See https://keras.io/backend/ to switch the backend for Keras.
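One way to make that switch is sketched below. It assumes a Keras 1.x install that honours the KERAS_BACKEND environment variable and a keras.json config file, as described on the backend page linked above. The helper name write_keras_config and the temporary directory are illustrative only; Keras itself normally reads ~/.keras/keras.json.

```python
import json
import os
import tempfile


def write_keras_config(config_dir, backend="theano", dim_ordering="th"):
    """Write a minimal keras.json selecting the given backend.

    Hypothetical helper: a real caller would pass the ~/.keras
    directory instead of a temporary one.
    """
    config = {
        "backend": backend,
        "image_dim_ordering": dim_ordering,  # "th" = channels first
        "floatx": "float32",
        "epsilon": 1e-07,
    }
    path = os.path.join(config_dir, "keras.json")
    with open(path, "w") as config_file:
        json.dump(config, config_file, indent=4)
    return path


# Option 1: set the environment variable before `import keras` runs.
os.environ["KERAS_BACKEND"] = "theano"

# Option 2: write the config file (shown against a throwaway directory).
demo_dir = tempfile.mkdtemp()
config_path = write_keras_config(demo_dir)
with open(config_path) as config_file:
    loaded = json.load(config_file)
print(loaded["backend"])
```

Either option should be enough on its own; the environment variable is the less invasive of the two because it does not touch any file on disk.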

mjiansun commented 7 years ago

@ellisdg Why does UnetTraining not run successfully with your default parameters? I suspect my unet_model() is wrong, but I don't know why. I need your help! Could you please give me your unet_model() parameters? Thank you very much!

I keep getting the following problem, and it is driving me crazy...

Using Theano backend.
Using gpu device 0: GeForce GTX 960 (CNMeM is disabled, cuDNN 5005)
Traceback (most recent call last):
  File "UnetTraining.py", line 217, in <module>
    main(overwrite=False)
  File "UnetTraining.py", line 155, in main
    model = unet_model()
  File "UnetTraining.py", line 74, in unet_model
    up6 = merge([UpSampling3D(size=pool_size)(conv5), conv4], mode='concat', concat_axis=1)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1536, in merge
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1170, in __init__
    node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1237, in _arguments_validation
    'Layer shapes: %s' % (input_shapes))
Exception: "concat" mode can only merge layers with matching output shapes except for the concat axis. Layer shapes: [(None, 512, 18, 30, 30), (None, 512, 19, 30, 30)]

ellisdg commented 7 years ago

What is the shape of the input images that you are using?

mjiansun commented 7 years ago

(154,240,240)

ellisdg commented 7 years ago

The error is saying that the model cannot concatenate layers with mismatched shapes at this line. I was able to hack around this issue by cropping my images down to shape (144, 240, 240).

Keep in mind, though, that this might not be the best approach, as it can remove useful data. A different approach could be to pad the images before merging UpSampling3D with conv4. I haven't tried that approach personally, so I don't know whether it works.
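The arithmetic behind the mismatch can be sketched as follows. This assumes four 2x max-pooling steps that floor odd sizes, which is consistent with the (18 vs 19) shapes in the traceback above. The function names are made up for illustration. Note also that pad_widths pads the input volume up to a multiple of 16, which is a simpler variant of the idea and not the padding-before-merge approach described above.

```python
def pooled_sizes(size, n_pools=4, pool=2):
    """Sizes of one spatial axis before and after each pooling step."""
    sizes = [size]
    for _ in range(n_pools):
        sizes.append(sizes[-1] // pool)  # pooling drops any remainder
    return sizes


def first_mismatch(size, n_pools=4, pool=2):
    """Return (upsampled, skip) for the first failing merge, or None."""
    sizes = pooled_sizes(size, n_pools, pool)
    # The decoder starts at the deepest level and works back up.
    for level in range(n_pools, 0, -1):
        upsampled = sizes[level] * pool
        skip = sizes[level - 1]
        if upsampled != skip:
            return (upsampled, skip)
    return None


def pad_widths(shape, multiple=16):
    """(before, after) padding per axis to reach the next multiple."""
    widths = []
    for size in shape:
        extra = -size % multiple
        widths.append((extra // 2, extra - extra // 2))
    return widths


for depth in (154, 144, 240):
    print(depth, pooled_sizes(depth), first_mismatch(depth))

print(pad_widths((154, 240, 240)))
```

A depth of 154 shrinks to 77, 38, 19, 9, and upsampling 9 gives 18 rather than 19, whereas 144 and 240 divide cleanly by 16. The list returned by pad_widths has the form that numpy.pad accepts as its pad_width argument, for example np.pad(volume, widths, mode="constant") for a single 3D volume.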

mjiansun commented 7 years ago

Thank you very much, I'll try.

mjiansun commented 7 years ago

I changed my data shape to (144, 240, 240) and ran UnetTraining.py, but it still fails with an error. Maybe my data format is incorrect.

My only training subject is from https://github.com/Kamnitsask/deepmedic/tree/master/examples/dataForExamples/brats2015TrainingData/test/brats_2013_pat0001_1 . I copied T1c_subtrMeanDivStd.nii.gz to T1.nii.gz, then renamed the remaining files:
T1c_subtrMeanDivStd.nii.gz -> T1c.nii.gz
Flair_subtrMeanDivStd.nii.gz -> Flair.nii.gz
OTMultiClass.nii.gz -> truth.nii.gz
brainmask.nii.gz -> background.nii.gz

My only test subject is from https://github.com/Kamnitsask/deepmedic/tree/master/examples/dataForExamples/brats2015TrainingData/test/brats_2013_pat0001_1 . I copied T1c_subtrMeanDivStd.nii.gz to T1_144_test.nii.gz, then renamed the remaining files:
T1c_subtrMeanDivStd.nii.gz -> T1c_144_test.nii.gz
Flair_subtrMeanDivStd.nii.gz -> Flair_144_test.nii.gz
OTMultiClass.nii.gz -> truth_144_test.nii.gz
brainmask.nii.gz -> background_144_test.nii.gz

On my own computer, the training data is in '/home/s/myproject/3DUnetCNN-master/data/train' and the testing data is in '/home/s/myproject/3DUnetCNN-master/data/test'.

My basic parameters:

pool_size = (2, 2, 2)
image_shape = (144, 240, 240)
n_channels = 1
input_shape = tuple([n_channels] + list(image_shape))
n_labels = 5
batch_size = 1
n_test_subjects = 1
z_crop = 155 - image_shape[0]
training_iterations = 5

The training data path is modified:

def read_subject_folder(folder):
    flair_image = sitk.ReadImage(os.path.join("/home/s/myproject/3DUnetCNN-master/data/train", "Flair.nii.gz"))
    t1_image = sitk.ReadImage(os.path.join("/home/s/myproject/3DUnetCNN-master/data/train", "T1.nii.gz"))
    t1c_image = sitk.ReadImage(os.path.join("/home/s/myproject/3DUnetCNN-master/data/train", "T1c.nii.gz"))
    truth_image = sitk.ReadImage(os.path.join("/home/s/myproject/3DUnetCNN-master/data/train", "truth.nii.gz"))
    background_image = sitk.ReadImage(os.path.join("/home/s/myproject/3DUnetCNN-master/data/train", "background.nii.gz"))
    return np.array([sitk.GetArrayFromImage(t1_image),
                     sitk.GetArrayFromImage(t1c_image),
                     sitk.GetArrayFromImage(flair_image),
                     sitk.GetArrayFromImage(truth_image),
                     sitk.GetArrayFromImage(background_image)])

The data folder is modified:

def get_subject_dirs():
    return glob.glob("/home/s/myproject/3DUnetCNN-master/data//")

Please help me!

ellisdg commented 7 years ago

It looks like the data from DeepMedic is just using T1c (that is a T1 weighted scan with contrast) and FLAIR images.

n_channels is the number of modalities (T1c, FLAIR, etc.) that you want your model to use. I am trying to train a model using 3 modalities: T1c, T1, and FLAIR. The BRATS dataset also contains T2 scans, which you can use as well. I would change your code to:

n_channels = 2
truth_channel = 2
def read_subject_folder(folder):
    t1c_image = sitk.ReadImage(os.path.join(folder, "T1c_subtrMeanDivStd.nii.gz"))
    flair_image = sitk.ReadImage(os.path.join(folder, "Flair_subtrMeanDivStd.nii.gz"))
    truth_image = sitk.ReadImage(os.path.join(folder, "OTMultiClass.nii.gz"))
    return np.asarray([sitk.GetArrayFromImage(t1c_image),
                       sitk.GetArrayFromImage(flair_image),
                       sitk.GetArrayFromImage(truth_image)])
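To make the relationship between these settings concrete, here is a minimal sketch, assuming the image_shape of (144, 240, 240) and the input_shape expression from the parameters posted earlier in this thread. The modalities and stack_order lists are illustrative names, not part of the repository.

```python
image_shape = (144, 240, 240)
modalities = ["T1c", "FLAIR"]  # one input channel per modality

n_channels = len(modalities)
# The truth volume is stacked after the modalities, so its index
# equals the number of input channels.
truth_channel = n_channels
stack_order = modalities + ["truth"]

# Channels-first input shape, built with the same expression as the
# parameters posted above.
input_shape = tuple([n_channels] + list(image_shape))

print(input_shape)
print(stack_order[truth_channel])
```

With two modalities, the array returned by read_subject_folder holds three volumes, and only the first n_channels of them should be fed to the network as input.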

background_image is actually the opposite of the brainmask. The brainmask contains 1s at every voxel within the brain, while the background image contains 1s at every voxel that is not within the brain. I was using the background image to try to avoid cropping non-background voxels out of the other images. I think I am going to take this out of my code, though, as it has the potential to shift the brain in unexpected directions.

def crop_data(data, z_crop):
    return data[:, z_crop:]
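A small check of what this slice does to the depth axis, assuming the z_crop = 155 - image_shape[0] setting from the parameters posted earlier in the thread and a (channels, z, y, x) layout. cropped_depth is a hypothetical helper, not part of the repository.

```python
def cropped_depth(depth, image_depth=144, full_depth=155):
    """Slices left on the z axis after data[:, z_crop:]."""
    z_crop = full_depth - image_depth  # 11 with the values above
    # Slicing a range mirrors what the array slice does on axis 1.
    return len(range(depth)[z_crop:])


for depth in (155, 154, 144):
    print(depth, cropped_depth(depth))
```

Only a volume with 155 slices comes out at 144. A volume with 154 slices, or one that was already cropped to 144, would end up smaller than image_shape, so the hard-coded 155 may need adjusting to match the data.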

As of last week, I was still having trouble getting solid training and testing scores. I won't be able to run more experiments and figure out what is going wrong until sometime next week. If you want something that is already tried and true, I would look at this 2D U-Net CNN.

Best of luck!

mjiansun commented 7 years ago

Thank you very much for your help.

mjiansun commented 7 years ago

I modified my code, but now it reports 'out of memory'.

This is my code:

import os
import glob
import pickle
import datetime

import numpy as np

from keras.layers import (Conv3D, AveragePooling3D, MaxPooling3D, Activation, UpSampling3D, merge, Input, Reshape,
                          Permute)
from keras import backend as K
from keras.models import Model, load_model
from keras.optimizers import Adam

import SimpleITK as sitk

pool_size = (2, 2, 2)
image_shape = (144, 240, 240)
n_channels = 2
input_shape = tuple([n_channels] + list(image_shape))
n_labels = 5
batch_size = 1
n_test_subjects = 1
z_crop = 155 - image_shape[0]
training_iterations = 5


def pickle_dump(item, out_file):
    with open(out_file, "wb") as opened_file:
        pickle.dump(item, opened_file)


def pickle_load(in_file):
    with open(in_file, "rb") as opened_file:
        return pickle.load(opened_file)


K.set_image_dim_ordering('th')
smooth = 1.


def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)


def dice_coef_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)


def unet_model():
    inputs = Input(input_shape)

    # conv3d(x, kernel, strides=(1, 1, 1), border_mode='valid', dim_ordering='default', volume_shape=None, filter_shape=None)

    conv1 = Conv3D(32, 3, 3, 3, activation='relu', border_mode='same')(inputs)
    conv1 = Conv3D(32, 3, 3, 3, activation='relu', border_mode='same')(conv1)
    # pool3d(x, pool_size, strides=(1, 1, 1), border_mode='valid', dim_ordering='default', pool_mode='max')
    pool1 = MaxPooling3D(pool_size=pool_size)(conv1)

    conv2 = Conv3D(64, 3, 3, 3, activation='relu', border_mode='same')(pool1)
    conv2 = Conv3D(64, 3, 3, 3, activation='relu', border_mode='same')(conv2)
    pool2 = MaxPooling3D(pool_size=pool_size)(conv2)

    conv3 = Conv3D(128, 3, 3, 3, activation='relu', border_mode='same')(pool2)
    conv3 = Conv3D(128, 3, 3, 3, activation='relu', border_mode='same')(conv3)
    pool3 = MaxPooling3D(pool_size=pool_size)(conv3)

    conv4 = Conv3D(256, 3, 3, 3, activation='relu', border_mode='same')(pool3)
    conv4 = Conv3D(256, 3, 3, 3, activation='relu', border_mode='same')(conv4)
    pool4 = MaxPooling3D(pool_size=pool_size)(conv4)

    conv5 = Conv3D(512, 3, 3, 3, activation='relu', border_mode='same')(pool4)
    conv5 = Conv3D(512, 3, 3, 3, activation='relu', border_mode='same')(conv5)

    up6 = merge([UpSampling3D(size=pool_size)(conv5), conv4], mode='concat', concat_axis=1)
    conv6 = Conv3D(256, 3, 3, 3, activation='relu', border_mode='same')(up6)
    conv6 = Conv3D(256, 3, 3, 3, activation='relu', border_mode='same')(conv6)

    up7 = merge([UpSampling3D(size=pool_size)(conv6), conv3], mode='concat', concat_axis=1)
    conv7 = Conv3D(128, 3, 3, 3, activation='relu', border_mode='same')(up7)
    conv7 = Conv3D(128, 3, 3, 3, activation='relu', border_mode='same')(conv7)

    up8 = merge([UpSampling3D(size=pool_size)(conv7), conv2], mode='concat', concat_axis=1)
    conv8 = Conv3D(64, 3, 3, 3, activation='relu', border_mode='same')(up8)
    conv8 = Conv3D(64, 3, 3, 3, activation='relu', border_mode='same')(conv8)

    up9 = merge([UpSampling3D(size=pool_size)(conv8), conv1], mode='concat', concat_axis=1)
    conv9 = Conv3D(32, 3, 3, 3, activation='relu', border_mode='same')(up9)
    conv9 = Conv3D(32, 3, 3, 3, activation='relu', border_mode='same')(conv9)

    conv10 = Conv3D(n_labels, 1, 1, 1)(conv9)
    act = Activation('sigmoid')(conv10)

    model = Model(input=inputs, output=act)

    model.compile(optimizer=Adam(lr=1e-5), loss=dice_coef_loss, metrics=[dice_coef])

    return model

def train_batch(batch, model):
    x_train = batch[:,:2]
    y_train = get_truth(batch)
    del(batch)
    print(model.train_on_batch(x_train, y_train))
    del(x_train)
    del(y_train)


def read_subject_folder(folder):
    flair_image = sitk.ReadImage(os.path.join(folder, "Flair.nii.gz"))
    t1c_image = sitk.ReadImage(os.path.join(folder, "T1c.nii.gz"))
    truth_image = sitk.ReadImage(os.path.join(folder, "truth.nii.gz"))
    return np.array([sitk.GetArrayFromImage(t1c_image),
                     sitk.GetArrayFromImage(flair_image),
                     sitk.GetArrayFromImage(truth_image)])


def crop_data(data, z_crop):
    return data[:, z_crop:]


def get_truth(batch, truth_channel=2):
    truth = np.array(batch)[:, truth_channel]
    batch_list = []
    for sample_number in range(truth.shape[0]):
        sample_list = []
        for label in range(n_labels):
            array = np.zeros(truth[sample_number].shape)
            array[truth[sample_number] == label] = 1
            sample_list.append(array)
        batch_list.append(sample_list)
    return np.array(batch_list)


def get_subject_id(subject_dir):
    return subject_dir.split("_")[-2]


def main(overwrite=False):
    model_file = os.path.abspath("3d_unet_model.h5")
    if not overwrite and os.path.exists(model_file):
        model = load_model(model_file,
                           custom_objects={'dice_coef_loss': dice_coef_loss, 'dice_coef': dice_coef})
    else:
        model = unet_model()
    train_model(model, model_file, overwrite=overwrite, iterations=training_iterations)


def get_subject_dirs():
    return glob.glob("/home/s/myproject/3DUnetCNN-master/data/test/*")


def train_model(model, model_file, overwrite=False, iterations=1):
    for i in range(iterations):
        processed_list_file = os.path.abspath("processed_subjects.pkl")
        if overwrite or not os.path.exists(processed_list_file) or i > 0:
            processed_list = []
        else:
            processed_list = pickle_load(processed_list_file)

        subject_dirs = get_subject_dirs()

        testing_ids_file = os.path.abspath("testing_ids.pkl")

        if os.path.exists(testing_ids_file) and not overwrite:
            testing_ids = pickle_load(testing_ids_file)
            if len(testing_ids) > n_test_subjects:
                testing_ids = testing_ids[:n_test_subjects]
                pickle_dump(testing_ids, testing_ids_file)
        else:
            subjects = dict()
            for dirname in subject_dirs:
                subjects[dirname.split('_')[-2]] = dirname   #

            subject_ids = subjects.keys()
            np.random.shuffle(subject_ids)
            testing_ids = subject_ids[:n_test_subjects]
            pickle_dump(testing_ids, testing_ids_file)

        batch = []
        for subject_dir in subject_dirs:

            subject_id = get_subject_id(subject_dir)

            processed_list.append("Flair.nii.gz")
            processed_list.append("T1c.nii.gz")
            processed_list.append("truth.nii.gz")

            batch.append(read_subject_folder('/home/s/myproject/3DUnetCNN-master/data/train'))
            if len(batch) >= batch_size:
                train_batch(np.array(batch), model)
                del(batch)
                batch = []
                print("Saving: " + model_file)
                pickle_dump(processed_list, processed_list_file)
                model.save(model_file)

        if batch:
            train_batch(np.array(batch), model)
            del(batch)
            print("Saving: " + model_file)
            pickle_dump(processed_list, processed_list_file)
            model.save(model_file)


if __name__ == "__main__":
    main(overwrite=False)

turtleizzy commented 7 years ago

First of all, you might want to use the "insert code" button at the top of the comment box, or simply enclose your code in backticks, so that it is displayed in the appropriate format (i.e. with the intended indentation, since indentation *matters*).

Secondly, an out-of-memory problem without any traceback or environment information is almost unsolvable. Is it a TensorFlow ResourceExhaustedError indicating a GPU out-of-memory problem, or a RAM out-of-memory problem? How much GPU memory and system memory was available when this issue happened? At what scale, at what stage, and on exactly which line did this error take place?

Thirdly, in my humble opinion, debugging really should be your own responsibility.

Advice: If you are short of memory, your options are either to increase your memory or to reduce your memory requirements. The latter is much easier but not always applicable. Try reducing the batch size, decreasing the image shape, or fine-tuning fewer layers.
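To put rough numbers on that, here is a back-of-the-envelope estimate, assuming float32 activations and the 32-filter first convolution over a (144, 240, 240) volume from the code posted above. It counts a single output feature map only (no weights, gradients, or other layers), so real usage would be considerably higher. feature_map_bytes is a hypothetical helper.

```python
def feature_map_bytes(n_filters, image_shape, batch_size=1, bytes_per_value=4):
    """Bytes needed to hold one layer's output activations."""
    n_voxels = 1
    for size in image_shape:
        n_voxels *= size
    return batch_size * n_filters * n_voxels * bytes_per_value


full = feature_map_bytes(32, (144, 240, 240))
half = feature_map_bytes(32, (72, 120, 120))

print(full / 2.0 ** 30)  # size in GiB at full resolution
print(full // half)      # reduction factor from halving each axis
```

A single full-resolution feature map already comes to roughly 0.99 GiB, and halving every spatial dimension cuts that by a factor of 8, which is why decreasing the image shape tends to help far more than any other change when the batch size is already 1.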

mjiansun commented 7 years ago

Thank you very much for your advice! @turtleizzy