ellisdg / 3DUnetCNN

Pytorch 3D U-Net Convolution Neural Network (CNN) designed for medical image segmentation
MIT License
1.9k stars 653 forks

Using own 3d datasets #95

Closed SnowRipple closed 3 years ago

SnowRipple commented 6 years ago

Hi David @ellisdg! First of all, let me congratulate you on this great project! Many thanks for making it public! I am trying to make it work with my own 3D non-medical dataset. The shape of each 2D slice is (308, 800), and I created 32 mini grayscale volumes, each consisting of 12 2D slices. The whole dataset has shape (32, 12, 308, 800). I am trying to train on the whole image for now, and it's a segmentation problem, so the labels have exactly the same shape.

Unfortunately, when I change the "image_shape" and "labels" config to (12, 308, 800) (the shape of a single volume) and use a batch size of 2, I get this error:

Traceback (most recent call last):
  File "/home/snowripple/workspace/3DUnetCNN/brats/train_isensee2017.py", line 206, in <module>
    main(overwrite=config["overwrite"])
  File "/home/snowripple/workspace/3DUnetCNN/brats/train_isensee2017.py", line 201, in main
    n_epochs=config["n_epochs"])
  File "/home/snowripple/workspace/3DUnetCNN/unet3d/training.py", line 88, in train_model
    early_stopping_patience=early_stopping_patience))
  File "/home/snowripple/workspace/keras_2_8/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/snowripple/workspace/keras_2_8/keras/engine/training.py", line 2046, in fit_generator
    class_weight=class_weight)
  File "/home/snowripple/workspace/keras_2_8/keras/engine/training.py", line 1760, in train_on_batch
    check_batch_axis=True)
  File "/home/snowripple/workspace/keras_2_8/keras/engine/training.py", line 1378, in _standardize_user_data
    exception_prefix='input')
  File "/home/snowripple/workspace/keras_2_8/keras/engine/training.py", line 132, in _standardize_input_data
    str(array.shape))
ValueError: Error when checking input: expected input_1 to have 5 dimensions, but got array with shape (2, 12, 308, 800)

How can the input have 5 dimensions? What does each dimension represent?

Regards, P

ellisdg commented 6 years ago

The input should have 5 dimensions: (m, n, x, y, z). x, y, z represent the image shape, as you would expect. n is the number of channels. In a standard color image, you would have 3 channels (red, green, blue). In medical imaging, these channels can be separate imaging modalities. m is the batch size, i.e. the number of samples passed to the model at once during training.

ellisdg commented 6 years ago

If you have just one channel for each image, you want n to equal 1. If your dataset is already a NumPy array, this can be done with np.newaxis:

dataset_5d = dataset[:, np.newaxis]

This should give an array of shape (32, 1, 12, 308, 800). You can then save this as an HDF5 file and have the data generators iterate through the data.
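
As a minimal sketch of the step above (not code from the repository), the snippet below adds a singleton channel axis with np.newaxis and checks that a batch sliced from the result has the five dimensions (m, n, x, y, z). The helper name add_channel_axis and the scaled-down stand-in shape are assumptions made for illustration; the dataset in this thread has shape (32, 12, 308, 800).

```python
import numpy as np


def add_channel_axis(dataset):
    """Insert a singleton channel axis: (m, x, y, z) -> (m, 1, x, y, z)."""
    if dataset.ndim != 4:
        raise ValueError("expected a 4D array (m, x, y, z), got shape %s" % (dataset.shape,))
    return dataset[:, np.newaxis]


# Scaled-down stand-in for the (32, 12, 308, 800) dataset, to keep memory use small.
dataset = np.zeros((32, 12, 30, 80), dtype=np.float32)
dataset_5d = add_channel_axis(dataset)

# A batch of two samples now has the five dimensions (m, n, x, y, z).
batch = dataset_5d[:2]
print(dataset_5d.shape, batch.shape)
```

Slicing with np.newaxis returns a view, so no copy of the volume data is made at this step.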

SnowRipple commented 6 years ago

Thanks @ellisdg for your help! I will do it as you suggested! I was confused by nr_channels because I have never worked with 3D medical images, and 4 channels seemed weird. I generated my data in the format you provided, (32, 1, 12, 308, 800); however, your code requires only one data file, which (I assume) must contain both data and labels. What should the shape of this data ("brats_data.h5") be so that it includes both data and labels? Would something like (2, 32, 1, 12, 308, 800) be OK, where index 0 of the first dimension is the data and index 1 is the labels?

Many Thanks!

ellisdg commented 6 years ago

I forgot this earlier, but the code is set up to assume that the last channel in the data file is the labeled image. So your data file should have shape (32, 2, 12, 308, 800), where channel number 2 is the labeled image.
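
The layout described above can be sketched with NumPy as follows. This is an illustration only, not the repository's own code: the function name stack_image_and_label and the small stand-in shapes are assumptions, and it simply places the label volume as the last entry along the channel axis.

```python
import numpy as np


def stack_image_and_label(images, labels):
    """Combine (m, x, y, z) images and labels into (m, 2, x, y, z), label last."""
    if images.shape != labels.shape:
        raise ValueError("images and labels must have the same shape")
    # Cast the labels to the image dtype so both fit in a single array.
    return np.stack([images, labels.astype(images.dtype)], axis=1)


# Small stand-ins for the real (32, 12, 308, 800) image and label arrays.
images = np.random.rand(4, 12, 30, 80).astype(np.float32)
labels = (images > 0.5).astype(np.uint8)

combined = stack_image_and_label(images, labels)
print(combined.shape)  # channel 0 holds the image, the last channel holds the label
```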

SnowRipple commented 6 years ago

So I regenerated the input data with shape (32, 2, 12, 256, 512) (I resized the images for speed), with the input as the first channel and the output as the last, and changed the config to:

config["pool_size"] = (2, 2, 2)  # pool size for the max pooling operations
# config["image_shape"] = (144, 144, 144)  # This determines what shape the images will be cropped/resampled to.
config["image_shape"] = (12, 256, 512)  # This determines what shape the images will be cropped/resampled to.
# config["patch_shape"] = (64, 64, 64)  # switch to None to train on the whole image
config["patch_shape"] = None  # switch to None to train on the whole image
# config["labels"] = (1, 2, 4)  # the label numbers on the input image
config["labels"] = (12, 256, 512)  # the label numbers on the input image

with batch_size = 1, but I am getting an error:

Epoch 1/500
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/snowripple/workspace/keras_2_8/keras/utils/data_utils.py", line 569, in data_generator_task
    generator_output = next(self._generator)
  File "/home/snowripple/workspace/3DUnetCNN/unet3d/generator.py", line 155, in data_generator
    skip_blank=skip_blank, permute=permute)
  File "/home/snowripple/workspace/3DUnetCNN/unet3d/generator.py", line 210, in add_data
    data, truth = get_data_from_file(data_file, index, patch_shape=patch_shape)
  File "/home/snowripple/workspace/3DUnetCNN/unet3d/generator.py", line 238, in get_data_from_file
    x, y = data_file.root.data[index], data_file.root.truth[index, 0]
  File "/usr/local/lib/python2.7/dist-packages/tables/group.py", line 818, in __getattr__
    return self._f_get_child(name)
  File "/usr/local/lib/python2.7/dist-packages/tables/group.py", line 698, in _f_get_child
    self._g_check_has_child(childname)
  File "/usr/local/lib/python2.7/dist-packages/tables/group.py", line 395, in _g_check_has_child
    % (self._v_pathname, name))
NoSuchNodeError: group / does not have a child named truth

Traceback (most recent call last):
  File "/home/snowripple/workspace/3DUnetCNN/brats/train.py", line 119, in <module>
    main(overwrite=config["overwrite"])
  File "/home/snowripple/workspace/3DUnetCNN/brats/train.py", line 114, in main
    n_epochs=config["n_epochs"])
  File "/home/snowripple/workspace/3DUnetCNN/unet3d/training.py", line 88, in train_model
    early_stopping_patience=early_stopping_patience))
  File "/home/snowripple/workspace/keras_2_8/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/snowripple/workspace/keras_2_8/keras/engine/training.py", line 2015, in fit_generator
    generator_output = next(output_generator)
StopIteration
Closing remaining open files:/home/snowripple/workspace/3DUnetCNN/brats/near_inline_data.h5...done

This is raised by the following code:

while epoch < epochs:
    callbacks.on_epoch_begin(epoch)
    steps_done = 0
    batch_index = 0
    while steps_done < steps_per_epoch:
        generator_output = next(output_generator)

For testing purposes I just used 4 mini-volumes (not 32) and the validation split returned:

Creating validation split...
('Number of training steps: ', 3)
('Number of validation steps: ', 1)
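
The step counts reported above are consistent with an 80/20 split of 4 samples at a batch size of 1. The sketch below shows that arithmetic only. The function name split_counts is hypothetical, and the 0.8 split fraction is an assumption chosen because it reproduces the reported numbers; the thread does not state the configured value.

```python
import math


def split_counts(n_samples, split=0.8, batch_size=1):
    """Return (training steps, validation steps) for a simple fractional split."""
    n_training = int(n_samples * split)       # assumed: training share is rounded down
    n_validation = n_samples - n_training     # the rest goes to validation
    training_steps = math.ceil(n_training / batch_size)
    validation_steps = math.ceil(n_validation / batch_size)
    return training_steps, validation_steps


# 4 mini-volumes, batch size 1: int(4 * 0.8) = 3 training and 1 validation sample.
print(split_counts(4))
```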

SnowRipple commented 6 years ago

It seems that the line generator_output = next(output_generator) in the Keras fit_generator method cannot find the output generator. I tried an older version of Keras (I am currently using 2.8), but with no success. Training with the BraTS data works, though. @ellisdg, have you tried it with a dataset other than BraTS/medical?

ellisdg commented 6 years ago

I've used this code with multiple medical 3D datasets and it has worked well. Have you been able to fix what was going wrong? If your error is still the same as above (NoSuchNodeError: group / does not have a child named truth), then there is still a problem with the HDF5 file.
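
The traceback earlier in the thread reads data_file.root.data[index] and data_file.root.truth[index, 0], which suggests the generator expects two root-level nodes named data and truth, with the label volume at index 0 of the second axis of truth. Below is a minimal sketch of producing such a file with PyTables under that assumption. The function names split_data_and_truth and write_hdf5 are hypothetical, and the repository's own data-writing utility may store further nodes or use different dtypes, so treat this as a way to inspect the layout rather than a replacement for it.

```python
import numpy as np


def split_data_and_truth(combined):
    """Split (m, n + 1, x, y, z), label in the last channel, into data and truth."""
    if combined.ndim != 5 or combined.shape[1] < 2:
        raise ValueError("expected a 5D array with at least 2 channels")
    data = combined[:, :-1]   # (m, n, x, y, z): the image channels
    truth = combined[:, -1:]  # (m, 1, x, y, z): truth[index, 0] is one label volume
    return data, truth


def write_hdf5(path, data, truth):
    """Write `data` and `truth` as root-level nodes of an HDF5 file."""
    import tables  # imported lazily so the splitting helper works without PyTables

    with tables.open_file(path, mode="w") as h5:
        h5.create_array(h5.root, "data", obj=data)
        h5.create_array(h5.root, "truth", obj=truth)


# Small stand-in for a combined (32, 2, 12, 256, 512) array.
combined = np.zeros((4, 2, 12, 30, 80), dtype=np.float32)
data, truth = split_data_and_truth(combined)
print(data.shape, truth.shape)
```

Calling write_hdf5("my_data.h5", data, truth) (the file name is a placeholder) would then create a file in which both nodes exist at the root.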

huazai-1994 commented 6 years ago

@SnowRipple @ellisdg Hello, have you solved the problem you discussed above? I am going to use this code to train on another 3D dataset, so I would like to ask how to build a standard training dataset from an original dataset in ".raw" format (both the data and the labels). Could you please share some code with me? I have never worked with a 3D dataset, so I really need your help. Thanks a lot.

chenxiaodanhit commented 5 years ago

@ellisdg @SnowRipple @huazai-1994 Hello, I have recently been researching 3D semantic segmentation, and my raw datasets are 2D slices. Could you give me some suggestions for running this model? Thank you very much!

ljljlj02 commented 4 years ago

How should multi-channel medical images be handled?

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. If you have questions, feel free to join the Slack group or email me at davidgellis2@gmail.com.