frankkramer-lab / MIScnn

A framework for Medical Image Segmentation with Convolutional Neural Networks and Deep Learning
GNU General Public License v3.0

Trying to execute MIScnn on Google Colab #25

Closed Itzikwa closed 4 years ago

Itzikwa commented 4 years ago

Hi Dr Muller,

I tried to run the KiTS19 example on Google Colab (I used a TPU), and it got stuck for a long time on the cross-validation cell (the same happened when using only 10 samples).

What do you think could be the problem?

Best regards, Itzik Walters

[Screenshot: Capture]
Itzikwa commented 4 years ago

Now I see that when I used 10 samples, I finally got this error message:

[Screenshot: Capture2]
muellerdo commented 4 years ago

Hey @Itzikwa,

> I tried to run the KiTS19 example on Google Colab (I used a TPU), and it got stuck for a long time on the cross-validation cell

The cross-validation cell is the runner of the complete pipeline; all other cells just initialize classes. The fitting of a model for medical image segmentation depends mainly on data set size, architecture, GPU hardware, and hyperparameters.
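For reference, the whole pipeline is executed by a single cross-validation call. A minimal sketch of what that cell does (the fold count, epochs, and iterations here are illustrative, not necessarily the notebook's exact values):

from miscnn.evaluation.cross_validation import cross_validation

# Runs preprocessing, fitting, and prediction for every fold of the sample list
cross_validation(sample_list, model, k_fold=3, epochs=500, iterations=150,
                 evaluation_path="evaluation", draw_figures=True)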

The KiTS19 example took around 2 days of computation on an NVIDIA Quadro RTX 6000.

The problem now is that, as far as I recall, a free Google Colab session can run for a maximum of 12 hours.

> Now I see that when I used 10 samples, I finally got this error message:

Mhm. Aside from the runtime, we should definitely be able to get a medical image segmentation pipeline running on 10 samples.

Would it be possible for you to share the Colab notebook itself, or the full script & error log?

I'm optimistic that we'll get your MIS pipeline to run :)

Cheers, Dominik

Itzikwa commented 4 years ago

Hi, first of all, thank you!

> The cross-validation cell is the runner of the complete pipeline; all other cells just initialize classes. The fitting of a model for medical image segmentation depends mainly on data set size, architecture, GPU hardware, and hyperparameters.

I actually knew that (although I didn't realize that it takes such a long time...).

I'm attaching a shareable link to the Google Colab script below. Note that there are no big changes from the original notebook.

https://drive.google.com/file/d/1RjAryoDI_tKtUN6AFTeK0SUWKlhoDHCR/view?usp=sharing

muellerdo commented 4 years ago

No problem! :)

Can you try changing the ReduceLROnPlateau callback import from the original Keras library to the TensorFlow Keras library?

Original line:

from keras.callbacks import ReduceLROnPlateau

Replace with this:

from tensorflow.keras.callbacks import ReduceLROnPlateau

The problem here is that MIScnn was updated to TensorFlow 2.x about 2 months ago. In TensorFlow 2.x, Keras is integrated as the high-level API. Therefore, instead of using the original Keras library, I switched MIScnn to the Keras integrated into TensorFlow for more compact and robust dependencies.

Sadly, the Keras and TensorFlow Keras libraries are not fully compatible: it is not possible to attach a callback from the original Keras library to a TF Keras model.

Long story short: It should work if you update the import line.
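For clarity, a minimal sketch of the corrected callback setup (the parameter values are illustrative, not necessarily the notebook's exact settings):

from tensorflow.keras.callbacks import ReduceLROnPlateau

# Reduce the learning rate by a factor of 10 whenever the monitored loss plateaus
cb_lr = ReduceLROnPlateau(monitor="loss", factor=0.1, patience=20,
                          verbose=1, mode="min", min_delta=0.0001,
                          cooldown=1, min_lr=0.00001)

# The callback is then passed into the training, e.g. via the callbacks list
# of the cross-validation runner: cross_validation(..., callbacks=[cb_lr])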

I already updated the old KiTS19 Jupyter Notebook example in the dev branch and will merge it into the master branch at the end of this week.

Sorry for the confusion.

Itzikwa commented 4 years ago

Hi, thank you! I think you're right and that this was the problem. But after running it, I got this message: "Your session crashed after using all available RAM". This means that although I used only a few samples, the runtime still ran out of memory...

muellerdo commented 4 years ago

The number of samples in the data set doesn't influence the required GPU VRAM. The important factors for VRAM consumption are: image shape (patch shape), architecture, batch size, ...

I personally work with an NVIDIA Quadro RTX 6000 in our lab and don't have that much experience with Google Colab.

This Stack Overflow thread says that you get around 11GB of VRAM in a Colab: https://stackoverflow.com/questions/48750199/google-colaboratory-misleading-information-about-its-gpu-only-5-ram-available
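If you want to verify what hardware Colab actually assigned you, a quick check (assuming a TensorFlow 2.x runtime):

import tensorflow as tf

# Lists the GPUs visible to TensorFlow in the current Colab runtime;
# alternatively, run !nvidia-smi in a cell to see total and used VRAM
print(tf.config.list_physical_devices("GPU"))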

We now have several options for reducing the model complexity.

1. Reduce patch shape

The size of the patch is the main factor influencing the model complexity if we insist on using the 3D U-Net architecture. You can try a patch shape of (40x80x80) and see how well your model performs. Depending on your data set, you can additionally adjust the resampling in order to boost the performance for this patch shape again (see the resampling sketch after the code below). I would recommend aiming for a patch size of 1/8 of the median volume size after resampling (e.g., for a volume size of 512x512x512, a patch shape of 256x256x256).

pp = Preprocessor(data_io, data_aug=data_aug, batch_size=2, subfunctions=subfunctions,
                  prepare_subfunctions=True, prepare_batches=False,
                  analysis="patchwise-crop", patch_shape=(40, 80, 80))

2. Turn off batch normalization

The standard 3D U-Net takes around 8GB of VRAM without batch normalization and around 16GB with it. Turning off batch normalization in the architecture will therefore cut the required VRAM in half.

from miscnn import Neural_Network
from miscnn.neural_network.metrics import tversky_loss, dice_soft, dice_crossentropy
from miscnn.neural_network.architecture.unet.standard import Architecture

# Standard 3D U-Net without batch normalization (roughly halves VRAM usage)
unet = Architecture(batch_normalization=False)

# Note: "learninig_rate" (sic) matches the parameter name in this MIScnn version
model = Neural_Network(preprocessor=pp, loss=tversky_loss,
                       architecture=unet, metrics=[dice_soft, dice_crossentropy],
                       batch_queue_size=3, workers=3, learninig_rate=0.0001)

You can also try the plain U-Net, which is a simpler but equally powerful variant of the standard U-Net.

from miscnn.neural_network.architecture.unet.plain import Architecture

# Plain U-Net variant, again without batch normalization
unet = Architecture(batch_normalization=False)

model = Neural_Network(preprocessor=pp, loss=tversky_loss,
                       architecture=unet, metrics=[dice_soft, dice_crossentropy],
                       batch_queue_size=3, workers=3, learninig_rate=0.0001)

3. Switch to 2D

Another option is to use the NIfTI slicer I/O interface and run a 2D analysis. With this approach, you can automatically split the 3D volumes into 2D slices and run a standard 2D U-Net on them. This also allows full-image analysis at full resolution, because a 2D HD image doesn't take that much VRAM compared to a 3D volume. Whether utilizing the 3D information or a higher resolution leads to the best performance depends mostly on the data set.

# Initialize the NIfTI slicer I/O interface and configure the images as one channel
# (grayscale) and three segmentation classes (background, kidney, tumor)
from miscnn.data_loading.interfaces import NIFTIslicer_interface
interface = NIFTIslicer_interface(pattern="case_00[0-9]*", channels=1, classes=3)
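From there, each slice behaves like an ordinary 2D sample. A sketch of how the rest of the pipeline could be wired up (the data path and batch size are assumptions):

from miscnn import Data_IO, Preprocessor, Neural_Network

# The slicer interface enumerates every 2D slice as its own sample
data_io = Data_IO(interface, "data")
pp = Preprocessor(data_io, batch_size=2, analysis="fullimage")

# With a 2D interface, MIScnn builds the 2D variant of the U-Net
model = Neural_Network(preprocessor=pp)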
muellerdo commented 4 years ago

I will close this issue due to inactivity.

If this issue has not been solved yet, please do not hesitate to reopen it.