frankkramer-lab / MIScnn

A framework for Medical Image Segmentation with Convolutional Neural Networks and Deep Learning
GNU General Public License v3.0

Questions on preprocessing parameters #70

Closed: dianemarquette closed this issue 3 years ago

dianemarquette commented 3 years ago

Hi!

Thank you so much for developing the MIScnn framework. The clean code and detailed documentation are very much appreciated :).

I'm not sure if I'm posting this question at the right location. I already posted a comment in issue #49 10 days ago (because it's directly linked to @muellerdo's answer). But as the issue has been closed for a long time, I think my question will be more visible if I create a new thread. I hope I didn't make a mess 😅.

I'm currently working on a neural network to segment CT scans of hands. I plan on using the KiTS19 example notebook as a basis. However, I'm not sure which preprocessing parameters I should choose.

From what I understood after going through MIScnn's documentation and discussions, to find the best voxel spacing, I must first choose a suitable patch size.

I increase my patch size to the maximum allowed by the GPU VRAM (160x160x80 for a standard U-Net with 16GB VRAM)

What formula / rule do you use to infer your patch size from your GPU VRAM? In my case, I have an Nvidia RTX 2080 Super (with 8GB GDDR6 as VRAM) and I want to segment the trapezium bone from hand CT scans (original data = 512 x 512 with slices from 378 to 800) using U-Net. What would you recommend?

In addition, how do you find the correct patch overlap size? Is it only through experimentation? Is there any rule of thumb I could use?

Last question, how do you choose the batch size? I saw you used a size 2 in your examples and I know it depends on your GPU and model size, but how can I compute it precisely for my own dataset / model?

Thank you for sharing your expertise! Cheers,

Diane

muellerdo commented 3 years ago

Hey @dianemarquette,

Sorry for the late response. Thank you for your kind words and interest in using MIScnn.

I'm not sure if I'm posting this question at the right location. I already posted a comment in issue #49 10 days ago (because it's directly linked to @muellerdo's answer). But as the issue has been closed for a long time, I think my question will be more visible if I create a new thread. I hope I didn't make a mess 😅.

No problem. It's always better to open a new issue, for a better overview. :)

What formula / rule do you use to infer your patch size from your GPU VRAM? In my case, I have an Nvidia RTX 2080 Super (with 8GB GDDR6 as VRAM) and I want to segment the trapezium bone from hand CT scans (original data = 512 x 512 with slices from 378 to 800) using U-Net. What would you recommend?

I would recommend going as high as possible with your patch size and using a fixed mini-batch size of 2 or 4 for 3D volumes. With a standard U-Net and 8GB VRAM, I would start with 64^3 cuboids and, if those work, go upwards until Out of Memory (OOM) exceptions are thrown at you.
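If it helps, here is a minimal sketch of that scaling loop. It uses a deliberately tiny Keras stand-in instead of MIScnn's actual U-Net (the layers and sizes are illustrative assumptions only) and catches TensorFlow's ResourceExhaustedError to find the OOM boundary; swap in your real architecture to get a meaningful result:

```python
# Sketch only: probe increasing cubic patch sizes with a toy 3D network.
# The model below is an illustrative stand-in, NOT MIScnn's U-Net.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

def toy_3d_net(patch_shape):
    inp = layers.Input(shape=(*patch_shape, 1))
    x = layers.Conv3D(32, 3, padding="same", activation="relu")(inp)
    skip = x
    x = layers.MaxPooling3D(2)(x)
    x = layers.Conv3D(64, 3, padding="same", activation="relu")(x)
    x = layers.UpSampling3D(2)(x)
    x = layers.Concatenate()([x, skip])
    out = layers.Conv3D(2, 1, activation="softmax")(x)
    return Model(inp, out)

for size in (64, 80, 96, 112, 128):
    patch = (size, size, size)
    tf.keras.backend.clear_session()  # free the previous graph/weights
    try:
        model = toy_3d_net(patch)
        model.compile(optimizer="adam", loss="categorical_crossentropy")
        x = np.zeros((2, *patch, 1), dtype="float32")  # mini-batch size 2
        y = np.zeros((2, *patch, 2), dtype="float32")
        model.train_on_batch(x, y)  # one train step is enough to trigger OOM
        print(patch, "fits")
    except tf.errors.ResourceExhaustedError:
        print(patch, "-> OOM; fall back to the previous size")
        break
```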

In addition, how do you find the correct patch overlap size? Is it only through experimentation? Is there any rule of thumb I could use?

I'm a fan of 1/4 of the patch size, but at least more than 15 pixels on each axis. The key point is to get multiple predictions for the patch edges, because these are less reliable in my experience.

A student of our lab (@Deathlymad) experimented a little bit with patch overlapping a few months ago, check this out:

[two attached plots from the patch overlap experiments]

I was using 1/2 patch overlapping before, but the experiments of @Deathlymad demonstrated that 1/4 is probably the best way to go.
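For concreteness, a tiny sketch of that rule of thumb (1/4 of the patch size per axis, with a floor of 16 voxels, i.e. just above the "more than 15" threshold). The attribute name in the last comment is taken from the KiTS19 example notebook; treat it as an assumption and check it against your MIScnn version:

```python
# Rule-of-thumb overlap: 1/4 of the patch size per axis, but at least 16 voxels.
def overlap_for(patch_shape, fraction=4, minimum=16):
    return tuple(max(p // fraction, minimum) for p in patch_shape)

patch_shape = (160, 160, 80)
print(overlap_for(patch_shape))  # -> (40, 40, 20)

# In MIScnn (assumption based on the KiTS19 example), the result would be
# assigned to the Preprocessor's overlap attribute, e.g.:
#   pp.patchwise_overlap = overlap_for(patch_shape)
```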

Last question, how do you choose the batch size? I saw you used a size 2 in your examples and I know it depends on your GPU and model size, but how can I compute it precisely for my own dataset / model?

More is always better. But, as always, we are limited here as well by the available GPU VRAM. If you have 2D data, I would recommend at least 8 as an absolute minimum; better would be starting with 16 samples per batch. For 3D data, it is possible to go quite low on batch size, simply because a single sample already contains quite a lot of data due to the z-axis. I would recommend using a batch size of 2 and utilizing the highest possible resolution you can get before going OOM. However, if you have quite small images (which is NOT the case for your CT scans), then I would recommend increasing the batch size.

Summarized, you have to balance resolution (image size) vs. the number of samples per batch. Both are knobs for extracting the maximum information during model fitting. You want both to be as high as possible, but I would always recommend resolution > batch size once you have already hit your minimum batch thresholds (e.g. 8/16 for 2D and 2 for 3D).
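There is no exact formula for this, so in practice you measure once and extrapolate. A hypothetical back-of-envelope sketch: the per-voxel constant k below is a placeholder you calibrate yourself, e.g. by watching nvidia-smi during one training step at a known patch/batch size:

```python
# Hypothetical heuristic: assume activation memory dominates and scales roughly
# linearly with the number of input voxels per batch. Calibrate k once for YOUR
# model (bytes of GPU memory per input voxel, measured e.g. via nvidia-smi),
# then extrapolate to candidate settings before actually trying them.
def approx_vram_gb(batch_size, patch_shape, k):
    voxels = batch_size * patch_shape[0] * patch_shape[1] * patch_shape[2]
    return voxels * k / 1e9

k = 3000  # placeholder value; replace with your own measurement
print(approx_vram_gb(2, (64, 64, 64), k))    # calibration point, ~1.6 GB here
print(approx_vram_gb(2, (160, 160, 80), k))  # candidate setting, ~12.3 GB here
```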

dianemarquette commented 3 years ago

Hi @muellerdo !

Thank you so much for taking the time to answer all my questions so thoroughly. I really appreciate it ☺️.

I will follow your guidelines to tune my preprocessing parameters based on my dataset and the hardware at my disposal.

Thanks again!

Diane