MIC-DKFZ / nnUNet


(3D) cascade for whole slide images #1056

Closed · thijsgelton closed this issue 1 year ago

thijsgelton commented 2 years ago

Hi Fabian,

Let me start off by thanking you and your group for this amazing piece of engineering. It is truly mind-blowing to see the perfect segmentations. I have been using nnUNet for some time now on a computational pathology (CP) problem that requires semantic segmentation. This is not the original use case of nnUNet, but it seems to work pretty well! However, in CP it is often the case that, due to the limited patch size and the high resolution, more context is required for the network to distinguish between certain classes. I was therefore thinking of adapting nnUNet so that it receives extra contextual information. The 3D cascade has a similar context mechanism built into it, but it is tailored to the 3D space, so I was wondering whether something like it could be used for 2D whole slide images.

The alternative for adding context in nnUNet would be an extra encoder branch that takes a patch at a higher spacing (2 mpp, 4 mpp, etc.) as input, merges its features at the bottleneck, and is guided by a classification loss at slide level (i.e. the diagnosis of the patient). This would also require some adjustments to the framework, such as accounting for the additional VRAM used by the extra encoder when computing the depth of the network and the patch size.
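
To make the idea concrete, here is a minimal PyTorch sketch of what I have in mind. This is not nnUNet code; `ContextUNet`, the channel widths, the fusion at the bottleneck and the auxiliary classification head are all just illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.InstanceNorm2d(out_ch, affine=True),
        nn.LeakyReLU(inplace=True),
    )


class TinyEncoder(nn.Module):
    """Three downsampling stages; returns the feature map of every stage."""
    def __init__(self, in_ch, widths=(32, 64, 128)):
        super().__init__()
        stages, prev = [], in_ch
        for w in widths:
            stages.append(nn.Sequential(conv_block(prev, w), nn.MaxPool2d(2)))
            prev = w
        self.stages = nn.ModuleList(stages)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # feats[-1] is the bottleneck


class ContextUNet(nn.Module):
    def __init__(self, in_ch=3, num_classes=5, num_diagnoses=3):
        super().__init__()
        self.main_enc = TinyEncoder(in_ch)   # high-resolution patch (e.g. 0.5 mpp)
        self.ctx_enc = TinyEncoder(in_ch)    # low-resolution context patch (e.g. 4 mpp)
        self.fuse = conv_block(128 + 128, 128)           # merge the two bottlenecks
        self.up1, self.dec1 = nn.ConvTranspose2d(128, 64, 2, 2), conv_block(64 + 64, 64)
        self.up2, self.dec2 = nn.ConvTranspose2d(64, 32, 2, 2), conv_block(32 + 32, 32)
        self.seg_head = nn.Conv2d(32, num_classes, 1)
        self.cls_head = nn.Linear(128, num_diagnoses)    # slide-level auxiliary loss

    def forward(self, patch, context_patch):
        f_main = self.main_enc(patch)
        f_ctx = self.ctx_enc(context_patch)
        # resample the context bottleneck to the spatial size of the main bottleneck
        ctx_bn = F.adaptive_avg_pool2d(f_ctx[-1], f_main[-1].shape[-2:])
        x = self.fuse(torch.cat([f_main[-1], ctx_bn], dim=1))
        x = self.dec1(torch.cat([self.up1(x), f_main[-2]], dim=1))
        x = self.dec2(torch.cat([self.up2(x), f_main[-3]], dim=1))
        seg = F.interpolate(self.seg_head(x), scale_factor=2, mode="bilinear",
                            align_corners=False)          # back to input resolution
        cls = self.cls_head(f_ctx[-1].mean(dim=(2, 3)))   # slide-level diagnosis logits
        return seg, cls


# e.g. a 256x256 tile at 0.5 mpp plus a 256x256 tile at 4 mpp around the same centre
net = ContextUNet()
seg_logits, slide_logits = net(torch.randn(2, 3, 256, 256), torch.randn(2, 3, 256, 256))
print(seg_logits.shape, slide_logits.shape)  # [2, 5, 256, 256] and [2, 3]
```

The context branch only feeds the bottleneck, so the skip connections and decoder of the main branch stay untouched; the extra VRAM cost is essentially one additional encoder.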

Thank you again for the framework and for all the support here on GitHub.

FabianIsensee commented 2 years ago

Hi there, it's an interesting use case we haven't considered so far. Adding a 2D cascade is certainly feasible and would be in the spirit of nnU-Net. Do you have any sample data (publicly available dataset) that I could use for development?

Running the 3D cascade on 2D data will not work; I recommend against it.

With nnU-Net being a general purpose segmentation framework I am not too enthusiastic about implementing what appears to be very problem-specific functionality such as multi-resolution inputs. That sounds rather complex with many things that could go wrong.

Best, Fabian

thijsgelton commented 2 years ago

Hi Fabian,

Sorry for the late reply.

> Do you have any sample data (publicly available dataset) that I could use for development?

You could take a look at the PAIP2019 dataset. You will need to sign up for it to get the data, but I believe they always accept.

I am currently working on concatenating a higher-spacing patch as context at the bottleneck of nnUNet and running experiments to see if it yields good results. One of the reasons we wanted to do this is that we are experiencing a phenomenon that looks like a sort of collapse: the network receives approximately the same type of tissue in neighbouring patches, yet fully commits to a different class. To illustrate, I have added two images below:

[Two screenshots: stitched whole-slide predictions in which neighbouring sliding-window patches of the same tissue are each confidently assigned a different class.]

Using a third-party library, I run inference with nnUNet on an entire digital pathology whole slide image (5 GB+), using a sliding window approach and stitching the results together. So beware that this is not the internal sliding window feature of nnUNet, purely because the full image would not fit into memory. I do use the same preprocessing steps and the Gaussian weighting on each patch, together with a 5-fold ensemble. The images show how nnUNet is very certain that all the tissue in one window belongs to one class, and then one window step further doubles down on another class. We were wondering whether this could be due to the fact that nnUNet is more or less set up to overfit on the training data (no early stopping etc.), or whether you know why this could happen.
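
For reference, the external tiling logic is roughly like the sketch below (simplified; `predict_tile` is a placeholder for the preprocessed 5-fold nnUNet ensemble call, and edge handling is left out):

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def gaussian_importance_map(tile_size, sigma_scale=1 / 8):
    """Centre-peaked weights so the borders of a tile contribute less to the stitched map."""
    m = np.zeros(tile_size, dtype=np.float32)
    m[tuple(s // 2 for s in tile_size)] = 1.0
    m = gaussian_filter(m, sigma=[s * sigma_scale for s in tile_size])
    return m / m.max()


def stitch_wsi(slide_shape, tile_size, step, num_classes, predict_tile):
    """Accumulate Gaussian-weighted tile predictions into a whole-slide argmax map."""
    h, w = slide_shape
    th, tw = tile_size
    probs = np.zeros((num_classes, h, w), dtype=np.float32)   # could be a memmap for huge slides
    weights = np.zeros((h, w), dtype=np.float32)
    gmap = gaussian_importance_map(tile_size)
    for y in range(0, h - th + 1, step):                      # edge tiles omitted for brevity
        for x in range(0, w - tw + 1, step):
            tile_probs = predict_tile(y, x, th, tw)           # (num_classes, th, tw) softmax
            probs[:, y:y + th, x:x + tw] += tile_probs * gmap
            weights[y:y + th, x:x + tw] += gmap
    probs /= np.maximum(weights, 1e-8)
    return probs.argmax(axis=0)                               # stitched segmentation map


# quick smoke test with a dummy predictor
seg = stitch_wsi((2048, 2048), (512, 512), step=256, num_classes=5,
                 predict_tile=lambda y, x, th, tw: np.full((5, th, tw), 0.2, dtype=np.float32))
```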

Thank you for all your efforts.

FabianIsensee commented 2 years ago

Hey, that looks very strange. I am always a bit wary of people writing their own inference scripts, because more often than not something gets broken in the process. Why not make minimally invasive changes to the nnU-Net inference pipeline? Changing the arrays where the outputs are stored to memmaps should already fix the out-of-memory issues (as long as the preprocessing still succeeds).

Regardless, can you remind me which classes the different colors are supposed to be? That would make it easier for me to understand what's going on. Have you tried running inference on a crop around this area with the default nnU-Net pipeline? Maybe that does not produce this behavior?

Best, Fabian
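
PS: sketched out, the memmap idea could look roughly like the snippet below; the shapes, dtypes and file names are just placeholders, not the actual nnU-Net arrays.

```python
import numpy as np

# 5 classes on a ~120k x 110k slide would need well over 100 GB as float16, which is fine
# on disk but not in RAM; the slicing/accumulation code stays exactly the same as for ndarrays
num_classes, h, w = 5, 120_000, 110_000
probs = np.memmap("aggregated_softmax.dat", dtype=np.float16, mode="w+",
                  shape=(num_classes, h, w))
weights = np.memmap("aggregation_weights.dat", dtype=np.float16, mode="w+",
                    shape=(h, w))

# tiles are accumulated exactly as with in-memory arrays, e.g.:
# probs[:, y:y + th, x:x + tw] += tile_probs * gaussian_map
# weights[y:y + th, x:x + tw] += gaussian_map
probs.flush()
weights.flush()
```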

thijsgelton commented 2 years ago

Hi Fabian,

Thanks for the quick reply. I understand that it is hard to figure out what the problem could be when other people's code is involved as well. I will try running the nnUNet inference pipeline on the full image, but the problem is that converting a histopathology TIFF file at 0.5 spacing to NIfTI format is, as far as I know, impossible due to the enormous size. With the third-party library I could increase the input patch size by a lot, to decrease the number of tiles and thus the number of times this 'switching' phenomenon can occur.

To answer your question about running it on a crop around that region: the segmentation map basically already shows that, within a single sliding-window input, nnUNet does not switch classes (so that's awesome). What we are most interested in is why nnUNet shows no "doubt" at all within a single patch; the output in the image does not use any postprocessing. Could it be that nnUNet is more or less set up to overfit on the data? To exemplify: the purple segmentations on the left do show a bit of "doubt", as they are sometimes mixed with pink, but this really only happens at the border of the tissue, which can for now be regarded as an artifact. In the center of the tissue you would never see this; there it is always this perfect segmentation.

The colours are as follows:

Pink: hyperplasia without atypia
Purple: hyperplasia with atypia
Red: carcinoma
Blue: stroma
Green: rest

Pink, purple and red are classes that lie on a spectrum of increasing severity. Pathologically, the difference between purple and red is the amount (ratio) of stroma. So, judging by eye, it is not too strange that the network switches between these classes, but the fact that it does so with complete confidence still stumps us.

Thank you for your time and help.

FabianIsensee commented 2 years ago

Hey, thanks for all the details. What exactly is the problem with converting such a large histopathology image to NIfTI? It should work, but it will of course be slow.

Apart from that, it is really hard to say what might be going on, especially because the classes seem to lie on a linear scale of severity with fluid borders in between. One thing that could cause entire patches to flip to pink or red is the use of instance normalization. You could try the nnUNetTrainerV2_BN trainer and see if that improves things.

Best, Fabian
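
PS: here is a toy illustration of the instance normalization point, using untrained layers only; it is a simplification, not a full explanation of the behavior you are seeing.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
patch = torch.rand(1, 3, 64, 64)
darker = patch * 0.6 + 0.1            # same "tissue", globally darker and lower contrast

inorm = nn.InstanceNorm2d(3)
# InstanceNorm rescales every patch by its own statistics, so the global intensity
# difference is erased and both patches look identical to downstream layers:
print(torch.allclose(inorm(patch), inorm(darker), atol=1e-3))   # True

bnorm = nn.BatchNorm2d(3)
bnorm.eval()                          # inference mode: uses fixed (running) statistics
# BatchNorm applies one set of statistics to all patches, so the difference survives:
print(torch.allclose(bnorm(patch), bnorm(darker), atol=1e-3))   # False
```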

thijsgelton commented 2 years ago

Hi,

NIfTI only allows about 32k pixels per axis (the header stores each dimension as a 16-bit integer), and the images I use often exceed 100k by 100k. Thank you for the tip about batch normalization though; I will try it out and report back once I have the results.
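
For reference, the limit can be seen directly in the NIfTI-1 header, which stores each dimension as a signed 16-bit integer. A small illustrative check (assumes nibabel is installed):

```python
import nibabel as nib

hdr = nib.Nifti1Header()
print(hdr.structarr["dim"].dtype)            # int16 -> each axis is capped at 32767 voxels

try:
    hdr.set_data_shape((100_000, 100_000))   # a typical WSI extent at 0.5 spacing
except Exception as exc:                     # nibabel rejects shapes that overflow int16
    print("rejected:", exc)
```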