AICAN-Research / FAST-Pathology

⚡ Open-source software for deep learning-based digital pathology
BSD 2-Clause "Simplified" License

Support for multi-label classification #68

Status: Closed (closed by MarkusDrange 1 year ago)

MarkusDrange commented 1 year ago

I am trying to integrate a multi-label classification model into FAST Pathology, but I am running into some problems:

Ideally, all classes would be listed as classes. Because the model produces a separate output per class, I have to create a separate process object and heatmap for each one, so they now appear as layers instead. In addition, the colour attribute is not kept: all heatmaps are initialized as red.

This is the pipeline I am running:

PipelineDescription "Multi task patch-wise image classification trained from ADP dataset"
PipelineInputData WSI "Whole-slide image"
PipelineOutputData heatmap stitcher 0
PipelineOutputData heatmap1 stitcher1 0
PipelineOutputData heatmap2 stitcher2 0
PipelineOutputData heatmap3 stitcher3 0
PipelineOutputData heatmap4 stitcher4 0
PipelineOutputData heatmap5 stitcher5 0
PipelineOutputData heatmap6 stitcher6 0
PipelineOutputData heatmap7 stitcher7 0
PipelineOutputData heatmap8 stitcher8 0

Attribute classes "Tissue"

#"E;C;H;S;A;M;N;G;T"

### Processing chain

ProcessObject tissueSeg TissueSegmentation
Attribute threshold 85
Input 0 WSI

ProcessObject patch PatchGenerator
Attribute patch-size 256 256
Attribute patch-magnification 5
Attribute patch-overlap 0.0
Attribute mask-threshold 0.05
Input 0 WSI
Input 1 tissueSeg 0

ProcessObject network NeuralNetwork
Attribute inference-engine "OpenVINO"
#Attribute scale-factor 0.00392156862
Attribute model "$CURRENT_PATH$/../models/Model_prod.onnx"
Attribute dimension-ordering "channel-first"
Input 0 patch 0

ProcessObject stitcher PatchStitcher
Input 0 network 0

ProcessObject stitcher1 PatchStitcher
Input 0 network 1

ProcessObject stitcher2 PatchStitcher
Input 0 network 2

ProcessObject stitcher3 PatchStitcher
Input 0 network 3

ProcessObject stitcher4 PatchStitcher
Input 0 network 4

ProcessObject stitcher5 PatchStitcher
Input 0 network 5

ProcessObject stitcher6 PatchStitcher
Input 0 network 6

ProcessObject stitcher7 PatchStitcher
Input 0 network 7

ProcessObject stitcher8 PatchStitcher
Input 0 network 8

### Renderers
Renderer imgRenderer ImagePyramidRenderer
Input 0 WSI

Renderer heatmap HeatmapRenderer
Attribute interpolation false
#Attribute hidden-channels 0
#Attribute channel-colors "0" "green" "1" "green" "2" "magenta" "3" "red"
Input 0 stitcher 0

### Renderers
Renderer imgRenderer ImagePyramidRenderer
Input 0 WSI

Renderer heatmap HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "green"
Input 0 stitcher 0

Renderer heatmap1 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "blue"
Input 0 stitcher1 0

Renderer heatmap2 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "yellow"
Input 0 stitcher2 0

Renderer heatmap3 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "black"
Input 0 stitcher3 0

Renderer heatmap4 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "magenta"
Input 0 stitcher4 0

Renderer heatmap5 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "white"
Input 0 stitcher5 0

Renderer heatmap6 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "cyan"
Input 0 stitcher6 0

Renderer heatmap7 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "brown"
Input 0 stitcher7 0

Renderer heatmap8 HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "red"
Input 0 stitcher8 0

andreped commented 1 year ago

Hmm, I guess this happens because we cannot set the Attribute classes "Tissue" line at the top of the pipeline separately for each output. Or maybe I am wrong, @smistad?

I believe the results of the individual outputs should be stored on disk, and you should be able to access them in FP, since you have defined 9 separate PipelineOutputData entries (heatmap, heatmap1, ..., heatmap8). But I guess colour information and the like will not be stored, and perhaps the View widget will display the same class names for all of them?

smistad commented 1 year ago

Sounds like what you really need is a tensor concatenation process object, so that you can concatenate all the output tensors from the neural network into one big tensor. Then you would only need 1 patch stitcher, 1 heatmap renderer, and 1 pipeline output data, and each channel would then be a separate class.

We could add this to FAST, but it is also very easy to add to your model yourself. Take your trained model and append a concatenation layer at the end, so you end up with only one output. I have done this before with Keras, so let me know if you need help with it.

This will of course only work if all output tensors have the same size. Is that the case?
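
If the model were re-exported with a single concatenated output, as suggested above, the pipeline could shrink to something like the sketch below. It reuses only attributes that already appear in the pipeline from the first post. The model filename Model_prod_concat.onnx is hypothetical, the class list is taken from the commented-out line in the original pipeline, and the mapping of channels to colours assumes the concatenation keeps the original output order. Whether the HeatmapRenderer would then show the nine sigmoid channels independently, rather than one class per patch, is not confirmed in this thread.

```
PipelineDescription "Multi task patch-wise image classification trained from ADP dataset"
PipelineInputData WSI "Whole-slide image"
# A single output instead of nine
PipelineOutputData heatmap stitcher 0
# Assumed: one class name per channel of the concatenated output
Attribute classes "E;C;H;S;A;M;N;G;T"

### Processing chain

ProcessObject tissueSeg TissueSegmentation
Attribute threshold 85
Input 0 WSI

ProcessObject patch PatchGenerator
Attribute patch-size 256 256
Attribute patch-magnification 5
Attribute patch-overlap 0.0
Attribute mask-threshold 0.05
Input 0 WSI
Input 1 tissueSeg 0

ProcessObject network NeuralNetwork
Attribute inference-engine "OpenVINO"
# Hypothetical filename: the same model with a concatenation layer appended
Attribute model "$CURRENT_PATH$/../models/Model_prod_concat.onnx"
Attribute dimension-ordering "channel-first"
Input 0 patch 0

# One stitcher for the single 9-channel output
ProcessObject stitcher PatchStitcher
Input 0 network 0

### Renderers

Renderer imgRenderer ImagePyramidRenderer
Input 0 WSI

# One renderer, one colour per channel (same colours as the nine renderers in the first post)
Renderer heatmap HeatmapRenderer
Attribute interpolation false
Attribute channel-colors "0" "green" "1" "blue" "2" "yellow" "3" "black" "4" "magenta" "5" "white" "6" "cyan" "7" "brown" "8" "red"
Input 0 stitcher 0
```

Compared with the original pipeline, this removes eight stitchers, eight renderers, and eight PipelineOutputData lines, at the cost of modifying the exported model.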

andreped commented 1 year ago

> Sounds like what you really need is a tensor concatenation process object

In multi-label classification, each image patch can belong to more than one class. In this use case that is actually the more common situation, which is why we need one output per class. This differs from regular multi-class classification: we do not want to apply an argmax and get only one predicted class per image.

We actually split the k output neurons into separate output branches and add a sigmoid to each, which we can then use to generate heatmaps for individual tissue types in FP. Metric-wise this seems to work just fine, but I believe @MarkusDrange needs to assess how it works on a full WSI first, which is where FP comes in.

For instance, one output could be all tissue, another could be all stroma, and the last output could be a specific subclass of stroma or some tubular formation. In the general case there will be even more overlaps.

I believe the main problem is that FP does not handle multi-output networks appropriately. I believe I have seen the same with multiple instance learning models before, where I was forced to study only one output at a time to get a useful rendering experience.

smistad commented 1 year ago

I see, multi-label classification is not a use case we have tried with FAST and FP yet. The PatchStitcher and HeatmapRenderer will probably need some modifications for it to work.

But from what I understand this pipeline works, it is just inconvenient to have multiple layers, and the colors are not stored properly?

andreped commented 1 year ago

> I see, multi-label classification is not a use case we have tried with FAST and FP yet. The PatchStitcher and HeatmapRenderer will probably need some modifications for it to work.

I think we can regard multi-label classification as just another multi-output model. At least that is how it is handled for training and inference in PyTorch. Of course, for 9 classes you need 9 separate outputs, with 9 separate stitchers and 9 separate renderers in the FPL, but I don't see that as a major problem, and I am not sure how one could solve this otherwise. It is relevant to store segmentation heatmaps for each class separately.

> But from what I understand this pipeline works, it is just inconvenient to have multiple layers, and the colors are not stored properly?

The pipeline seems to work, yes. As far as I understand it, the problems are: 1) It is not possible to set class names for individual outputs, hence class names are not displayed properly in the View widget. 2) All outputs are given the same colour (at least after loading the results again).

It might also be that only one result is rendered after running the pipeline, which only shows after restarting FP and loading the project. @MarkusDrange, any comments on this? @smistad, @MarkusDrange could share his model with you if you wish to debug this yourself.

MarkusDrange commented 1 year ago

Works as intended now with the latest artifact on Ubuntu 20.04. Thanks!

andreped commented 1 year ago

Great! All credit to @smistad!

Then I believe this issue can be closed. A new release of FP will likely be published in the coming weeks. Stay tuned!