SBU-BMI / wsinfer

🔥 🚀 Blazingly fast pipeline for patch-based classification in whole slide images
https://wsinfer.readthedocs.io
Apache License 2.0

[ENH] add strided tiles #202

Closed kaczmarj closed 4 months ago

kaczmarj commented 8 months ago

add inference on overlapping patches. this can smooth outputs and could be more informative.

we will need to resolve how to save these outputs. at the moment, each row of the output CSVs holds one patch. we could simply include all of the patches, despite overlapping. or we could do some post-processing of the outputs, where we merge patch outputs and recover non-overlapping patches. i'm not sure yet.

i would appreciate any thoughts on how the outputs should be structured

cc: @xalexalex

related to https://github.com/qupath/qupath-extension-wsinfer/issues/46

xalexalex commented 8 months ago

I think the most transparent way would be to simply leave all (overlapping) tiles in the csv and let the user (or consumer, e.g. the qupath extension) decide how to handle them. Doing things like averaging could work for simple binary models, but it might not do what the user expects in multiclass classification tasks and in edge cases (such as edge tiles :) ).
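To make the multiclass concern concrete, here is a minimal sketch with made-up numbers (the probabilities and class count are purely illustrative and do not come from any wsinfer model). Two overlapping tiles each favor a different class, yet their average favors a third class that neither tile predicted.

```python
# Hypothetical class probabilities for two overlapping tiles (3 classes).
tile_a = [0.6, 0.4, 0.0]  # tile A predicts class 0
tile_b = [0.0, 0.4, 0.6]  # tile B predicts class 2


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Element-wise mean of the two probability vectors.
averaged = [(a + b) / 2 for a, b in zip(tile_a, tile_b)]

print(argmax(tile_a), argmax(tile_b), argmax(averaged))
```

Whether this behavior is acceptable depends on the model and the task, which is one reason to leave the merging decision to the consumer of the CSV.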

However I might be mistaken and averaging might be the most convenient choice. I'm interested in other opinions and will be following this thread. Thanks for the ping!

p.s. doing inference on strided tiles might open up room for performance improvements (model architecture-dependent). But it's best if we discuss these later.

kaczmarj commented 5 months ago

the step size in these lines would have to be updated to support strided patches. i think we could add an argument to this function like overlap where we specify the ratio of overlaps. default would be zero, meaning no overlap. 0.2 would be 20% overlap, and 0.5 would be 50% overlap.

in the case of 0 overlap (default), the step size would be patch_size_pixels. with 0.2 overlap, the step size would be `(1-0.2)*patch_size_pixels == 0.8*patch_size_pixels`.
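The step-size arithmetic above can be sketched as a small helper. This is not the code in `patch.py`; the function name `step_size_pixels`, the rounding to whole pixels, and the error handling are all assumptions made for illustration.

```python
def step_size_pixels(patch_size_pixels: int, overlap: float = 0.0) -> int:
    """Return the distance in pixels between the origins of adjacent patches.

    overlap is the fraction of a patch shared with its neighbor:
    0 means no overlap, 0.2 means 20% overlap.
    """
    if overlap >= 1:
        # An overlap of 1 or more would give a step of zero or less.
        raise ValueError("overlap must be less than 1")
    # Rounding to whole pixels is an assumption of this sketch.
    step = round((1 - overlap) * patch_size_pixels)
    if step < 1:
        raise ValueError("overlap is too large for this patch size")
    return step


print(step_size_pixels(350))        # no overlap: step equals the patch size
print(step_size_pixels(350, 0.2))   # 20% overlap
print(step_size_pixels(350, 0.5))   # 50% overlap
```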

the overlap variable would have to bubble up the call stack, up to the command line interface.

https://github.com/SBU-BMI/wsinfer/blob/0b1208c78ff1c8ba5e83332a7401d9ab7e6e3062/wsinfer/patchlib/patch.py#L157-L164

kaczmarj commented 4 months ago

@xalexalex - strided patches are implemented now! please see #218 for examples of output.

the main change is the addition of the --patch-overlap-ratio option to wsinfer run. that value controls the level of overlap and can be in (-inf, 1). the default is 0, which indicates no overlap. a value of 0.1 means 10% overlap.

in the case of 10% overlap... let's say the patches are 100 um by 100 um. 10% overlap in this case means that each patch would overlap with the next patch by 10 um.

a negative value for --patch-overlap-ratio gives strides with holes. a value of -1 skips every other patch, for example.

here's an example of a command line call that skips every other patch.

wsinfer run -m breast-tumor-resnet34.tcga-brca -i slides/ -o outputs-overlap-minus1/ --patch-overlap-ratio -1
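To see how the ratio plays out along one axis of a slide, here is a rough sketch that lists patch origins for a few ratios. It is not the wsinfer implementation: `patch_origins` is a hypothetical helper, the 500-unit extent is made up, and partial patches at the edge are simply dropped.

```python
def patch_origins(extent: int, patch_size: int, overlap_ratio: float = 0.0) -> list[int]:
    """List patch start coordinates along one axis (hypothetical helper).

    Only patches that fit entirely within the extent are kept.
    """
    if overlap_ratio >= 1:
        raise ValueError("overlap_ratio must be less than 1")
    # Negative ratios make the step larger than the patch, leaving gaps.
    step = round((1 - overlap_ratio) * patch_size)
    return list(range(0, extent - patch_size + 1, step))


print(patch_origins(500, 100, 0))     # adjacent patches, no overlap
print(patch_origins(500, 100, 0.1))   # each patch overlaps the next by 10 units
print(patch_origins(500, 100, -1))    # every other patch is skipped
```

With a ratio of -1 the step is twice the patch size, so a patch-sized gap sits between consecutive patches.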

please let me know what you think!