DIAGNijmegen / pathology-whole-slide-data

A package for working with whole-slide data including a fast batch iterator that can be used to train deep learning models.
https://diagnijmegen.github.io/pathology-whole-slide-data/
Apache License 2.0
92 stars, 27 forks

Sliding window with Segmentation Mask Sampling #23

Closed (Vishwesh4 closed this issue 2 years ago)

Vishwesh4 commented 2 years ago

Hi Mart, thanks for this wonderful package. I am having trouble combining sliding-window sampling with segmentation masks. I am currently using this YAML configuration:

wholeslidedata:
    default:
        seed: 35
        yaml_source: test.yml

        label_map:
            roi: 0
            invasive tumor: 1
            tumor-associated stroma: 2
            in-situ tumor: 3
            healthy glands: 4
            necrosis not in-situ: 5
            inflamed stroma: 6
            rest: 7
            lymphocytes and plasma cells: 8

        annotation_parser:
            sample_label_names: ['roi']

        point_sampler:
            attribute: CenterPointSampler

        patch_sampler:
            center: True

        annotation_sampler:
            attribute: OrderedAnnotationSampler

However, these are the points that are being sampled:

idx 0 x_shape (256, 256, 3) mask_shape (256, 256) POINT (71516 73734)
idx 1 x_shape (256, 256, 3) mask_shape (256, 256) POINT (90248 33833)
idx 2 x_shape (256, 256, 3) mask_shape (256, 256) POINT (11440 12248)
idx 3 x_shape (256, 256, 3) mask_shape (256, 256) POINT (22092 13028)
idx 4 x_shape (256, 256, 3) mask_shape (256, 256) POINT (35779 7162)
idx 5 x_shape (256, 256, 3) mask_shape (256, 256) POINT (50652 53262)

I tried following the slidingwindow.yml given for tiger, but it uses a different annotation parser which is not valid for my case.

Please help me out. Thank you, Vishwesh

martvanrijthoven commented 2 years ago

Dear Vishwesh,

I have created an example of how to use the sliding-window approach with masks here: https://github.com/DIAGNijmegen/pathology-whole-slide-data/blob/main/notebooks/tiger/sliding_window/SlidingWindow.ipynb

Please have a look at the config used in this example, which can be found here: https://github.com/DIAGNijmegen/pathology-whole-slide-data/blob/main/notebooks/tiger/sliding_window/slidingwindowconfig.yml

I have added comments to the config file, so I hope it is understandable.

Please let me know if you have questions about it.

Best wishes, Mart

martvanrijthoven commented 2 years ago

Dear Vishwesh,

To add to my previous message: the sliding-window approach can only be used with tissue masks/ROIs and the MaskAnnotationParser. Could you let me know why this is not valid in your case?

Best wishes, Mart

Vishwesh4 commented 2 years ago

Hi Mart, I am trying to use a sliding window to extract patches, together with the segmentation labels inside the ROIs, and save them for training (with an annotation parser and SegmentationPatchLabelSampler). Random sampling causes oversampling and extracts almost the same patch repeatedly. Please let me know if there is a workaround for this issue. Thanks

martvanrijthoven commented 2 years ago

Dear Vishwesh,

This package currently does not support a sliding window approach to extract patches and segmentation labels from ROIs in the way you envision. You could make a point sampler that does this. You can subclass PointSampler and return points in a sliding window fashion per ROI. Maybe I will implement this in the future, but I am sorry to say that I don't have time to implement this right now.
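To illustrate the idea of returning points in a sliding-window fashion per ROI, here is a minimal plain-Python sketch. It does not use the package's PointSampler interface (whose method signatures are not shown in this thread); the function names `point_in_polygon` and `sliding_window_centers` are hypothetical, and the ROI is assumed to be a simple polygon given as a list of (x, y) vertices.

```python
from typing import Iterator, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies strictly inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sliding_window_centers(polygon: List[Point], step: int) -> Iterator[Point]:
    """Yield grid-aligned center points inside the polygon, row by row."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    min_x, max_x = int(min(xs)), int(max(xs))
    min_y, max_y = int(min(ys)), int(max(ys))
    half = step // 2
    for y in range(min_y + half, max_y, step):
        for x in range(min_x + half, max_x, step):
            if point_in_polygon((x, y), polygon):
                yield (x, y)


# Hypothetical 512x512 ROI, sampled with a 256-pixel step.
roi = [(0, 0), (512, 0), (512, 512), (0, 512)]
centers = list(sliding_window_centers(roi, step=256))
```

A custom point sampler could hand out these centers one at a time, so each patch location in the ROI is visited exactly once instead of being drawn at random.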

However, the RandomPointSampler should not oversample: it is random, so every point in an annotation has an equal probability of being sampled and used as the center point of a patch. A problem might be that some ROIs are smaller than others while each ROI is sampled equally often. To overcome this, you can use the AreaAnnotationSampler as the annotation sampler. This way, the area of each ROI is taken into account, and patches are sampled more often from ROIs with larger areas.
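The effect of area-based sampling can be sketched in a few lines of plain Python. This only illustrates the principle and is not the AreaAnnotationSampler's actual code; the function `sample_rois_by_area` and the ROI names and areas are made up for the example.

```python
import random
from typing import Dict, List


def sample_rois_by_area(areas: Dict[str, float], n: int, seed: int = 0) -> List[str]:
    """Draw n ROI names with probability proportional to each ROI's area."""
    rng = random.Random(seed)
    names = list(areas)
    weights = [areas[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


# Hypothetical ROIs: the large one covers nine times the area of the small one.
areas = {"small_roi": 100.0, "large_roi": 900.0}
draws = sample_rois_by_area(areas, n=10_000)
large_fraction = draws.count("large_roi") / len(draws)
```

With these weights, roughly nine out of ten draws should come from the larger ROI, so a small ROI is no longer visited as often as a large one and its few distinct patches are repeated less.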

I hope I made myself clear. In any case, with the samplers currently available, I think oversampling should not be an issue, and I am happy to help you further configure the batch iterator so that you can use it with the desired sampling strategy. Please let me know.

Best wishes, Mart

Vishwesh4 commented 2 years ago

Thanks a lot Mart!!!

martvanrijthoven commented 2 years ago

Dear Vishwesh,

Sliding window sampling is now implemented. You can use it like this:

parser = AsapAnnotationParser(hooks=(TiledAnnotationHook(tile_size=64, full_coverage=True),))

This will create 64x64 tile annotations covering all the annotations in the dataset. Using the CenterPointSampler and OrderedAnnotationSampler, you can then sample from these tile annotations in a sliding-window fashion. Please let me know if you have questions and if you need an example config file.
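As a rough picture of what tiling an annotation means, the sketch below splits a bounding box into fixed-size tiles and takes each tile's center in order. It is not the TiledAnnotationHook implementation. The helper names are hypothetical, and the meaning given to `full_coverage` here (keep border tiles that extend past the box so no region is left uncovered) is an assumption, not confirmed by this thread.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def tile_bounds(bounds: Box, tile_size: int, full_coverage: bool = True) -> List[Box]:
    """Split a bounding box into square tiles, row by row."""
    x_min, y_min, x_max, y_max = bounds
    tiles = []
    for y in range(y_min, y_max, tile_size):
        for x in range(x_min, x_max, tile_size):
            fits = x + tile_size <= x_max and y + tile_size <= y_max
            # Assumed semantics: with full_coverage, border tiles that stick
            # out of the box are kept; without it, they are dropped.
            if fits or full_coverage:
                tiles.append((x, y, x + tile_size, y + tile_size))
    return tiles


def tile_center(tile: Box) -> Tuple[int, int]:
    """Center point of a tile, as a center-point sampler would pick it."""
    x_min, y_min, x_max, y_max = tile
    return ((x_min + x_max) // 2, (y_min + y_max) // 2)


# Hypothetical 160x64 annotation bounding box, tiled at 64 pixels.
tiles = tile_bounds((0, 0, 160, 64), tile_size=64)
centers = [tile_center(t) for t in tiles]
```

Visiting the tiles in order and sampling each tile's center gives one patch location per tile, which is the sliding-window behaviour described above.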

Best wishes, Mart

Vishwesh4 commented 2 years ago

Thanks a lot for implementing this, Mart. I'll let you know in case I have questions.

michelbotros commented 2 years ago

@martvanrijthoven

I'd like to use this sliding window sampling. An example config file would be nice!

martvanrijthoven commented 2 years ago

Dear Michel,

I will share an example in this issue: #30

Best wishes, Mart