DIAGNijmegen / pathology-whole-slide-data

A package for working with whole-slide data including a fast batch iterator that can be used to train deep learning models.
https://diagnijmegen.github.io/pathology-whole-slide-data/
Apache License 2.0

Strict sampling with `point_sampler` fails when using data with different level 0 spacings #31

Closed leandervaneekelen closed 1 year ago

leandervaneekelen commented 1 year ago

The PointSampler classes have a 'buffer' argument that lets you specify a margin around the border of your sampling annotations that you shouldn't sample from. If you set buffer equal to half the width/height of your patch shape, you effectively enforce 'strict sampling', i.e. you only sample patches that lie entirely inside the sampling annotation.

However, the buffer argument is based on the level 0 spacing of whichever image you're sampling from. If your images have different level 0 spacings (practical example: I have images scanned at 0.25 um/px and 0.5 um/px and I want to sample at 0.5 um/px), a single buffer value corresponds to a different physical margin in each image, so it can't enforce strict sampling for all of them.
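To make the mismatch concrete, here is a small arithmetic sketch. It does not use the package's API; the function name and the 256 px patch size are assumptions chosen for illustration only.

```python
def half_patch_in_level0_px(patch_size_px, sampling_spacing, level0_spacing):
    """Half the patch width, expressed in level 0 pixels of one image.

    Hypothetical helper for illustration; not part of the package.
    Spacings are in um/px.
    """
    # How many level 0 pixels one pixel at the sampling spacing covers.
    downsample = sampling_spacing / level0_spacing
    return patch_size_px * downsample / 2


# Assumed patch size of 256 px, sampled at 0.5 um/px.
for level0_spacing in (0.25, 0.5):
    margin = half_patch_in_level0_px(256, 0.5, level0_spacing)
    print(f"level 0 spacing {level0_spacing} um/px: margin of {margin} level 0 px")
```

Under these assumptions the image scanned at 0.25 um/px needs a margin of 256 level 0 pixels, while the image scanned at 0.5 um/px needs only 128, so no single value is right for both.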

Proposed solution: make the PointSampler subclasses that use the 'buffer' argument aware of the spacings of the images, and convert 'buffer' to level 0 of the image with the smallest spacing. I'm not quite sure whether this works, though.

leandervaneekelen commented 1 year ago

Proposal from @martvanrijthoven: turn the 'buffer' parameter of the PointSamplers into a dict/named tuple that also contains the spacing at which the buffer is defined. Using this spacing, we can calculate the downsample factor needed to convert the buffer for each image.

We might have to add an edge case for when WSIs are not open yet (I'll look at this in more detail).
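A rough sketch of what this proposal could look like, assuming a named tuple that holds the buffer value and its spacing. `Buffer` and `buffer_at_level0` are hypothetical names for illustration, not the package's actual implementation, and the sketch ignores any sign convention for the buffer as well as the unopened-WSI edge case.

```python
from typing import NamedTuple


class Buffer(NamedTuple):
    """Hypothetical spacing-aware buffer (illustration only)."""

    value: float    # margin in pixels, measured at `spacing`
    spacing: float  # spacing in um/px at which `value` is defined


def buffer_at_level0(buffer: Buffer, level0_spacing: float) -> int:
    """Convert the buffer to level 0 pixels of a specific image."""
    # Downsample factor between the buffer's spacing and the image's level 0.
    downsample = buffer.spacing / level0_spacing
    return round(buffer.value * downsample)


# Half of an assumed 256 px patch, defined at the 0.5 um/px sampling spacing.
strict = Buffer(value=128, spacing=0.5)
print(buffer_at_level0(strict, level0_spacing=0.25))
print(buffer_at_level0(strict, level0_spacing=0.5))
```

Because the spacing travels with the value, each image can derive its own level 0 margin from its own level 0 spacing, which only becomes known once the WSI has been opened.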

martvanrijthoven commented 1 year ago

This is now implemented in the main branch and will be available in v0.1.0.