girder / large_image

Python modules to work with large multiresolution images.
http://girder.github.io/large_image/
Apache License 2.0
192 stars 42 forks source link

incorrect tile size from tileiterator: requested tile_size not explicitly respected #440

Closed YoniSchirris closed 3 years ago

YoniSchirris commented 4 years ago

Problem summary

When specifying a specific mm_x and tile_size in the tileIterator, there's some up- and down-scaling happening leading to floating-point numbers, which are rounded in intermediate steps. I assume that there's a possibility for inconsistency here.

Current result

With two different WSIs with different different mm_x at the same magnification level (0.001002 and 0.001008 at tile_magnification=10), and defining a tileiterator as

tile_iterator = li_slide.tileIterator(scale={'mm_x': 0.0012, 'mm_y': 0.0012},
                                              region=dict(left=0, top=0, right=region_width, bottom=region_height),
                                              tile_size=dict(width=224, height=224),
                                              tile_overlap=dict(x=0, y=0),
                                              format=large_image.tilesource.TILE_FORMAT_NUMPY,
                                              resample=PIL.Image.BICUBIC)

I get two different tile sizes, 224 (correct) and 223 (incorrect). Specifically, the incorrect WSI is TCGA-AG-A015-01A-02-BS2.ff3f2bc9-361a-4dd7-94da-f6950041fd2d.svs (can be downloaded from GDC data portal or requested from me)

Expected result

Whatever the source micron per pixel, whatever the requested micron per pixel, whatever the requested tile size, the resulting tiles should always be the tile size requested.

Proposed solution

I can't follow the exact flow of the iterator, but in large_image/tilesource/base.py there's e.g.

if resample is not False:
    tile_size['width'] = max(1, int(round(tile_size['width'] * requestedScale)))
    tile_size['height'] = max(1, int(round(tile_size['height'] * requestedScale)))
    tile_overlap['x'] = int(round(tile_overlap['x'] * requestedScale))
    tile_overlap['y'] = int(round(tile_overlap['y'] * requestedScale))

And at a later point the output tile size is again calculated by dividing through the requestedScale instead of resizing to the actually provided input size.

  # Add provisional width and height
            if self.resample not in (False, None) and self.requestedScale:
                self['width'] = max(1, int(
                    self['tile_width'] / self.requestedScale))
                self['height'] = max(1, int(
                    self['tile_height'] / self.requestedScale))
                if self.get('tile_magnification', None):
                    self['magnification'] = self['tile_magnification'] / self.requestedScale

where int() rounds down.

Changing round in the first code snippet to math.ceil (for tile_size! I haven't checked tile_overlap) works for my specific case. I've tried a wide variety of options (different tile_size, different mm_x) where math.ceil also gives the correct results while round doesn't. (but possibly for different settings and images this again wouldn't work)

But personally, I would explicitly call the requested tile_size if given by the user to ensure you always return the correct size, or at least throw an assertion error.

jonasteuwen commented 3 years ago

@manthey It appears to me @YoniSchirris is correct in his analysis? Likely the same needs to be done for the overlap though. Would you accept a PR? Is there also a test we can implement?