Bayer-Group / tiffslide

TiffSlide - cloud native openslide-python replacement based on tifffile

garbled tiles from CAMELYON16 dataset using tiffslide but not openslide #86

Open kaczmarj opened 1 month ago

kaczmarj commented 1 month ago

i have come across unexpected behavior when reading a tile from the CAMELYON16 dataset, slide test_019.tif. please see code to reproduce below.

download image test_019.tif using:

# python -m pip install awscli
aws s3 cp --no-sign-request s3://camelyon-dataset/CAMELYON16/images/test_019.tif .

Tile read using tiffslide

import tiffslide

x, y = 166888, 50248
s = 566
tslide = tiffslide.TiffSlide("test_019.tif")
tslide.read_region((x, y), level=0, size=(s, s))

image

Tile read using openslide

import openslide

x, y = 166888, 50248
s = 566
oslide = openslide.OpenSlide("test_019.tif")
oslide.read_region((x, y), level=0, size=(s, s))

image

takeaway

as you can see, the tile read using tiffslide is different from that using openslide. i wonder if jpeg2000 compression is an issue here?
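To put a number on the mismatch instead of comparing by eye, the two tiles can be compared byte by byte. The helper below is only a sketch: `max_abs_channel_diff` is a hypothetical name, not part of tiffslide or openslide, and it assumes both tiles have first been converted to the same mode and size (for example with PIL's `Image.convert("RGB")`) and serialized with `Image.tobytes()`.

```python
def max_abs_channel_diff(a: bytes, b: bytes) -> int:
    """Return the largest absolute per-byte difference between two raw pixel buffers.

    Hypothetical helper for illustration; both buffers must have the same length,
    i.e. come from tiles with identical size and mode.
    """
    if len(a) != len(b):
        raise ValueError(f"buffer sizes differ: {len(a)} != {len(b)}")
    # Iterating over bytes yields ints in 0..255; default=0 covers empty buffers.
    return max((abs(p - q) for p, q in zip(a, b)), default=0)


# Intended usage with the objects from the snippets above (not run here):
#   t = tslide.read_region((x, y), level=0, size=(s, s)).convert("RGB")
#   o = oslide.read_region((x, y), level=0, size=(s, s)).convert("RGB")
#   print(max_abs_channel_diff(t.tobytes(), o.tobytes()))
```

A result of 0 means the tiles are identical. Small nonzero values may only reflect decoder differences, while values near 255 point to genuinely garbled pixels.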

kaczmarj commented 1 month ago

i suspect the jpeg2000 compression has something to do with this, because i cannot reproduce in images that have other compression schemes.

the camelyon16 dataset uses jpeg2000.

kaczmarj commented 1 month ago

the difference is much more pronounced in test_034

# Download the image (shell)
aws s3 cp --no-sign-request s3://camelyon-dataset/CAMELYON16/images/test_034.tif .

# Read the same tile with both libraries (python)
import openslide
import tiffslide

x, y = 56606, 43386
s = 566

tslide = tiffslide.TiffSlide("test_034.tif")
oslide = openslide.OpenSlide("test_034.tif")

tslide.read_region((x, y), level=0, size=(s, s))
oslide.read_region((x, y), level=0, size=(s, s))

TiffSlide

image

Openslide

image

ap-- commented 1 month ago

Interesting... It seems something is different about level 0 (the full-resolution level) in the TIFF file.

This is broken in tifffile<=2024.4.24 and is fixed with tifffile>=2024.5.3

https://github.com/cgohlke/tifffile/blob/7230baf84200203c7d3f50f46324093cd71013fb/CHANGES.rst?plain=1#L32-L35

import tiffslide
import PIL.Image

s = tiffslide.open_slide("./test_034.tif")

PIL.Image.fromarray(s.zarr_group[1][::32,::32,:]).save("out1.png")
PIL.Image.fromarray(s.zarr_group[0][::64,::64,:]).save("out0.png")

level=1

out1

level=0 (tifffile<=2024.4.24)

out0

level=0 (tifffile>=2024.5.3)

out0c

kaczmarj commented 1 month ago

that's great! thanks for this. i'm glad the fix is already out. feel free to close the issue if there is nothing else to do.

ap-- commented 1 month ago

Until tiffslide requires a newer tifffile version we'll keep this open so that people find it more easily.
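In the meantime, a script that reads JPEG 2000 slides could check the installed tifffile version up front. This is a rough sketch, not something tiffslide ships: `parse_calver`, `has_fix`, and `installed_tifffile_has_fix` are hypothetical names, and the threshold `2024.5.3` is simply the first fixed release mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# First tifffile release reported as fixed in this thread (assumption taken from above).
FIXED_IN = (2024, 5, 3)


def parse_calver(v: str) -> tuple:
    """Parse a 'YYYY.M.D'-style version into a tuple of ints, ignoring suffixes like 'rc1'."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def has_fix(v: str) -> bool:
    """Return True if version string v is at or above the fixed release."""
    return parse_calver(v) >= FIXED_IN


def installed_tifffile_has_fix():
    """Return True/False for the installed tifffile, or None if it is not installed."""
    try:
        return has_fix(version("tifffile"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    status = installed_tifffile_has_fix()
    if status is False:
        print("tifffile is older than 2024.5.3; level-0 tiles may be garbled")
```

The parser only handles the plain calendar-style versions tifffile uses; a project that already depends on `packaging` may prefer its `Version` class.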

ap-- commented 1 month ago

And thank you for reporting, and providing a reproducible example ❤️