manzt / napari-lazy-openslide

Lazily load multiscale whole-slide images with openslide and dask
BSD 3-Clause "New" or "Revised" License

Different output in Windows vs WSL2 #18

Open choosehappy opened 3 weeks ago

choosehappy commented 3 weeks ago

I was hoping someone could quickly chime in with thoughts on where to look.

The code is very simple: load a WSI with OpenSlideStore and pull out one of the pyramid levels.

import zarr
import dask.array as da
import napari

from napari_lazy_openslide import OpenSlideStore

wsi_fname = '../10066_001.svs'

# Open the slide as a read-only zarr group and list its pyramid levels
store = OpenSlideStore(wsi_fname)
grp = zarr.open(store, mode="r")
datasets = grp.attrs["multiscales"][0]["datasets"]
pyramid = [grp.get(d["path"]) for d in datasets]
print(pyramid)

# Wrap each level as a lazy dask array
pyramid = [da.from_zarr(store, component=d["path"]) for d in datasets]
print(pyramid)

# Select all of level 2, then load it into memory
aa = pyramid[2][:, :, :]

bb = aa.compute()
print(bb)

When I run it on Windows, it works as expected:

[<zarr.core.Array '/0' (30987, 48649, 4) uint8 read-only>, <zarr.core.Array '/1' (7746, 12162, 4) uint8 read-only>, <zarr.core.Array '/2' (1936, 3040, 4) uint8 read-only>]
[dask.array<from-zarr, shape=(30987, 48649, 4), dtype=uint8, chunksize=(512, 512, 4), chunktype=numpy.ndarray>, dask.array<from-zarr, shape=(7746, 12162, 4), dtype=uint8, chunksize=(512, 512, 4), chunktype=numpy.ndarray>, dask.array<from-zarr, shape=(1936, 3040, 4), dtype=uint8, chunksize=(512, 512, 4), chunktype=numpy.ndarray>]
[[[243 243 243 255]
  [243 243 243 255]
  [243 243 243 255]
  ...
  [242 243 243 255]
  [242 243 243 255]
  [242 243 243 255]]

 [[243 243 243 255]
  [243 243 243 255]
  [243 243 243 255]
  ...
  [243 243 243 255]
  [243 243 243 255]
  [243 243 243 255]]

 [[243 243 243 255]
  [243 243 243 255]
  [243 243 243 255]
  ...
  [243 243 243 255]
  [243 243 243 255]
  [243 243 243 255]]

 ...

 [[243 243 243 255]
  [243 243 244 255]
  [244 243 244 255]
  ...
  [243 243 243 255]
  [243 243 243 255]
  [243 243 243 255]]

 [[244 243 244 255]
  [244 243 244 255]
  [244 243 244 255]
  ...
  [243 243 243 255]
  [243 243 243 255]
  [243 243 244 255]]

 [[244 243 244 255]
  [244 243 244 255]
  [244 243 244 255]
  ...
  [243 243 243 255]
  [243 243 243 255]
  [242 242 243 255]]]

But when I run it on WSL2, the data returned is incorrect (all zeros):

[<zarr.core.Array '/0' (30987, 48649, 4) uint8 read-only>, <zarr.core.Array '/1' (7746, 12162, 4) uint8 read-only>, <zarr.core.Array '/2' (1936, 3040, 4) uint8 read-only>]
[dask.array<from-zarr, shape=(30987, 48649, 4), dtype=uint8, chunksize=(512, 512, 4), chunktype=numpy.ndarray>, dask.array<from-zarr, shape=(7746, 12162, 4), dtype=uint8, chunksize=(512, 512, 4), chunktype=numpy.ndarray>, dask.array<from-zarr, shape=(1936, 3040, 4), dtype=uint8, chunksize=(512, 512, 4), chunktype=numpy.ndarray>]
[[[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  ...
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  ...
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  ...
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 ...

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  ...
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  ...
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  ...
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]]

Any idea where I should start looking? Thanks!

choosehappy commented 3 weeks ago

By the way, this works as expected in both WSL2 and Windows:

store._slide.read_region((0,0),2,(3040,1936))

So the image is readable and correctly located, and openslide itself is happy.

choosehappy commented 3 weeks ago

The problem also appears in an even simpler use case, with the code reduced to:

import zarr
import dask.array as da
import napari

from napari_lazy_openslide import OpenSlideStore

wsi_fname='../10066_001.svs'

store = OpenSlideStore(wsi_fname)
grp = zarr.open(store, mode="r")
grp[2][:,:,::]
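
One way to see which keys zarr asks the store for in this reduced case, without touching the package source, might be a thin logging wrapper. This is only a sketch: `LoggingStore` is a hypothetical helper, not part of napari-lazy-openslide or zarr, and it assumes the wrapped store behaves like a standard Python mapping. It records `__contains__` as well as `__getitem__`, since a missing `__getitem__` call says nothing about what was called instead.

```python
from collections.abc import MutableMapping


class LoggingStore(MutableMapping):
    """Hypothetical wrapper: forwards to an inner mapping and records key accesses."""

    def __init__(self, inner):
        self._inner = inner
        self.calls = []  # (method_name, key) tuples, in call order

    def __getitem__(self, key):
        self.calls.append(("getitem", key))
        return self._inner[key]

    def __contains__(self, key):
        self.calls.append(("contains", key))
        return key in self._inner

    def __setitem__(self, key, value):
        self._inner[key] = value

    def __delitem__(self, key):
        del self._inner[key]

    def __iter__(self):
        return iter(self._inner)

    def __len__(self):
        return len(self._inner)


# A plain dict stands in for the real store so the sketch runs without a slide.
demo = LoggingStore({".zgroup": b"{}"})
demo[".zgroup"]
"2/0.0.0" in demo
print(demo.calls)
```

With a real slide, the idea would be to pass `LoggingStore(OpenSlideStore(wsi_fname))` to `zarr.open` and inspect `.calls` afterwards. I have not verified this against every zarr release, and any optional store methods beyond the mapping protocol are not forwarded by the wrapper.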

choosehappy commented 3 weeks ago

If I modify store.py to add some print statements:

    def __getitem__(self, key: str):
        print(f"here: {key}")
        if key in self._store:
            # key is for metadata
            return self._store[key]

        # key should now be a path to an array chunk
        # e.g '3/4.5.0' -> '<level>/<chunk_key>'
        try:
            x, y, level = _parse_chunk_path(key)
            print(f"vals: {x=},{y=},{level=}")
            location = self._ref_pos(x, y, level)
            size = (self._tilesize, self._tilesize)
            print(f"{location=} {level=} {size=}")
            tile = self._slide.read_region(location, level, size)

the difference becomes obvious. This is the output from the WSL2 version, where no chunk keys are ever requested:

here: .zgroup
here: 2/.zarray

and this is the output from the (working) Windows version:

here: .zgroup
here: 2/.zarray
here: 2/0.0.0
vals: x=0,y=0,level=2
location=(0, 0) level=2 size=(512, 512)
here: 2/0.1.0
vals: x=1,y=0,level=2
location=(8194, 0) level=2 size=(512, 512)
.....
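
For reading the log lines above, here is a small sketch of how a chunk key maps to the printed values. It is inferred only from the output shown (`2/0.1.0` giving `x=1,y=0,level=2`); `parse_chunk_key` is a hypothetical stand-in and not the package's own `_parse_chunk_path`.

```python
def parse_chunk_key(key):
    """Split '<level>/<y>.<x>.<channel>' into (x, y, level).

    Assumption: the chunk part is ordered row (y), column (x), channel,
    which is what the printed values in the log suggest.
    """
    level, chunk = key.split("/")
    y, x, _channel = chunk.split(".")
    return int(x), int(y), int(level)


print(parse_chunk_key("2/0.0.0"))
print(parse_chunk_key("2/0.1.0"))
```

Metadata keys such as `.zgroup` and `2/.zarray` do not follow this pattern, which is why the store checks for them first.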

choosehappy commented 3 weeks ago

I have half an answer: I needed to downgrade zarr, and everything now works as expected.

zarr==2.14.2 works in both WSL2 and Windows.

However, the latest version, zarr==2.18.1, does not work correctly.

I'm not sure what the breaking change is. Any thoughts on how to proceed?
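
One possible way to narrow it down would be to bisect the releases between the known-good and known-bad versions. The sketch below shows only the search logic. The intermediate entries are placeholders to be replaced with the real release list from PyPI, and `fake_works` is a hypothetical stand-in for "install that version and check whether the reduced example returns non-zero data".

```python
def first_bad(versions, works):
    """Return the earliest version for which works(version) is False.

    Assumes `versions` is sorted oldest to newest, the first entry works,
    the last entry fails, and the behaviour flips only once in between.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Only the two endpoints come from this thread; the rest are placeholders.
candidates = ["2.14.2", "intermediate-1", "intermediate-2", "intermediate-3", "2.18.1"]


def fake_works(version):
    # Pretend result for illustration; a real check would run the reduced example.
    return version in {"2.14.2", "intermediate-1"}


print(first_bad(candidates, fake_works))
```

Each real check needs a fresh install of the candidate version, so the binary search keeps the number of installs small compared to trying every release in order.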