scalableminds / webknossos-libs

Python API and CLI tools for working with WEBKNOSSOS datasets, annotations and server interactions. Includes converter to OME-Zarr.
https://docs.webknossos.org/webknossos-py/index.html

If annotations don't have enough slices... #896

Open azatian opened 1 year ago

azatian commented 1 year ago

Dear ScalableMinds,

I have been trying to download my annotations as a segmentation mask from our server's WK instance, and I have run into an issue that I can't solve. I think it's related to the underlying zarr file and how the chunk size is set up.

import webknossos as wk
dataset = wk.Dataset.open("our-zarr-link")
MAG = wk.Mag("1")
SEGMENT_IDS = [1]
mag_view = dataset.get_segmentation_layers()[0].get_mag(MAG)
the_zarr = mag_view.get_zarr_array()
the_zarr.info

This is the output of the_zarr.info:

Type               : zarr.core.Array
Data type          : uint32
Shape              : (1, 912, 912, 6)
Chunk shape        : (1, 32, 32, 32)
Order              : F
Read-only          : False
Compressor         : None
Store type         : zarr.storage.FSStore
No. bytes          : 19961856 (19.0M)
No. bytes stored   : 161
Storage ratio      : 123986.7
Chunks initialized : 0/841
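
The figures in this info output can be checked by hand. The following is a minimal sketch that uses only the shape, chunk shape, and dtype reported above; nothing in it talks to the server. It shows where the 841 chunks and the 19961856 bytes come from.

```python
import math

shape = (1, 912, 912, 6)   # "Shape" from the info output
chunks = (1, 32, 32, 32)   # "Chunk shape" from the info output
itemsize = 4               # uint32 uses 4 bytes per voxel

# Chunks per axis: a partially filled chunk at the edge still counts as one.
grid = tuple(math.ceil(s / c) for s, c in zip(shape, chunks))
n_chunks = math.prod(grid)
n_bytes = math.prod(shape) * itemsize

print(grid)      # (1, 29, 29, 1)
print(n_chunks)  # 841
print(n_bytes)   # 19961856
```

Along z the data is only 6 voxels deep while a chunk is 32 deep, so every chunk is mostly padding in that direction. "Chunks initialized 0/841" likely reflects what zarr can list in the store; over an HTTP store that listing may not be available, so the value alone may not prove that the data is missing.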

Every time I try to do anything with mag_view including the functions read and get_buffered_slice_reader as well as any zarr array operations, even rechunking, I get the following error:

ValueError: cannot reshape array of size 0 into shape (1,32,32,32)
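
The message itself comes from NumPy and can be reproduced without zarr. This is only a sketch under the assumption that the chunk payload arriving from the store is empty (zero bytes); the empty array below is a stand-in for that payload, not data from the server.

```python
import numpy as np

chunk_shape = (1, 32, 32, 32)  # chunk shape reported for the Volume layer

# Assumption: the decoded chunk payload contains no elements at all.
flat = np.empty(0, dtype=np.uint32)

# A full chunk would need this many elements.
expected_elements = int(np.prod(chunk_shape))

try:
    flat.reshape(chunk_shape, order="F")
    message = None
except ValueError as err:
    message = str(err)

print(flat.size)          # 0
print(expected_elements)  # 32768
print(message)
```

If this reading is right, the chunk size itself would not be the problem: the reshape fails because zero elements arrive where 32768 are expected.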

I tried to create a new mag with a smaller chunk size, one that fits with my data, and it seems like I can't add a mag on a layer that is hosted as a zarr link. I get this error.

ClientResponseError: 400, message='Bad Request', url=URL('our-zarr-link/Volume/2/.zarray')

pip show webknossos
Version: 0.10.24

hotzenklotz commented 1 year ago

Thanks for the bug report. Have you tried updating to the latest version of the webknossos lib, v0.12.3? Your package is almost a year old.

I just tried your example from above and I have no trouble with mag_view.read() with the latest version, i.e.:

import webknossos as wk

dataset = wk.Dataset.open_remote(
    dataset_name_or_url="l4dense_motta_et_al_demo_v2",
    organization_id="scalable_minds",
)

MAG = wk.Mag("1")
SEGMENT_IDS = [1]
mag_view = dataset.get_segmentation_layers()[0].get_mag(MAG)
the_zarr = mag_view.get_zarr_array()
the_zarr.info

data = mag_view.read(absolute_bounding_box=wk.BoundingBox((10, 10, 10), (10, 10, 10)))
data.shape

>>> (1, 10, 10, 10)
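
The two tuples passed to wk.BoundingBox are a top-left corner and a size, not two corners, which is why the result has 10 voxels per axis. A minimal NumPy sketch of that slicing follows; read_box and the zero-filled volume are hypothetical stand-ins, not part of the webknossos API.

```python
import numpy as np


def read_box(volume, topleft, size):
    """Slice a channel-first (C, X, Y, Z) array by top-left corner and size."""
    x, y, z = topleft
    sx, sy, sz = size
    return volume[:, x:x + sx, y:y + sy, z:z + sz]


# Hypothetical stand-in data with one channel.
volume = np.zeros((1, 64, 64, 64), dtype=np.uint32)

data = read_box(volume, topleft=(10, 10, 10), size=(10, 10, 10))
print(data.shape)  # (1, 10, 10, 10)
```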

azatian commented 1 year ago

Hi, thank you so much for your quick reply! I just tried with the latest version of webknossos on Python and I'm still getting the same issues. Please see below:

import webknossos as wk
import pandas as pd
import numpy as np
# posting our link for complete visibility, but will need to remove later
dataset = wk.Dataset.open("http://catmaid2.hms.harvard.edu:9000/data/annotations/zarr/4pJXHDyVK_rvi7db") 
MAG = wk.Mag("1")
SEGMENT_IDS = [1]
mag_view = dataset.get_segmentation_layers()[0].get_mag(MAG)
the_zarr = mag_view.get_zarr_array()
the_zarr
>>> <zarr.core.Array (1, 912, 912, 6) uint32>
the_zarr.shape
>>> (1, 912, 912, 6)
the_zarr[0,:,:,:]
>>> See Error 1
data = mag_view.read() 
>>> See Error 2
data = mag_view.read(absolute_bounding_box=wk.BoundingBox((0, 0, 0), (10, 10, 1)))
>>> See Error 3

Error 1

ValueError Traceback (most recent call last)
Input In [10], in <cell line: 1>()
----> 1 the_zarr[0,:,:,:]

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:821, in Array.__getitem__(self, selection)
    819     result = self.vindex[selection]
    820 else:
--> 821     result = self.get_basic_selection(pure_selection, fields=fields)
    822 return result

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:947, in Array.get_basic_selection(self, selection, out, fields)
    944     return self._get_basic_selection_zd(selection=selection, out=out,
    945                                         fields=fields)
    946 else:
--> 947     return self._get_basic_selection_nd(selection=selection, out=out,
    948                                         fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:990, in Array._get_basic_selection_nd(self, selection, out, fields)
    984 def _get_basic_selection_nd(self, selection, out=None, fields=None):
    985     # implementation of basic selection for array with at least one dimension
    986
    987     # setup indexer
    988     indexer = BasicIndexer(selection, self)
--> 990     return self._get_selection(indexer=indexer, out=out, fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1290, in Array._get_selection(self, indexer, out, fields)
   1287 else:
   1288     # allow storage to get multiple items at once
   1289     lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)
-> 1290     self._chunk_getitems(lchunk_coords, lchunk_selection, out, lout_selection,
   1291                          drop_axes=indexer.drop_axes, fields=fields)
   1293 if out.shape:
   1294     return out

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2063, in Array._chunk_getitems(self, lchunk_coords, lchunk_selection, out, lout_selection, drop_axes, fields)
   2061 for ckey, chunk_select, out_select in zip(ckeys, lchunk_selection, lout_selection):
   2062     if ckey in cdatas:
-> 2063         self._process_chunk(
   2064             out,
   2065             cdatas[ckey],
   2066             chunk_select,
   2067             drop_axes,
   2068             out_is_ndarray,
   2069             fields,
   2070             out_select,
   2071             partial_read_decode=partial_read_decode,
   2072         )
   2073     else:
   2074         # check exception type
   2075         if self._fill_value is not None:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1949, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode)
   1947 except ArrayIndexError:
   1948     cdata = cdata.read_full()
-> 1949 chunk = self._decode_chunk(cdata)
   1951 # select data from chunk
   1952 if fields:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2258, in Array._decode_chunk(self, cdata, start, nitems, expected_shape)
   2256 # ensure correct chunk shape
   2257 chunk = chunk.reshape(-1, order='A')
-> 2258 chunk = chunk.reshape(expected_shape or self._chunks, order=self._order)
   2260 return chunk

ValueError: cannot reshape array of size 0 into shape (1,32,32,32)

Error 2


ValueError Traceback (most recent call last)
Input In [11], in <cell line: 1>()
----> 1 data = mag_view.read()

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/mag_view.py:209, in MagView.read(self, offset, size, relative_offset, absolute_offset, relative_bounding_box, absolute_bounding_box)
    182 def read(
    183     self,
    184     offset: Optional[Vec3IntLike] = None,  # deprecated, relative, in current mag
   (...)
    193 ) -> np.ndarray:
    194     # THIS METHOD CAN BE REMOVED WHEN THE DEPRECATED OFFSET IS REMOVED
    196     if (
    197         relative_offset is not None
    198         or absolute_offset is not None
   (...)
    207         and relative_bounding_box is None
    208     ):
--> 209         return super().read(
    210             offset,
    211             size,
    212             relative_offset=relative_offset,
    213             absolute_offset=absolute_offset,
    214             absolute_bounding_box=absolute_bounding_box,
    215             relative_bounding_box=relative_bounding_box,
    216         )
    217     else:
    218         with warnings.catch_warnings():

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/view.py:403, in View.read(self, offset, size, relative_offset, absolute_offset, relative_bounding_box, absolute_bounding_box)
    395 assert not mag1_bbox.is_empty(), (
    396     f"The size ({mag1_bbox.size} in mag1) contains a zero. "
    397     + "All dimensions must be strictly larger than '0'."
    398 )
    399 assert (
    400     mag1_bbox.topleft.is_positive()
    401 ), f"The offset ({mag1_bbox.topleft} in mag1) must not contain negative dimensions."
--> 403 return self._read_without_checks(mag1_bbox.in_mag(self._mag))

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/view.py:440, in View._read_without_checks(self, current_mag_bbox)
    436 def _read_without_checks(
    437     self,
    438     current_mag_bbox: BoundingBox,
    439 ) -> np.ndarray:
--> 440     data = self._array.read(
    441         current_mag_bbox.topleft.to_np(), current_mag_bbox.size.to_np()
    442     )
    443     return data

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/_array.py:343, in ZarrArray.read(self, offset, shape)
    341 zarray = self._zarray
    342 with _blosc_disable_threading():
--> 343     data = zarray[
    344         :,
    345         offset.x : (offset.x + shape.x),
    346         offset.y : (offset.y + shape.y),
    347         offset.z : (offset.z + shape.z),
    348     ]
    349     if data.shape != shape:
    350         padded_data = np.zeros(
    351             (self.info.num_channels,) + shape.to_tuple(), dtype=data.dtype
    352         )

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:821, in Array.__getitem__(self, selection)
    819     result = self.vindex[selection]
    820 else:
--> 821     result = self.get_basic_selection(pure_selection, fields=fields)
    822 return result

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:947, in Array.get_basic_selection(self, selection, out, fields)
    944     return self._get_basic_selection_zd(selection=selection, out=out,
    945                                         fields=fields)
    946 else:
--> 947     return self._get_basic_selection_nd(selection=selection, out=out,
    948                                         fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:990, in Array._get_basic_selection_nd(self, selection, out, fields)
    984 def _get_basic_selection_nd(self, selection, out=None, fields=None):
    985     # implementation of basic selection for array with at least one dimension
    986
    987     # setup indexer
    988     indexer = BasicIndexer(selection, self)
--> 990     return self._get_selection(indexer=indexer, out=out, fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1290, in Array._get_selection(self, indexer, out, fields)
   1287 else:
   1288     # allow storage to get multiple items at once
   1289     lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)
-> 1290     self._chunk_getitems(lchunk_coords, lchunk_selection, out, lout_selection,
   1291                          drop_axes=indexer.drop_axes, fields=fields)
   1293 if out.shape:
   1294     return out

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2063, in Array._chunk_getitems(self, lchunk_coords, lchunk_selection, out, lout_selection, drop_axes, fields)
   2061 for ckey, chunk_select, out_select in zip(ckeys, lchunk_selection, lout_selection):
   2062     if ckey in cdatas:
-> 2063         self._process_chunk(
   2064             out,
   2065             cdatas[ckey],
   2066             chunk_select,
   2067             drop_axes,
   2068             out_is_ndarray,
   2069             fields,
   2070             out_select,
   2071             partial_read_decode=partial_read_decode,
   2072         )
   2073     else:
   2074         # check exception type
   2075         if self._fill_value is not None:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1949, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode)
   1947 except ArrayIndexError:
   1948     cdata = cdata.read_full()
-> 1949 chunk = self._decode_chunk(cdata)
   1951 # select data from chunk
   1952 if fields:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2258, in Array._decode_chunk(self, cdata, start, nitems, expected_shape)
   2256 # ensure correct chunk shape
   2257 chunk = chunk.reshape(-1, order='A')
-> 2258 chunk = chunk.reshape(expected_shape or self._chunks, order=self._order)
   2260 return chunk

ValueError: cannot reshape array of size 0 into shape (1,32,32,32)

Error 3


ValueError Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 data = mag_view.read(absolute_bounding_box=wk.BoundingBox((0,0,0), (10, 10,1)))

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/mag_view.py:209, in MagView.read(self, offset, size, relative_offset, absolute_offset, relative_bounding_box, absolute_bounding_box)
    182 def read(
    183     self,
    184     offset: Optional[Vec3IntLike] = None,  # deprecated, relative, in current mag
   (...)
    193 ) -> np.ndarray:
    194     # THIS METHOD CAN BE REMOVED WHEN THE DEPRECATED OFFSET IS REMOVED
    196     if (
    197         relative_offset is not None
    198         or absolute_offset is not None
   (...)
    207         and relative_bounding_box is None
    208     ):
--> 209         return super().read(
    210             offset,
    211             size,
    212             relative_offset=relative_offset,
    213             absolute_offset=absolute_offset,
    214             absolute_bounding_box=absolute_bounding_box,
    215             relative_bounding_box=relative_bounding_box,
    216         )
    217     else:
    218         with warnings.catch_warnings():

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/view.py:403, in View.read(self, offset, size, relative_offset, absolute_offset, relative_bounding_box, absolute_bounding_box)
    395 assert not mag1_bbox.is_empty(), (
    396     f"The size ({mag1_bbox.size} in mag1) contains a zero. "
    397     + "All dimensions must be strictly larger than '0'."
    398 )
    399 assert (
    400     mag1_bbox.topleft.is_positive()
    401 ), f"The offset ({mag1_bbox.topleft} in mag1) must not contain negative dimensions."
--> 403 return self._read_without_checks(mag1_bbox.in_mag(self._mag))

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/view.py:440, in View._read_without_checks(self, current_mag_bbox)
    436 def _read_without_checks(
    437     self,
    438     current_mag_bbox: BoundingBox,
    439 ) -> np.ndarray:
--> 440     data = self._array.read(
    441         current_mag_bbox.topleft.to_np(), current_mag_bbox.size.to_np()
    442     )
    443     return data

File ~/Desktop/code/base/lib/python3.9/site-packages/webknossos/dataset/_array.py:343, in ZarrArray.read(self, offset, shape)
    341 zarray = self._zarray
    342 with _blosc_disable_threading():
--> 343     data = zarray[
    344         :,
    345         offset.x : (offset.x + shape.x),
    346         offset.y : (offset.y + shape.y),
    347         offset.z : (offset.z + shape.z),
    348     ]
    349     if data.shape != shape:
    350         padded_data = np.zeros(
    351             (self.info.num_channels,) + shape.to_tuple(), dtype=data.dtype
    352         )

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:821, in Array.__getitem__(self, selection)
    819     result = self.vindex[selection]
    820 else:
--> 821     result = self.get_basic_selection(pure_selection, fields=fields)
    822 return result

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:947, in Array.get_basic_selection(self, selection, out, fields)
    944     return self._get_basic_selection_zd(selection=selection, out=out,
    945                                         fields=fields)
    946 else:
--> 947     return self._get_basic_selection_nd(selection=selection, out=out,
    948                                         fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:990, in Array._get_basic_selection_nd(self, selection, out, fields)
    984 def _get_basic_selection_nd(self, selection, out=None, fields=None):
    985     # implementation of basic selection for array with at least one dimension
    986
    987     # setup indexer
    988     indexer = BasicIndexer(selection, self)
--> 990     return self._get_selection(indexer=indexer, out=out, fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1290, in Array._get_selection(self, indexer, out, fields)
   1287 else:
   1288     # allow storage to get multiple items at once
   1289     lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)
-> 1290     self._chunk_getitems(lchunk_coords, lchunk_selection, out, lout_selection,
   1291                          drop_axes=indexer.drop_axes, fields=fields)
   1293 if out.shape:
   1294     return out

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2063, in Array._chunk_getitems(self, lchunk_coords, lchunk_selection, out, lout_selection, drop_axes, fields)
   2061 for ckey, chunk_select, out_select in zip(ckeys, lchunk_selection, lout_selection):
   2062     if ckey in cdatas:
-> 2063         self._process_chunk(
   2064             out,
   2065             cdatas[ckey],
   2066             chunk_select,
   2067             drop_axes,
   2068             out_is_ndarray,
   2069             fields,
   2070             out_select,
   2071             partial_read_decode=partial_read_decode,
   2072         )
   2073     else:
   2074         # check exception type
   2075         if self._fill_value is not None:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1949, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode)
   1947 except ArrayIndexError:
   1948     cdata = cdata.read_full()
-> 1949 chunk = self._decode_chunk(cdata)
   1951 # select data from chunk
   1952 if fields:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2258, in Array._decode_chunk(self, cdata, start, nitems, expected_shape)
   2256 # ensure correct chunk shape
   2257 chunk = chunk.reshape(-1, order='A')
-> 2258 chunk = chunk.reshape(expected_shape or self._chunks, order=self._order)
   2260 return chunk

ValueError: cannot reshape array of size 0 into shape (1,32,32,32)

hotzenklotz commented 1 year ago

Mhm, I believe the error stems from the line:

dataset = wk.Dataset.open("http://catmaid2.hms.harvard.edu:9000/data/annotations/zarr/4pJXHDyVK_rvi7db") 

The webknossos-libs are meant to interface with WEBKNOSSOS URIs for datasets and annotations, e.g. https://<wkserver>.com/annotations/6124d30a010000ab009167ed or https://<wkserver>.com/datasets/scalable_minds/110629_k0725_segmentation_v1/view.

The libs do not support loading/streaming Zarr links. Please use a Zarr library such as zarr-python directly for that. Try:

import zarr

color_layer = zarr.open_group("http://catmaid2.hms.harvard.edu:9000/data/annotations/zarr/4pJXHDyVK_rvi7db")
color_layer.tree()

>> / <zarr.hierarchy.Group '/shape'> <zarr.hierarchy.Group '/dtype'>
 ├── .zgroup
 ├── Volume
 │   └── 1 (1, 912, 912, 6) uint32
 ├── cellseg
 │   └── 1 (1, 912, 912, 6) uint8
 ├── datasource-properties.json
 └── img
     └── 1 (1, 912, 912, 6) uint8

color_layer["/img/1"].shape
>> (1, 912, 912, 6)

For compatibility and interoperability with other tools, WK can expose its data as Zarr-API-compatible URIs through the UI (as you did) or by exposing the raw zarr object through the libs for working with zarr-python or dask.

azatian commented 1 year ago

Thank you! I tried to use the zarr-python package, and I ran into the same issue when trying to view the annotation layer, which is Volume. The other layers work perfectly. Please see below.

color_layer["/Volume/1"].shape
>>> (1, 912, 912, 6)
color_layer["/Volume/1"][0,:,:,0]
>>> See Error 1

Error 1


ValueError Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 color_layer["/Volume/1"][0,:,:,0]

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:821, in Array.__getitem__(self, selection)
    819     result = self.vindex[selection]
    820 else:
--> 821     result = self.get_basic_selection(pure_selection, fields=fields)
    822 return result

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:947, in Array.get_basic_selection(self, selection, out, fields)
    944     return self._get_basic_selection_zd(selection=selection, out=out,
    945                                         fields=fields)
    946 else:
--> 947     return self._get_basic_selection_nd(selection=selection, out=out,
    948                                         fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:990, in Array._get_basic_selection_nd(self, selection, out, fields)
    984 def _get_basic_selection_nd(self, selection, out=None, fields=None):
    985     # implementation of basic selection for array with at least one dimension
    986
    987     # setup indexer
    988     indexer = BasicIndexer(selection, self)
--> 990     return self._get_selection(indexer=indexer, out=out, fields=fields)

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1290, in Array._get_selection(self, indexer, out, fields)
   1287 else:
   1288     # allow storage to get multiple items at once
   1289     lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)
-> 1290     self._chunk_getitems(lchunk_coords, lchunk_selection, out, lout_selection,
   1291                          drop_axes=indexer.drop_axes, fields=fields)
   1293 if out.shape:
   1294     return out

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2063, in Array._chunk_getitems(self, lchunk_coords, lchunk_selection, out, lout_selection, drop_axes, fields)
   2061 for ckey, chunk_select, out_select in zip(ckeys, lchunk_selection, lout_selection):
   2062     if ckey in cdatas:
-> 2063         self._process_chunk(
   2064             out,
   2065             cdatas[ckey],
   2066             chunk_select,
   2067             drop_axes,
   2068             out_is_ndarray,
   2069             fields,
   2070             out_select,
   2071             partial_read_decode=partial_read_decode,
   2072         )
   2073     else:
   2074         # check exception type
   2075         if self._fill_value is not None:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:1949, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode)
   1947 except ArrayIndexError:
   1948     cdata = cdata.read_full()
-> 1949 chunk = self._decode_chunk(cdata)
   1951 # select data from chunk
   1952 if fields:

File ~/Desktop/code/base/lib/python3.9/site-packages/zarr/core.py:2258, in Array._decode_chunk(self, cdata, start, nitems, expected_shape)
   2256 # ensure correct chunk shape
   2257 chunk = chunk.reshape(-1, order='A')
-> 2258 chunk = chunk.reshape(expected_shape or self._chunks, order=self._order)
   2260 return chunk

ValueError: cannot reshape array of size 0 into shape (1,32,32,32)
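
Since only the Volume layer fails while img and cellseg read fine, it can help to probe every layer with the same small read and collect the outcome in one place. The sketch below is self-contained and uses stand-ins: probe_layers and EmptyChunkArray are hypothetical names, and the in-memory arrays only imitate the three layers listed in the tree output.

```python
import numpy as np


class EmptyChunkArray:
    """Hypothetical stand-in for an array whose chunk decodes to zero elements."""

    shape = (1, 912, 912, 6)

    def __getitem__(self, selection):
        # Raises the same ValueError that zarr surfaces in the traceback.
        return np.empty(0, dtype=np.uint32).reshape((1, 32, 32, 32))


def probe_layers(layers, selection):
    """Try one small read per layer; record the result shape or the error text."""
    report = {}
    for name, array in layers.items():
        try:
            report[name] = ("ok", np.asarray(array[selection]).shape)
        except ValueError as err:
            report[name] = ("failed", str(err))
    return report


# Stand-ins for the layers shown by color_layer.tree().
layers = {
    "img/1": np.zeros((1, 912, 912, 6), dtype=np.uint8),
    "cellseg/1": np.zeros((1, 912, 912, 6), dtype=np.uint8),
    "Volume/1": EmptyChunkArray(),
}

selection = (0, slice(0, 32), slice(0, 32), 0)
report = probe_layers(layers, selection)
for name, result in report.items():
    print(name, result)
```

With a real group, the dictionary could presumably be built from the opened zarr group instead, for example by indexing it with the three paths above.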

hotzenklotz commented 1 year ago

Thanks for the detailed listing. For completeness' sake, what version or release of WEBKNOSSOS are you running?

You can find the version number in the "help" section of the navbar:

[image]

azatian commented 1 year ago

Hello! We are using version 22.09.0. We currently have issues with the Python client finding our annotations as datasets, which is why I decided to try the zarr link route. I can post about that issue as well, but I didn't want to mix too many issues together.

hotzenklotz commented 1 year ago

> We are using version 22.09.0

That version is more than half a year old. In that time we have worked extensively with the Zarr community and improved the stability and compatibility of the Zarr APIs in WK. While I cannot promise that an update will resolve your issue, I think there is a good chance it will.

> We currently have issues with the Python client finding our annotations as datasets

Interesting. Let's hear it. (Feel free to open a second issue/thread for that.) Generally speaking, Annotations ≠ Datasets. A dataset typically contains the raw microscopy data, metadata, access permissions etc. An annotation typically contains a reference to the dataset and any new volume and skeleton labels/markers that you placed. A dataset can usually have several annotations "connected" to it, e.g. 10 users each doing one annotation for the same dataset (1:n relationship). Unfortunately, in reality the border between the two concepts is sometimes very "fluid" and perhaps the naming of our APIs needs to be clearer. Feedback is always welcome :-)
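
The 1:n relationship described here can be pictured with a small conceptual model. This is only an illustration: DatasetRecord and AnnotationRecord are hypothetical names chosen to avoid confusion with the real webknossos classes, and their fields are assumptions based on the description above.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Conceptual dataset: raw image data plus metadata."""
    name: str
    layers: list


@dataclass
class AnnotationRecord:
    """Conceptual annotation: points at a dataset and holds the user's labels."""
    owner: str
    dataset: DatasetRecord  # a reference to the dataset, not a copy of it
    volume_labels: dict = field(default_factory=dict)
    skeleton_nodes: list = field(default_factory=list)


dataset = DatasetRecord(name="example_dataset", layers=["img"])

# Ten users each create one annotation on the same dataset (1:n).
annotations = [AnnotationRecord(owner=f"user_{i}", dataset=dataset) for i in range(10)]

print(len(annotations))                                # 10
print(all(a.dataset is dataset for a in annotations))  # True
```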

Hence, there are different webknossos-libs Python objects for Annotations and Datasets. If you are only interested in the volume annotation of the dataset (and assuming that the volume annotations were created from scratch on a "blank" surface), I recommend the following code examples: