scverse / squidpy

Spatial Single Cell Analysis in Python
https://squidpy.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

squidpy.im.ImageContainer cannot process some pyramid tiff (with z > 1) #848

Open gdurif opened 3 months ago

gdurif commented 3 months ago

Description

I am trying to build an ImageContainer from a "pyramid tiff" file storing an H&E stain image, i.e. an image with a Z dimension > 1.

"The term "Pyramid TIFF" is used to describe a TIFF file that wraps a sequence of bitmaps that each represent the same image at increasingly coarse spatial resolutions. The individual images may be compressed or tiled." Source

Printing the .data attribute of my container suggests that the ImageContainer is built correctly around my image (see below). However, when I try to plot the image or run a segmentation on it, I get an error related to the Z dimension being larger than 1 (see below).
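To make the pyramid layout quoted above concrete, here is a minimal NumPy sketch. It assumes a naive 2x downsampling per level and a 1000 x 1000 RGB base image; the actual level sizes in the MoNuSeg file are not known here, and `build_pyramid` is a hypothetical helper, not part of squidpy or tifffile. It shows that pyramid levels have different (y, x) sizes, so they cannot simply be stacked along a Z axis of a single `(z, y, x, channels)` array.

```python
import numpy as np


def build_pyramid(image, n_levels):
    """Hypothetical helper: return `n_levels` copies of `image`,
    each at half the resolution of the previous one."""
    levels = [image]
    for _ in range(n_levels - 1):
        # Naive 2x downsampling: keep every second pixel along y and x.
        levels.append(levels[-1][::2, ::2])
    return levels


# Assumed base image: 1000 x 1000 pixels, 3 channels (RGB), uint8.
base = np.zeros((1000, 1000, 3), dtype=np.uint8)
pyramid = build_pyramid(base, 4)
shapes = [level.shape for level in pyramid]
print(shapes)

# Levels of different sizes cannot be stacked into one 4D array.
try:
    np.stack(pyramid)
    stack_failed = False
except ValueError:
    stack_failed = True
print("stacking failed:", stack_failed)
```

If the extra pages of a file are lower-resolution copies like these, interpreting the number of pages as a Z dimension of equally sized planes would not match the data on disk.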

Minimal reproducible example

Here is an example using a tiff H&E image originating from the MoNuSeg data challenge dataset (available under CC BY-NC-SA 4.0 license).

Part of this dataset can be downloaded using the following link (GitHub does not allow attaching TIFF files). The images are then available after unzipping the archive.

Note: some images are not "pyramid tiff" files and have a Z dimension of size 1; these do not cause any trouble.

Trying to view the image TCGA-F9-A8NY-01Z-00-DX1.tif with squidpy.im.ImageContainer.show() or to segment it using squidpy.im.segment() returns the same error.

```python
import squidpy as sq

image = sq.im.ImageContainer("TCGA-F9-A8NY-01Z-00-DX1.tif")

image.data
#> <xarray.Dataset> Size: 12MB
#> Dimensions:  (z: 4, y: 1000, x: 1000, channels: 3)
#> Coordinates:
#>   * z        (z) <U1 16B '0' '1' '2' '3'
#> Dimensions without coordinates: y, x, channels
#> Data variables:
#>     image    (y, x, z, channels) uint8 12MB dask.array<chunksize=(1000, 1000, 4, 3), meta=np.ndarray>
#> Attributes:
#>     coords:       CropCoords(x0=0, y0=0, x1=0, y1=0)
#>     padding:      CropPadding(x_pre=0, x_post=0, y_pre=0, y_post=0)
#>     scale:        1.0
#>     mask_circle:  False

# show
image.show(layer="image", library_id = '0')

# segmentation
sq.im.segment(img=image, layer="image", library_id='0', method="watershed", thresh=90, geq=False)
```

Traceback

image.show(layer="image", library_id = '0') error:

```pytb
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[51], line 1
----> 1 image.show(layer="image", library_id = '0')

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_container.py:1035, in ImageContainer.show(self, layer, library_id, channel, channelwise, segmentation_layer, segmentation_alpha, transpose, ax, figsize, dpi, save, **kwargs)
   1032 if len(self.data.coords["z"]) > 1:
   1033     title += f", library_id:{library_ids[z]}"
-> 1035 ax_.imshow(img_as_float(img.values, force_copy=False), **kwargs)
   1036 if seg_arr is not None:
   1037     ax_.imshow(
   1038         seg_arr[:, :, z, ...],
   1039         cmap=seg_cmap,
   (...)
   1042         **{k: v for k, v in kwargs.items() if k not in ("cmap", "interpolation")},
   1043     )

File ~/.conda/envs/test-env/lib/python3.11/site-packages/xarray/core/dataarray.py:786, in DataArray.values(self)
    773 @property
    774 def values(self) -> np.ndarray:
    775     """
    776     The array's data converted to numpy.ndarray.
    777
   (...)
    784     to this array may be reflected in the DataArray as well.
    785     """
--> 786     return self.variable.values

File ~/.conda/envs/test-env/lib/python3.11/site-packages/xarray/core/variable.py:540, in Variable.values(self)
    537 @property
    538 def values(self):
    539     """The variable's data as a numpy.ndarray"""
--> 540     return _as_array_or_item(self._data)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/xarray/core/variable.py:338, in _as_array_or_item(data)
    324 def _as_array_or_item(data):
    325     """Return the given values as a numpy array, or as an individual item if
    326     it's a 0d datetime64 or timedelta64 array.
    327
   (...)
    336     TODO: remove this (replace with np.asarray) once these issues are fixed
    337     """
--> 338     data = np.asarray(data)
    339     if data.ndim == 0:
    340         if data.dtype.kind == "M":

File ~/.conda/envs/test-env/lib/python3.11/site-packages/dask/array/core.py:1693, in Array.__array__(self, dtype, **kwargs)
   1692 def __array__(self, dtype=None, **kwargs):
-> 1693     x = self.compute()
   1694     if dtype and x.dtype != dtype:
   1695         x = x.astype(dtype)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/dask/base.py:376, in DaskMethodsMixin.compute(self, **kwargs)
    352 def compute(self, **kwargs):
    353     """Compute this dask collection
    354
    355     This turns a lazy Dask collection into its in-memory equivalent.
   (...)
    374     dask.compute
    375     """
--> 376     (result,) = compute(self, traverse=False, **kwargs)
    377     return result

File ~/.conda/envs/test-env/lib/python3.11/site-packages/dask/base.py:662, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    659     postcomputes.append(x.__dask_postcompute__())
    661 with shorten_traceback():
--> 662     results = schedule(dsk, keys, **kwargs)
    664 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_io.py:229, in _lazy_load_image.<locals>.read_unprotected(fname)
    224 try:
    225     if fname.endswith(".tif") or fname.endswith(".tiff"):
    226         # do not use imread since it changes the shape to `y, x, ?, z`,
    227         # whereas we use `z, y, x, ?` in `_infer_shape_dtype`
    228         # return np.reshape(imread(fname, plugin="tifffile"), shape)
--> 229         return np.reshape(TiffFile(fname).asarray(), shape)
    231     Image.MAX_IMAGE_PIXELS = None
    232     return np.reshape(imread(fname, plugin="pil"), shape)

File <__array_function__ internals>:180, in reshape(*args, **kwargs)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/numpy/core/fromnumeric.py:298, in reshape(a, newshape, order)
    198 @array_function_dispatch(_reshape_dispatcher)
    199 def reshape(a, newshape, order='C'):
    200     """
    201     Gives a new shape to an array without changing its data.
    202
   (...)
    296            [5, 6]])
    297     """
--> 298     return _wrapfunc(a, 'reshape', newshape, order=order)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/numpy/core/fromnumeric.py:57, in _wrapfunc(obj, method, *args, **kwds)
     54     return _wrapit(obj, method, *args, **kwds)
     56 try:
---> 57     return bound(*args, **kwds)
     58 except TypeError:
     59     # A TypeError occurs if the object does have such a method in its
     60     # class, but its signature is not identical to that of NumPy's. This
   (...)
     64     # Call _wrapit from within the except clause to ensure a potential
     65     # exception has a traceback chain.
     66     return _wrapit(obj, method, *args, **kwds)

ValueError: cannot reshape array of size 3000000 into shape (4,1000,1000,3)
```
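The last frame can be reproduced without squidpy. The sketch below assumes that the reader returned a single 1000 x 1000 RGB plane (3000000 = 1000 * 1000 * 3 elements, which matches the size in the error message), while the container expected the shape `(4, 1000, 1000, 3)` shown in `image.data`. The variable names are illustrative only.

```python
import numpy as np

# Shape inferred for the container, as printed by `image.data`: (z, y, x, channels).
inferred_shape = (4, 1000, 1000, 3)

# Stand-in for the array returned by the TIFF reader (assumption: one RGB plane).
loaded = np.zeros((1000, 1000, 3), dtype=np.uint8)

expected_size = int(np.prod(inferred_shape))
print(loaded.size, expected_size)

try:
    np.reshape(loaded, inferred_shape)
    reshape_error = None
except ValueError as err:
    # Same kind of failure as in the traceback: element counts do not match.
    reshape_error = str(err)
    print(reshape_error)
```

The inferred shape holds four times as many elements as the loaded array, which is consistent with only one of the four pages being read.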

sq.im.segment(img=image, layer="image", library_id='0', method="watershed", thresh=90, geq=False) error:

```pytb
WARNING: Function changed the number of channels, cannot use identity for library ids `['2', '3', '1']`. Replacing with 0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[49], line 1
----> 1 sq.im.segment(img=image, layer="image", library_id='0', method="watershed", thresh=90, geq=False)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_segment.py:338, in segment(img, layer, library_id, method, channel, chunks, lazy, layer_added, copy, **kwargs)
    335     assert isinstance(method, SegmentationModel)
    337 start = logg.info(f"Segmenting an image of shape `{img[layer].shape}` using `{method}`")
--> 338 res: ImageContainer = method.segment(
    339     img,
    340     layer=layer,
    341     channel=channel,
    342     library_id=library_id,
    343     chunks=None,
    344     fn_kwargs=kwargs,
    345     copy=True,
    346     drop=copy,
    347     lazy=lazy,
    348 )
    349 logg.info("Finish", time=start)
    351 if copy:

File ~/.conda/envs/test-env/lib/python3.11/functools.py:946, in singledispatchmethod.__get__.<locals>._method(*args, **kwargs)
    944 def _method(*args, **kwargs):
    945     method = self.dispatcher.dispatch(args[0].__class__)
--> 946     return method.__get__(obj, cls)(*args, **kwargs)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_segment.py:164, in SegmentationModel._(self, img, layer, library_id, channel, fn_kwargs, **kwargs)
    159 else:
    160     raise TypeError(
    161         f"Expected library id to be `None` or of type `str` or `sequence`, found `{type(library_id).__name__}`."
    162     )
--> 164 res: ImageContainer = img.apply(func, layer=layer, channel=channel, fn_kwargs=fn_kwargs, copy=True, **kwargs)
    165 res._data = res.data.rename({channel_dim: new_channel_dim})
    167 for k in res:

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_container.py:1238, in ImageContainer.apply(self, func, layer, new_layer, channel, lazy, chunks, copy, drop, fn_kwargs, **kwargs)
   1235     raise ValueError(f"Expected `2`, `3` or `4` dimensional array, found `{res.ndim}`.")
   1237 if copy:
-> 1238     cont = ImageContainer(
   1239         res,
   1240         layer=new_layer,
   1241         copy=True,
   1242         lazy=lazy,
   1243         dims=dims,
   1244         library_id=new_library_ids,
   1245     )
   1246     cont.data.attrs = self.data.attrs.copy()
   1247     return cont

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_container.py:102, in ImageContainer.__init__(self, img, layer, lazy, scale, **kwargs)
    100 self.add_img(img, layer=layer, **kwargs)
    101 if not lazy:
--> 102     self.compute()

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_container.py:1325, in ImageContainer.compute(self, layer)
   1312 """
   1313 Trigger lazy computation in-place.
   1314
   (...)
   1322 Modifies and returns self.
   1323 """
   1324 if layer is None:
-> 1325     self.data.load()
   1326 else:
   1327     self[layer].load()

File ~/.conda/envs/test-env/lib/python3.11/site-packages/xarray/core/dataset.py:863, in Dataset.load(self, **kwargs)
    860 chunkmanager = get_chunked_array_type(*lazy_data.values())
    862 # evaluate all the chunked arrays simultaneously
--> 863 evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
    864     *lazy_data.values(), **kwargs
    865 )
    867 for k, data in zip(lazy_data, evaluated_data):
    868     self.variables[k].data = data

File ~/.conda/envs/test-env/lib/python3.11/site-packages/xarray/namedarray/daskmanager.py:86, in DaskManager.compute(self, *data, **kwargs)
     81 def compute(
     82     self, *data: Any, **kwargs: Any
     83 ) -> tuple[np.ndarray[Any, _DType_co], ...]:
     84     from dask.array import compute
---> 86     return compute(*data, **kwargs)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/dask/base.py:662, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    659     postcomputes.append(x.__dask_postcompute__())
    661 with shorten_traceback():
--> 662     results = schedule(dsk, keys, **kwargs)
    664 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File ~/.conda/envs/test-env/lib/python3.11/site-packages/squidpy/im/_io.py:229, in _lazy_load_image.<locals>.read_unprotected(fname)
    224 try:
    225     if fname.endswith(".tif") or fname.endswith(".tiff"):
    226         # do not use imread since it changes the shape to `y, x, ?, z`,
    227         # whereas we use `z, y, x, ?` in `_infer_shape_dtype`
    228         # return np.reshape(imread(fname, plugin="tifffile"), shape)
--> 229         return np.reshape(TiffFile(fname).asarray(), shape)
    231     Image.MAX_IMAGE_PIXELS = None
    232     return np.reshape(imread(fname, plugin="pil"), shape)

File <__array_function__ internals>:180, in reshape(*args, **kwargs)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/numpy/core/fromnumeric.py:298, in reshape(a, newshape, order)
    198 @array_function_dispatch(_reshape_dispatcher)
    199 def reshape(a, newshape, order='C'):
    200     """
    201     Gives a new shape to an array without changing its data.
    202
   (...)
    296            [5, 6]])
    297     """
--> 298     return _wrapfunc(a, 'reshape', newshape, order=order)

File ~/.conda/envs/test-env/lib/python3.11/site-packages/numpy/core/fromnumeric.py:57, in _wrapfunc(obj, method, *args, **kwds)
     54     return _wrapit(obj, method, *args, **kwds)
     56 try:
---> 57     return bound(*args, **kwds)
     58 except TypeError:
     59     # A TypeError occurs if the object does have such a method in its
     60     # class, but its signature is not identical to that of NumPy's. This
   (...)
     64     # Call _wrapit from within the except clause to ensure a potential
     65     # exception has a traceback chain.
     66     return _wrapit(obj, method, *args, **kwds)

ValueError: cannot reshape array of size 3000000 into shape (4,1000,1000,3)
```

Version

'1.5.0'

Thanks in advance, Best

giovp commented 3 months ago

Hi @gdurif, indeed, unfortunately at the moment we don't support methods that operate on 3D image data. We are currently working towards porting some of these functionalities to spatialdata. Will keep this open for reference.

gdurif commented 3 months ago

@giovp thanks for your reply.

According to the documentation, shouldn't it be possible to use the library_id input argument to choose which slice (along the Z dimension) of the image to use?

This point was discussed in #321 and #329, so I guess it should work with the current version of squidpy?
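To make the expected behaviour concrete, here is a minimal sketch in plain NumPy. It assumes the `(y, x, z, channels)` layout and the z coordinates `'0'` to `'3'` shown in `image.data`; `select_z_slice` is a hypothetical helper for illustration and not how squidpy implements `library_id`.

```python
import numpy as np


def select_z_slice(arr, z_coords, library_id):
    """Hypothetical helper: return the (y, x, channels) plane of a
    (y, x, z, channels) array whose z coordinate equals `library_id`."""
    z_index = list(z_coords).index(library_id)
    return arr[:, :, z_index, :]


# Small stand-in array with 4 z planes; plane '2' is filled with 7s.
arr = np.zeros((10, 12, 4, 3), dtype=np.uint8)
arr[:, :, 2, :] = 7
z_coords = ["0", "1", "2", "3"]

plane = select_z_slice(arr, z_coords, "2")
first_plane = select_z_slice(arr, z_coords, "0")
print(plane.shape)
```

With this reading, passing `library_id='0'` would only ever touch the first plane, independently of how many other planes the file contains.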