Generate a multiscale, chunked, multi-dimensional spatial image data structure that can be serialized to OME-NGFF.
Each scale is a scientific Python Xarray Dataset, organized into the nodes of an Xarray DataTree.
pip install multiscale_spatial_image
import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import to_multiscale
import zarr
# Image pixels
array = np.random.randint(0, 256, size=(128,128), dtype=np.uint8)
image = to_spatial_image(array)
print(image)
An Xarray DataArray. Spatial metadata can also be passed during construction.
<xarray.DataArray 'image' (y: 128, x: 128)> Size: 16kB
array([[170, 79, 215, ..., 31, 151, 150],
[ 77, 181, 1, ..., 217, 176, 228],
[193, 91, 240, ..., 132, 152, 41],
...,
[ 50, 140, 231, ..., 80, 236, 28],
[ 89, 46, 180, ..., 84, 42, 140],
[ 96, 148, 240, ..., 61, 43, 255]], dtype=uint8)
Coordinates:
* y (y) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
* x (x) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
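Spatial metadata is passed with keyword arguments. A minimal sketch, following the spatial-image constructor's dims, scale, and translation keywords; the values here are illustrative:

# Attach pixel spacing and an origin offset (illustrative values)
image = to_spatial_image(
    array,
    dims=("y", "x"),
    scale={"y": 2.0, "x": 2.0},          # spacing between pixel centers
    translation={"y": 10.0, "x": 20.0},  # location of the first pixel
)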
# Create multiscale pyramid, downscaling by a factor of 2, then 4
multiscale = to_multiscale(image, [2, 4])
print(multiscale)
A multiscale Xarray DataTree of chunked Dask arrays, one dataset per scale.
<xarray.DataTree>
Group: /
├── Group: /scale0
│ Dimensions: (y: 128, x: 128)
│ Coordinates:
│ * y (y) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
│ * x (x) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
│ Data variables:
│ image (y, x) uint8 16kB dask.array<chunksize=(128, 128), meta=np.ndarray>
├── Group: /scale1
│ Dimensions: (y: 64, x: 64)
│ Coordinates:
│ * y (y) float64 512B 0.5 2.5 4.5 6.5 8.5 ... 120.5 122.5 124.5 126.5
│ * x (x) float64 512B 0.5 2.5 4.5 6.5 8.5 ... 120.5 122.5 124.5 126.5
│ Data variables:
│ image (y, x) uint8 4kB dask.array<chunksize=(64, 64), meta=np.ndarray>
└── Group: /scale2
Dimensions: (y: 16, x: 16)
Coordinates:
* y (y) float64 128B 3.5 11.5 19.5 27.5 35.5 ... 99.5 107.5 115.5 123.5
* x (x) float64 128B 3.5 11.5 19.5 27.5 35.5 ... 99.5 107.5 115.5 123.5
Data variables:
image (y, x) uint8 256B dask.array<chunksize=(16, 16), meta=np.ndarray>
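Scale factors can also be given per dimension, and the downscaling method and Dask chunk size can be selected. A sketch, assuming the Methods enum and the chunks keyword provided by this package:

from multiscale_spatial_image import Methods

# Downscale each spatial dimension explicitly and request 64-pixel chunks
multiscale = to_multiscale(
    image,
    [{"y": 2, "x": 2}, {"y": 4, "x": 4}],
    method=Methods.XARRAY_COARSEN,
    chunks=64,
)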
Map a function over datasets while skipping nodes that do not contain dimensions
import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import skip_non_dimension_nodes, to_multiscale
data = np.zeros((2, 200, 200))
dims = ("c", "y", "x")
scale_factors = [2, 2]
image = to_spatial_image(array_like=data, dims=dims)
multiscale = to_multiscale(image, scale_factors=scale_factors)
@skip_non_dimension_nodes
def transpose(ds, *args, **kwargs):
return ds.transpose(*args, **kwargs)
multiscale = multiscale.map_over_datasets(transpose, "y", "x", "c")
print(multiscale)
A transposed MultiscaleSpatialImage.
<xarray.DataTree>
Group: /
├── Group: /scale0
│ Dimensions: (c: 2, y: 200, x: 200)
│ Coordinates:
│ * c (c) int32 8B 0 1
│ * y (y) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│ * x (x) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│ Data variables:
│ image (y, x, c) float64 640kB dask.array<chunksize=(200, 200, 2), meta=np.ndarray>
├── Group: /scale1
│ Dimensions: (c: 2, y: 100, x: 100)
│ Coordinates:
│ * c (c) int32 8B 0 1
│ * y (y) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│ * x (x) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│ Data variables:
│ image (y, x, c) float64 160kB dask.array<chunksize=(100, 100, 2), meta=np.ndarray>
└── Group: /scale2
Dimensions: (c: 2, y: 50, x: 50)
Coordinates:
* c (c) int32 8B 0 1
* y (y) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
* x (x) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
Data variables:
image (y, x, c) float64 40kB dask.array<chunksize=(50, 50, 2), meta=np.ndarray>
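The same decorator works for other per-dataset operations; the root node of the DataTree carries an empty dataset with no dimensions, so it must be skipped. A minimal sketch with a hypothetical intensity-scaling function (extra positional arguments are forwarded to the mapped function):

@skip_non_dimension_nodes
def scale_intensity(ds, gain):
    # Multiply image intensities by a constant gain at every scale
    return ds * gain

multiscale = multiscale.map_over_datasets(scale_intensity, 2.0)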
While the decorator allows you to define your own functions to map over datasets in the DataTree while ignoring datasets that do not have dimensions, this library also provides a few convenience methods. For example, the transpose method we saw earlier can also be applied as follows:
multiscale = multiscale.msi.transpose("y", "x", "c")
Other methods implemented this way are reindex, equivalent to the xr.DataArray reindex method, and assign_coords, equivalent to the xr.Dataset assign_coords method.
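For example, a sketch using the accessor's assign_coords to relabel the c coordinate on every scale; the channel names are illustrative:

multiscale = multiscale.msi.assign_coords(c=["channel_0", "channel_1"])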
Store as an Open Microscopy Environment Next-Generation File Format (OME-NGFF) or netCDF store.
It is highly recommended to use dimension_separator='/' in the construction of the Zarr store.
store = zarr.storage.DirectoryStore('multiscale.zarr', dimension_separator='/')
multiscale.to_zarr(store)
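To read the store back, recent xarray releases can open a Zarr hierarchy as a DataTree. A sketch, assuming an xarray version that provides open_datatree:

import xarray as xr

multiscale = xr.open_datatree("multiscale.zarr", engine="zarr")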
Note: The API is under development, and it may change until 1.0.0 is released. We mean it :-).
Contributions are welcome and appreciated.
git clone https://github.com/spatial-image/multiscale-spatial-image
cd multiscale-spatial-image
First install pixi. Then, install project dependencies:
pixi install -a
pixi run pre-commit-install
The unit tests:
pixi run -e test test
The notebook tests:
pixi run test-notebooks
To add new testing data or update existing testing data, such as a new baseline for this block:
dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)
verify_against_baseline(test_data_dir, dataset_name, baseline_name, multiscale)
Add a store_new_image call in your test block:
dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)
store_new_image(dataset_name, baseline_name, multiscale)
verify_against_baseline(dataset_name, baseline_name, multiscale)
Run the tests to generate the output. Remove the store_new_image call.
Then, create a tarball of the current testing data:
cd test/data
tar cvf ../data.tar *
gzip -9 ../data.tar
python3 -c 'import pooch; print(pooch.file_hash("../data.tar.gz"))'
Update the test_data_sha256 variable in the test/_data.py file. Upload the data to web3.storage, and update the test_data_ipfs_cid Content Identifier (CID) variable, which is available in the web3.storage web page interface.
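For illustration, the updated variables in test/_data.py might look like the following; the hash and CID values are placeholders, not real data:

# test/_data.py (placeholder values)
test_data_sha256 = "1a2b3c..."     # output of pooch.file_hash above
test_data_ipfs_cid = "bafybei..."  # CID shown in the web3.storage interface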
We use the standard GitHub flow.
This section is relevant only for maintainers.
Make sure your local repository is up to date with git's main branch. Then:
pixi install -a
pixi run pre-commit-install
pixi run -e test test
pixi shell
hatch version <new-version>
git add .
git commit -m "ENH: Bump version to <version>"
hatch build
hatch publish
git push upstream main