ome / ome-zarr-py

Implementation of next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
https://pypi.org/project/ome-zarr

writer function to go from large-zarr to ome-zarr (resolution levels + metadata) #283

Open CamachoDejay opened 1 year ago

CamachoDejay commented 1 year ago

Dear ome team,

I recently came across an interesting use case. I have a couple of microscopy software tools that export very large images as a series of .tiff (or .bmp, etc.) files accompanied by a metadata file (.txt, .csv, or .json) that contains the relative positions of the images (usually as centre positions).

I therefore started by creating a very large .zarr array, along the lines of:

import zarr

# helperFunction, store, chunk_size, tile_type, df and image_reader are defined elsewhere
total_x, total_y = helperFunction(path2metadata)

z = zarr.creation.open_array(store=store, mode='a', shape=(total_y, total_x), chunks=(chunk_size, chunk_size), dtype=tile_type)

# df is a data frame containing the tile limits
for tile_idx, row in df.iterrows():
    tile = image_reader(row['tile_path'])
    x1, x2, y1, y2 = row['tile_lims']
    z[y1:y2, x1:x2] = tile
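For readers wondering how the canvas size and tile limits might be derived from centre positions, here is a minimal sketch. It is not the helper used above: the `Tile` class and `tile_limits` function are hypothetical, and it assumes the centre positions have already been converted to pixel units.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    """Hypothetical record for one exported tile (positions in pixels)."""
    path: str
    centre_x: float
    centre_y: float
    width: int
    height: int


def tile_limits(tiles):
    """Return (total_x, total_y, limits), with limits[i] = (x1, x2, y1, y2)."""
    # Top-left corner of every tile in the original coordinate frame.
    lefts = [round(t.centre_x - t.width / 2) for t in tiles]
    tops = [round(t.centre_y - t.height / 2) for t in tiles]
    # Shift everything so the smallest corner sits at the array origin.
    x0, y0 = min(lefts), min(tops)
    limits = []
    for t, left, top in zip(tiles, lefts, tops):
        x1, y1 = left - x0, top - y0
        limits.append((x1, x1 + t.width, y1, y1 + t.height))
    # The canvas must reach the far edge of the outermost tile.
    total_x = max(lim[1] for lim in limits)
    total_y = max(lim[3] for lim in limits)
    return total_x, total_y, limits


tiles = [
    Tile("tile_0.tiff", 50.0, 50.0, 100, 100),
    Tile("tile_1.tiff", 140.0, 50.0, 100, 100),  # overlaps tile_0 by 10 px in x
    Tile("tile_2.tiff", 50.0, 140.0, 100, 100),  # overlaps tile_0 by 10 px in y
]
total_x, total_y, limits = tile_limits(tiles)
```

Where tiles overlap, the last tile written simply overwrites the shared region; blending would need extra logic.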

Once I have the large .zarr, it would be ideal if ome-zarr-py could take this array as input and write an ome-zarr folder, similar to ome_zarr.writer.write_image.

At the moment I had to use dask to create my own down-sampling scheme. Then I had to create appropriate minimal metadata for the ome-zarr (which was not trivial; more examples of this would be great). Finally, I used write_multiscales_metadata to write the ome-zarr. A minimal example is available here: https://github.com/CamachoDejay/teaching-bioimage-analysis-python/blob/main/short_examples/zarr-from-tiles/zarr-minimal-example-tiles.ipynb
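As an illustration of what "minimal metadata" can look like, the sketch below builds a multiscales block for a 2D image in plain Python. It assumes the NGFF v0.4 layout, and the function name, default unit and pixel size are made up for the example; it is not taken from the linked notebook.

```python
import json


def multiscales_metadata(n_levels, pixel_size_yx, unit="micrometer", factor=2, name="image"):
    """Build a minimal NGFF v0.4-style multiscales dict for a 2D (y, x) image.

    Assumes level i is stored at path str(i) and is downsampled by
    factor**i along both axes relative to level 0.
    """
    axes = [
        {"name": "y", "type": "space", "unit": unit},
        {"name": "x", "type": "space", "unit": unit},
    ]
    datasets = []
    for level in range(n_levels):
        zoom = factor ** level
        datasets.append(
            {
                "path": str(level),
                "coordinateTransformations": [
                    # Physical size of one pixel at this resolution level.
                    {
                        "type": "scale",
                        "scale": [pixel_size_yx[0] * zoom, pixel_size_yx[1] * zoom],
                    }
                ],
            }
        )
    return {
        "multiscales": [
            {"version": "0.4", "name": name, "axes": axes, "datasets": datasets}
        ]
    }


# Three levels with an assumed pixel size of 0.5 micrometer at full resolution.
meta = multiscales_metadata(3, (0.5, 0.5))
print(json.dumps(meta, indent=2))
```

The `axes` and `datasets` lists have the same structure as the ones passed to write_multiscales_metadata.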

PS: I forgot to link this interesting discussion: https://github.com/napari/napari/issues/5561#issuecomment-1478021044

will-moore commented 1 year ago

What you're doing in that example script doesn't seem so bad. There's some example code for writing big images to OME-Zarr (from OMERO) at https://github.com/ome/omero-cli-zarr/blob/2a7a512ab4c15d9607935560f00b51fedbf5fc79/src/omero_zarr/raw_pixels.py#L59

You should be able to use write_image() instead of write_multiscales(), e.g. https://ome-zarr.readthedocs.io/en/stable/python.html#writing-ome-ngff-images. See https://ome-zarr.readthedocs.io/en/stable/api/writer.html#ome_zarr.writer.write_image: it can take a dask array and perform downsampling to generate a multiscale image.

joshmoore commented 1 year ago

@CamachoDejay: any feeling for whether that will get you what you need?