ome / ome-zarr-py

Implementation of next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
https://pypi.org/project/ome-zarr

Cannot overwrite Zarr opened with `mode= "a"` #376

Open chourroutm opened 2 months ago

chourroutm commented 2 months ago

To my understanding, zarr supports overwriting existing data when a store is opened with mode="a". However, this does not work with ome-zarr-py.

Example code

import numpy as np
import zarr
import ome_zarr.io, ome_zarr.writer

vol = np.zeros((512,512,512))
store = ome_zarr.io.parse_url("all_zeros.ome.zarr",mode="a").store
root = zarr.group(store=store)
ome_zarr.writer.write_image(image=vol, group=root, axes="zyx",chunks=128)

store2 = ome_zarr.io.parse_url("all_zeros.ome.zarr",mode="a").store
root2 = zarr.group(store=store2)
ome_zarr.writer.write_image(image=vol, group=root2, axes="zyx",chunks=128)

What I expect

To be able to overwrite the file all_zeros.ome.zarr when I specify mode= "a".

What I get

ContainsArrayError                        Traceback (most recent call last)
Cell In[10], line 8
      6 store2 = ome_zarr.io.parse_url("all_zeros.ome.zarr",mode="a").store
      7 root2 = zarr.group(store=store2)
----> 8 ome_zarr.writer.write_image(image=vol, group=root2, axes="zyx",chunks=128)

[...]

File .venv\Lib\site-packages\zarr\storage.py:523, in _init_array_metadata(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec, dimension_separator, storage_transformers)
    521 if not overwrite:
    522     if contains_array(store, path):
--> 523         raise ContainsArrayError(path)
    524     elif contains_group(store, path, explicit_only=False):
    525         raise ContainsGroupError(path)

ContainsArrayError: path '0' contains an array

(Error log: full_error.txt)

dstansby commented 1 month ago

I can reproduce; here's the full traceback:

Traceback (most recent call last):
  File "/Users/dstansby/software/zarr/ome-zarr-py/test.py", line 12, in <module>
    ome_zarr.writer.write_image(image=vol, group=root2, axes="zyx",chunks=128)
  File "/Users/dstansby/software/zarr/ome-zarr-py/ome_zarr/writer.py", line 516, in write_image
    dask_delayed_jobs = write_multiscale(
                        ^^^^^^^^^^^^^^^^^
  File "/Users/dstansby/software/zarr/ome-zarr-py/ome_zarr/writer.py", line 267, in write_multiscale
    group.create_dataset(str(path), data=data, chunks=chunks_opt, **options)
  File "/Users/dstansby/miniconda3/envs/zarr/lib/python3.12/site-packages/zarr/hierarchy.py", line 1111, in create_dataset
    return self._write_op(self._create_dataset_nosync, name, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dstansby/miniconda3/envs/zarr/lib/python3.12/site-packages/zarr/hierarchy.py", line 952, in _write_op
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/Users/dstansby/miniconda3/envs/zarr/lib/python3.12/site-packages/zarr/hierarchy.py", line 1126, in _create_dataset_nosync
    a = array(data, store=self._store, path=path, chunk_store=self._chunk_store, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dstansby/miniconda3/envs/zarr/lib/python3.12/site-packages/zarr/creation.py", line 441, in array
    z = create(**kwargs)
        ^^^^^^^^^^^^^^^^
  File "/Users/dstansby/miniconda3/envs/zarr/lib/python3.12/site-packages/zarr/creation.py", line 209, in create
    init_array(
  File "/Users/dstansby/miniconda3/envs/zarr/lib/python3.12/site-packages/zarr/storage.py", line 455, in init_array
    _init_array_metadata(
  File "/Users/dstansby/miniconda3/envs/zarr/lib/python3.12/site-packages/zarr/storage.py", line 524, in _init_array_metadata
    raise ContainsArrayError(path)
zarr.errors.ContainsArrayError: path '0' contains an array

joshmoore commented 3 weeks ago

Without looking more deeply, I assume that the Store is getting re-created and not all flags are being passed through. Could you test whether the same happens when passing an FSStore, a la https://github.com/ome/ome-zarr-py/pull/349?

will-moore commented 3 weeks ago

@joshmoore #349 allows you to use an existing store to create a ZarrLocation (e.g. via parse_url()) but if you've created your own store, then you don't need to create a ZarrLocation in this example, since you just need to use the store directly.

Anyway, I tried using the manually-created store and got the same error as above:

import zarr
import ome_zarr.io, ome_zarr.writer
import numpy as np

vol = np.zeros((512,512,512))
store = ome_zarr.io.parse_url("all_zeros.ome.zarr",mode="a").store
root = zarr.group(store=store)
ome_zarr.writer.write_image(image=vol, group=root, axes="zyx",chunks=128)

store2 = zarr.storage.FSStore(url="all_zeros.ome.zarr", mode="w")
# This would just return the same store
# store2 = ome_zarr.io.parse_url(store2, mode="w").store
root2 = zarr.group(store=store2)
ome_zarr.writer.write_image(image=vol, group=root2, axes="zyx",chunks=128)
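
A possible workaround, sketched under the assumption that the store is a plain local directory and that discarding its entire contents is acceptable: remove the directory before writing again. `clear_local_store` is a hypothetical helper, not part of ome-zarr-py or zarr, and the demonstration uses a throwaway temporary directory rather than a real Zarr store.

```python
import shutil
import tempfile
from pathlib import Path


def clear_local_store(path):
    """Delete a local Zarr directory if it exists (hypothetical helper).

    Returns True if a directory was removed, False otherwise.
    """
    target = Path(path)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False


# Demonstration: a temporary directory stands in for "all_zeros.ome.zarr".
demo_root = Path(tempfile.mkdtemp())
demo_store = demo_root / "all_zeros.ome.zarr"
(demo_store / "0").mkdir(parents=True)           # fake the existing array "0"
(demo_store / "0" / ".zarray").write_text("{}")  # placeholder metadata file

removed_first = clear_local_store(demo_store)    # the directory existed
removed_second = clear_local_store(demo_store)   # nothing left to remove
shutil.rmtree(demo_root)                         # clean up the temporary parent
```

In the examples above, this would mean calling clear_local_store("all_zeros.ome.zarr") before opening the store a second time. It does not apply to remote stores, and it removes everything under that path, not only the array at '0'.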