hdmf-dev / hdmf-zarr

Zarr I/O backend for HDMF
https://hdmf-zarr.readthedocs.io/

[Bug]: NWBZarrIO appending #182

Closed: CodyCBakerPhD closed this issue 2 months ago

CodyCBakerPhD commented 3 months ago

What happened?

A minimal example is below.

Appending to an existing file is a basic operation that should have always worked and should have tests.

Steps to Reproduce

from hdmf_zarr import NWBZarrIO
from pynwb.testing.mock.file import mock_NWBFile
from pynwb.testing.mock.base import mock_TimeSeries

with NWBZarrIO(path="./test_zarr_append.nwb.zarr", mode="w") as io:
    nwbfile_out = mock_NWBFile()
    io.write(nwbfile_out)

with NWBZarrIO(path="./test_zarr_append.nwb.zarr", mode="r+") as io:
    nwbfile_in = io.read()
    time_series = mock_TimeSeries()
    nwbfile_in.add_acquisition(time_series)
    io.write(nwbfile_in)

Traceback

File ...\test_zarr_append.py:13
    io.write(nwbfile_in)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf\utils.py:668 in func_call
    return func(args[0], **pargs)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf_zarr\backend.py:270 in write
    super(ZarrIO, self).write(**kwargs)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf\utils.py:668 in func_call
    return func(args[0], **pargs)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf\backends\io.py:99 in write
    self.write_builder(f_builder, **kwargs)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf\utils.py:668 in func_call
    return func(args[0], **pargs)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf_zarr\backend.py:433 in write_builder
    self.write_group(

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf\utils.py:668 in func_call
    return func(args[0], **pargs)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf_zarr\backend.py:523 in write_group
    self.write_group(

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf\utils.py:668 in func_call
    return func(args[0], **pargs)

  File D:\anaconda3\envs\neuroconv_3_10_created_4_2_2024\lib\site-packages\hdmf_zarr\backend.py:518 in write_group
    group = parent.require_group(builder.name)

  File ~\AppData\Roaming\Python\Python310\site-packages\zarr\hierarchy.py:1008 in require_group
    return self._write_op(self._require_group_nosync, name, overwrite=overwrite)

  File ~\AppData\Roaming\Python\Python310\site-packages\zarr\hierarchy.py:935 in _write_op
    return f(*args, **kwargs)

  File ~\AppData\Roaming\Python\Python310\site-packages\zarr\hierarchy.py:1015 in _require_group_nosync
    init_group(

  File ~\AppData\Roaming\Python\Python310\site-packages\zarr\storage.py:675 in init_group
    _init_group_metadata(store=store, overwrite=overwrite, path=path, chunk_store=chunk_store)

  File ~\AppData\Roaming\Python\Python310\site-packages\zarr\storage.py:737 in _init_group_metadata
    store[key] = store._metadata_class.encode_group_metadata(meta)  # type: ignore

  File ~\AppData\Roaming\Python\Python310\site-packages\zarr\storage.py:2975 in __setitem__
    raise ReadOnlyError()

ReadOnlyError: object is read-only

Operating System

Windows

Python Executable

Conda

Python Version

3.10

Package Versions

No response

mavaylon1 commented 3 months ago

I'll take a look after Dev Days

mavaylon1 commented 2 months ago

My initial concern was that the mode is being changed somewhere along the line. However, looking at the trace, I do not see anywhere that the mode changes, and I checked with breakpoints: the mode stays r+. I cloned a dev version of Zarr to see whether this is a Zarr issue or an issue on our end.

mavaylon1 commented 2 months ago

I believe the issue is the metadata consolidation. I am running a check.

mavaylon1 commented 2 months ago

@CodyCBakerPhD try doing this without consolidating metadata in the write; it works:

with NWBZarrIO(path="./test_zarr_append.nwb.zarr", mode="w") as io:
    nwbfile_out = mock_NWBFile()
    io.write(nwbfile_out, consolidate_metadata=False)

Now looking at the Zarr code: when we consolidate metadata, we use a ConsolidatedMetadataStore. This interface does not seem to allow setting items at all; its docstring says it is "read only".
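
For completeness, here is the full append round trip with that workaround applied (same mock objects as in the report; the only change is passing consolidate_metadata=False on the initial write). This is a sketch of the workaround, not a fix:

from hdmf_zarr import NWBZarrIO
from pynwb.testing.mock.file import mock_NWBFile
from pynwb.testing.mock.base import mock_TimeSeries

# Initial write without consolidating metadata, so re-opening in "r+"
# does not go through the read-only ConsolidatedMetadataStore.
with NWBZarrIO(path="./test_zarr_append.nwb.zarr", mode="w") as io:
    nwbfile_out = mock_NWBFile()
    io.write(nwbfile_out, consolidate_metadata=False)

# Re-open in append mode and add a new acquisition.
with NWBZarrIO(path="./test_zarr_append.nwb.zarr", mode="r+") as io:
    nwbfile_in = io.read()
    nwbfile_in.add_acquisition(mock_TimeSeries())
    io.write(nwbfile_in)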

mavaylon1 commented 2 months ago

I will document this explicitly in our codebase so that people do not have to dig through the Zarr internals.

mavaylon1 commented 2 months ago

@oruebel You added

if self.__mode not in ['r', 'r+']:
    # r- is only an internal mode in ZarrIO to force the use of regular open. For Zarr we need to
    # use the regular mode r when r- is specified
    mode_to_use = self.__mode if self.__mode != 'r-' else 'r'
    self.__file = zarr.open(store=self.path,
                            mode=mode_to_use,
                            synchronizer=self.__synchronizer,
                            storage_options=self.__storage_options)
else:
    self.__file = self.__open_file_consolidated(store=self.path,
                                                mode=self.__mode,
                                                synchronizer=self.__synchronizer,
                                                storage_options=self.__storage_options)

I think we should have a condition that uses open when the mode is "r+" and not __open_file_consolidated (this does work). There would also be a warning.
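
Roughly what I have in mind, as an extra branch in the open logic quoted above (just a sketch; the warning message is a placeholder and it assumes warnings is imported at the module level):

if self.__mode == 'r+':
    # 'r+' needs a writable store, so bypass the read-only ConsolidatedMetadataStore
    # and let the user know that consolidated metadata is not being used.
    warnings.warn("Opened in 'r+' mode; consolidated metadata will not be used.")
    self.__file = zarr.open(store=self.path,
                            mode=self.__mode,
                            synchronizer=self.__synchronizer,
                            storage_options=self.__storage_options)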

oruebel commented 2 months ago

I think we should have a condition that uses open when the mode is "r+"

Agreed. I think we should just change the check to if self.__mode == 'r' so that we only open with consolidated metadata when the file is in read-only mode, and use the regular zarr.open otherwise.
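
Something along these lines, keeping the rest of the block the same (a rough sketch, not a tested patch):

if self.__mode != 'r':
    # Only plain read-only mode goes through the (read-only) consolidated store.
    # 'r+', 'w', 'a', and the internal 'r-' all use the regular zarr.open;
    # r- still maps to the regular mode r, as before.
    mode_to_use = self.__mode if self.__mode != 'r-' else 'r'
    self.__file = zarr.open(store=self.path,
                            mode=mode_to_use,
                            synchronizer=self.__synchronizer,
                            storage_options=self.__storage_options)
else:
    self.__file = self.__open_file_consolidated(store=self.path,
                                                mode=self.__mode,
                                                synchronizer=self.__synchronizer,
                                                storage_options=self.__storage_options)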