Open d-v-b opened 1 month ago
so i figured out where this is coming from. i'm not sure if it's a bug or not.
is_open
to False
.a[:]
invokes MemoryStore.get()
, which hits this line:
if not self._is_open:
await self._open()
Store._open()
does different things depending on its mode
parameter. it seems that the default mode
is AccessMode(readonly=False, overwrite=True, create=True, update=False)
. Because overwrite
is True
here, Store._open
wipes the store before fetching data. I don't fully understand the open / closed semantics here. My intuition is that re-opening a closed store feels weird, and having the store._open()
routine invoked outside of Store.__init__
feels incorrect.
@brokkoli71 what's the logic for invoking Store._open
in the get
and set
methods? I feel like it might be cleaner to only open the store in 1 place (__init__
), and attempting to call Store.get/set
on a closed store should simply error.
tests are now passing with this PR. Instead of conditionally calling store._open()
in store.get
or store.set
, a RuntimeError
is raised when attempting to use these methods on a store that is closed. I also adjusted all the tests to use sync(Store.open())
instead of Store()
.
I think we should not merge this until we have a conversation about the store design. I would really like to be able to create store classes in synchronous code without needing sync
, but we can't do that now because (some) store initialization requires IO, which is async.
@jhamman and @normanrz any thoughts about this issue (i.e., that we cannot fully initialize a store class in synchronous code without sync
)?
So is the bug here thati was reusing the array object I wrote to? But instead I should create a new one?
So is the bug here thati was reusing the array object I wrote to? But instead I should create a new one?
The bug was that the store class self-destructs if a) it was created with overwrite=True
, and b) you attempt to do any array indexing after closing the store.
@d-v-b @dcherian
Store._open()
does different things depending on itsmode
parameter. it seems that the defaultmode
isAccessMode(readonly=False, overwrite=True, create=True, update=False)
. Becauseoverwrite
isTrue
here,Store._open
wipes the store before fetching data.
In this example the mode was explicitly set to "w"
which will overwrite the data when opening a store. The default value of MemoryStore()
is mode="r"
which will raise a ValueError
in this code snippet as it does not allow writing. I think, what the code snippet was trying to achieve was for reusing a closed array which can be done with mode "r+"
or "a".
I'd say this illustrates that the effect of the AccessModeLiterals should be communicated better. Currently only the asynchronous open method contains docstrings explaining:
mode : {'r', 'r+', 'a', 'w', 'w-'}, optional
Persistence mode: 'r' means read only (must exist); 'r+' means
read/write (must exist); 'a' means read/write (create if doesn't
exist); 'w' means create (overwrite if exists); 'w-' means create
(fail if exists).
Furthermore, the examples in the docstrings of array.py suggest using mode="w"
. I think overwriting stores on opening them has it's usecase but should not be communicated as a general good idea.
The other problem with implicitly opening a store is a problem, too. I agree with the comments above.
Edit:
Also, in store/core.py::make_store_path the mode will fallback to "w"
:
elif store_like is None:
if mode is None:
mode = "w" # exception to the default mode = 'r'
I think it make sense here to fallback to "a"
, too
I had a similar thought process while working on https://github.com/zarr-developers/zarr-python/pull/2000. On the one hand sync(Store.open())
seems a bit much for the API, on the other hand implicitly calling _open
might cause confusion. Currently the only modification of _open
on the array is clearing it if overwrite
is enabled. Maybe if we make that behavior more clear to the user that might not be a problem? what do you think? @d-v-b
Over in #1746 @dcherian discovered the following bug:
Setting data with
a[:] = nparray
, followed bya.store_path.store.close()
, followed bya[:]
, results in the store getting wiped. This is not intended behavior.this PR adds:
TODO: