[v3] `Buffer` ensure correct subclass based on the `BufferPrototype` argument

madsbk commented 4 weeks ago

TODO:

[x] Add unit tests and/or doctests in docstrings
[x] Add docstrings and API docs for any new/modified user-facing classes and functions
[x] New/modified features documented in docs/tutorial.rst
[ ] Changes documented in docs/release.rst
[x] GitHub Actions have all passed
[ ] Test coverage is 100% (Codecov passes)

d-v-b commented 3 weeks ago

I have a broader question about the relationship between the store API and different flavors of buffer (this is out of scope for this PR, but the answer would help me interpret the effort here).

Taking `MemoryStore.get` as an example, we have:

async def get(
    self,
    key: str,
    prototype: BufferPrototype,
    byte_range: tuple[int | None, int | None] | None = None,
) -> Buffer | None:
    ...

`key` and `byte_range` parametrize the resource that the store is going to get; we can think of this as basically a query against a {keys: seekable_bytes} map-abstraction of a storage backend. But `prototype` is different -- it determines the concrete return type of `get`.

Do we expect arrays / groups to use the `prototype` parameter dynamically? For example, is the same array instance going to call `store.get(prototype=GPUBuffer)` for some chunks, and `store.get(prototype=CPUBuffer)` for other chunks? Alternatively, we might want to constrain a store instance to always use a fixed buffer type, in which case maybe the prototype should be an attribute of the store class? Sorry if I missed a discussion that explains the design here.
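To make the question concrete, here is a minimal sketch of a store whose `get` picks its return class from the `prototype` argument. Everything in it is hypothetical: `ToyBuffer`, `ToyCPUBuffer`, `ToyGPUBuffer`, `ToyPrototype` and `ToyMemoryStore` are stand-ins written for illustration, not the zarr-python classes, and the toy treats `byte_range` as a plain `(start, stop)` slice, which is an assumption rather than the library's semantics.

```python
from __future__ import annotations

import asyncio
from typing import NamedTuple


class ToyBuffer:
    """Hypothetical stand-in for a buffer class (not zarr's Buffer)."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    @classmethod
    def from_bytes(cls, data: bytes) -> ToyBuffer:
        # `cls` is whichever subclass the caller selected via the prototype.
        return cls(data)


class ToyCPUBuffer(ToyBuffer):
    """Pretend host-memory buffer."""


class ToyGPUBuffer(ToyBuffer):
    """Pretend device buffer; a real one would copy the bytes to GPU memory."""


class ToyPrototype(NamedTuple):
    """Hypothetical prototype: names the buffer class a caller wants back."""

    buffer: type[ToyBuffer]


class ToyMemoryStore:
    """Toy {key: bytes} store; the prototype only affects the return type."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    async def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    async def get(
        self,
        key: str,
        prototype: ToyPrototype,
        byte_range: tuple[int | None, int | None] | None = None,
    ) -> ToyBuffer | None:
        raw = self._data.get(key)
        if raw is None:
            return None
        # Assumption: byte_range is interpreted as a (start, stop) slice.
        start, stop = byte_range if byte_range is not None else (None, None)
        return prototype.buffer.from_bytes(raw[start:stop])


async def demo() -> tuple[ToyBuffer | None, ToyBuffer | None]:
    store = ToyMemoryStore()
    await store.set("c/0", b"hello world")
    # Same store, same key, two different concrete return types.
    cpu = await store.get("c/0", ToyPrototype(ToyCPUBuffer))
    gpu = await store.get("c/0", ToyPrototype(ToyGPUBuffer), byte_range=(0, 5))
    return cpu, gpu
```

In this sketch the store holds no buffer type of its own, so making the prototype a store attribute would fix the return type for every call on that instance.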

madsbk commented 3 weeks ago

> Do we expect arrays / groups to use the `prototype` parameter dynamically? For example, is the same array instance going to call `store.get(prototype=GPUBuffer)` for some chunks, and `store.get(prototype=CPUBuffer)` for other chunks?

Yes, e.g. for GPU-backed arrays we want the metadata to use CPUBuffer and the data to use GPUBuffer. Furthermore, the end user might also want to load some parts of the array into a NumPy array and other parts into a CuPy array.
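A rough sketch of that split, under stated assumptions: metadata is always fetched into a CPU buffer so it can be parsed as JSON on the host, while the caller chooses the buffer class for chunk data. The classes, the `fetch` and `open_and_read` helpers, and the `chunk_keys` metadata field are all hypothetical and only illustrate the idea; they are not the zarr-python implementation.

```python
from __future__ import annotations

import json


class ToyBuffer:
    """Hypothetical stand-in for a buffer class (not zarr's Buffer)."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    def to_bytes(self) -> bytes:
        return self.data


class ToyCPUBuffer(ToyBuffer):
    """Pretend host-memory buffer."""


class ToyGPUBuffer(ToyBuffer):
    """Pretend device buffer; a real one would hold device memory."""


def fetch(blobs: dict[str, bytes], key: str, buffer_cls: type[ToyBuffer]) -> ToyBuffer:
    # The requested class decides the concrete type of the result.
    return buffer_cls(blobs[key])


def open_and_read(
    blobs: dict[str, bytes],
    data_buffer_cls: type[ToyBuffer],
) -> tuple[dict, ToyBuffer, dict[str, ToyBuffer]]:
    # Metadata: always a CPU buffer, because it is parsed on the host.
    meta_buf = fetch(blobs, "zarr.json", ToyCPUBuffer)
    meta = json.loads(meta_buf.to_bytes())
    # Chunk data: whatever buffer class the caller asked for.
    # "chunk_keys" is a made-up field used only by this toy example.
    chunks = {k: fetch(blobs, k, data_buffer_cls) for k in meta["chunk_keys"]}
    return meta, meta_buf, chunks
```

Calling `open_and_read(blobs, ToyGPUBuffer)` and then `open_and_read(blobs, ToyCPUBuffer)` on the same `blobs` mirrors the end-user case of loading some parts to the device and others to the host.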

d-v-b commented 3 weeks ago

thanks! that's very helpful

madsbk commented 3 weeks ago

Thanks @d-v-b!

zarr-developers/zarr-python, pull request #1974