bluesky / tiled

API to structured data
https://blueskyproject.io/tiled
BSD 3-Clause "New" or "Revised" License

Support sub-chunk slicing in the Python client #610

Open danielballan opened 8 months ago

danielballan commented 8 months ago

The Tiled server supports arbitrary slicing, like:

GET /api/v1/array/full/{path}?slice=..

and

GET /api/v1/array/block/{path}?block=...&slice=...

The Tiled Python client does not take full advantage of this yet. It always fetches whole chunks and then performs any further (sub-chunk) slicing on the client side, so the server reads and transmits data that the client immediately discards. This waste was accepted in the early days of development to keep the implementation simple and solidly correct.

Now it makes sense to revisit this and specify the precise slice of interest in the request, so that the server reads and transmits only the necessary data. The change must include thorough unit testing, as slicing can involve subtle book-keeping.

The relevant code is:

https://github.com/bluesky/tiled/blob/204676bbb841d252c1413abfe2646c3b7d27a525/tiled/client/array.py#L67-L126

I would guess that the library ndindex may be useful for implementing this correctly. And maybe (maybe!) Slice.as_subindex is the right tool for mapping a slice of the whole array into the coordinate systems of the individual chunks. Needs more investigation.
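
To make the book-keeping concrete, here is a minimal, dependency-free sketch of the mapping described above. It is not the Tiled client code and it does not use ndindex. The function name `split_slice_over_chunks` is hypothetical, and the sketch assumes a single dimension and a positive step. A real implementation would need to handle N dimensions, negative steps, and integer indices, which is where ndindex might help.

```python
def split_slice_over_chunks(chunks, sl):
    """Map a 1-D slice of the whole array onto per-chunk local slices.

    chunks: tuple of chunk lengths along one axis, e.g. (4, 4, 4)
    sl: slice in whole-array coordinates (positive step only)
    Returns a list of (chunk_index, local_slice), one entry for each
    chunk that contributes at least one element.
    """
    total = sum(chunks)
    # Normalize None and negative bounds against the full axis length.
    start, stop, step = sl.indices(total)
    if step < 0:
        raise NotImplementedError("negative steps are left out of this sketch")
    pieces = []
    lo = 0  # offset of the current chunk in whole-array coordinates
    for i, size in enumerate(chunks):
        hi = lo + size
        if start >= lo:
            first = start
        else:
            # First selected index at or after lo (integer ceiling division).
            first = start + -(-(lo - start) // step) * step
        bound = min(stop, hi)
        if first < bound:
            # Shift into this chunk's own coordinate system.
            pieces.append((i, slice(first - lo, bound - lo, step)))
        lo = hi
    return pieces


# Elements 2, 5, 8 of a length-12 axis split into three chunks of 4.
print(split_slice_over_chunks((4, 4, 4), slice(2, 11, 3)))
# [(0, slice(2, 4, 3)), (1, slice(1, 4, 3)), (2, slice(0, 3, 3))]
```

Each local slice could then be sent as the `slice` parameter of a block request instead of fetching the whole block. A unit test can check that concatenating the per-chunk results equals slicing the full array directly.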

danielballan commented 8 months ago

This may be of interest to @taxe10.

hyperrealist commented 3 weeks ago

I am taking a look at this issue. Slice.as_subindex does look promising, but your link from 2023 seems broken. Here's the new link for reference:

https://quansight-labs.github.io/ndindex/api/internal.html#ndindex.ndindex.NDIndex.as_subindex