bioio-devs / bioio-base

Typing, base classes, and more for BioIO projects.
https://bioio-devs.github.io/bioio-base/
BSD 3-Clause "New" or "Revised" License

optimize get_image_data and get_image_dask_data for large images #22

Open toloudis opened 1 week ago

toloudis commented 1 week ago

Feature Description

Related to #21. For basically every image (in the base reader) we pre-create a 5D dask array containing the entire image. This can be painfully slow for very large images.

Can we skip that step and just respond on-demand to requests from get_image_data and get_image_dask_data without having to pre-create the full dask array?

Use Case

We have several sldy images (for example) for which getting dims or extracting a small sub-chunk is very slow.

Solution

Ideal solution: readers can extract subsets of the array data without ever iterating the entire image.
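
To make the idea concrete, here is a rough sketch of on-demand access using only NumPy. Everything in it is hypothetical and not part of bioio-base: the `OnDemandReader` class, its `get_subset` method, the `_read_plane` stand-in for real file I/O, and the `planes_read` counter exist only for illustration. It assumes a TCZYX layout and a format that can fetch a single YX plane cheaply.

```python
import numpy as np


class OnDemandReader:
    """Hypothetical reader that fetches only the planes a request touches."""

    def __init__(self, shape, dtype=np.uint16):
        # shape is (T, C, Z, Y, X). Nothing is read or allocated here,
        # so construction cost does not depend on the image size.
        self.shape = tuple(shape)
        self.dtype = np.dtype(dtype)
        self.planes_read = 0  # illustration only: counts simulated reads

    def _read_plane(self, t, c, z):
        # Stand-in for real file I/O (assumption): returns one YX plane
        # filled with a value derived from its (t, c, z) index.
        self.planes_read += 1
        _, n_c, n_z, n_y, n_x = self.shape
        value = (t * n_c + c) * n_z + z
        return np.full((n_y, n_x), value, dtype=self.dtype)

    def get_subset(self, t, c, z, y=slice(None), x=slice(None)):
        """Return a TCZYX array covering only the requested indices.

        t, c, z are sequences of integer indices; y and x are slices.
        """
        n_y = len(range(*y.indices(self.shape[3])))
        n_x = len(range(*x.indices(self.shape[4])))
        out = np.empty((len(t), len(c), len(z), n_y, n_x), dtype=self.dtype)
        for i, ti in enumerate(t):
            for j, ci in enumerate(c):
                for k, zi in enumerate(z):
                    # Read one plane, then crop it to the requested window.
                    out[i, j, k] = self._read_plane(ti, ci, zi)[y, x]
        return out


# A "large" image: 100 timepoints x 4 channels x 50 slices = 20000 planes.
reader = OnDemandReader(shape=(100, 4, 50, 512, 512))
sub = reader.get_subset(
    t=[0], c=[1], z=range(10, 12), y=slice(0, 64), x=slice(0, 64)
)
print(sub.shape, reader.planes_read)
```

In this sketch the request touches two planes out of 20000, and nothing proportional to the full image is built up front. A real reader would still need to know the dims without scanning the file, which depends on the format's metadata.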

Alternatives

toloudis commented 1 week ago

Some internal users are semi-blocked: they think the large images are hanging or crashing, but really initialization is just taking forever. See #21 and https://github.com/bioio-devs/bioio-sldy/issues/15

toloudis commented 1 week ago

https://github.com/bioio-devs/bioio-sldy/pull/16 shows that you have to be careful how you construct the reader too. That fix should unblock our internal use of big sldy files in the near term.