fractal-analytics-platform / fractal-vizarr-viewer

Prototype to explore serving/viewing zarr data
BSD 3-Clause "New" or "Revised" License

Open issue in vizarr about viewing wells #24

Open tcompa opened 2 months ago

tcompa commented 2 months ago

We currently open ome-zarr images, and it works. When we go one level up (ome-zarr well), it does not work. We should reproduce this issue locally and identify where the problem comes from, and we should open the issue on vizarr once we are able to observe it on some publicly-available dataset.

jluethi commented 2 months ago

To have a concrete example (within the UZH network & with Fractal authentication): We can load https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr/B/09/0

But https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr/B/09 does not appear to load.

zonia3000 commented 2 months ago

It looks like it is trying to load something but gets overloaded. For the second link I see more than 400 requests, each one larger than 5 MB.

image

Do you expect that it loads so many items?

tcompa commented 2 months ago

5 MB is the size of a single image (roughly), i.e. a single zarr chunk

the full-resolution array has this .zarray:

{
    "chunks": [
        1,
        1,
        2160,
        2560
    ],
    "compressor": {
        "blocksize": 0,
        "clevel": 5,
        "cname": "lz4",
        "id": "blosc",
        "shuffle": 1
    },
    "dimension_separator": "/",
    "dtype": "<u2",
    "fill_value": 0,
    "filters": null,
    "order": "C",
    "shape": [
        3,
        19,
        19440,
        20480
    ],
    "zarr_format": 2
}

which suggests about 7000 chunks. The actual number of chunks on disk is 4104, likely because some chunks are empty.

loading 4000 files of up to 5M each won't really work, I guess
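
As a sanity check on the chunk count, the chunk grid implied by the .zarray above can be computed directly. The sketch below divides each axis of "shape" by the matching "chunks" entry, rounding up. The helper names chunkGrid and product are illustrative only (they are not part of vizarr or any zarr library), and the axis order (c, z, y, x) is an assumption. With this metadata the grid comes out as 3 x 19 x 9 x 8 = 4104 chunks, which equals the number of files found on disk, so the estimate of about 7000 looks too high.

```typescript
// Number of chunks along each axis: ceil(axis size / chunk size).
function chunkGrid(shape: number[], chunks: number[]): number[] {
  if (shape.length !== chunks.length) {
    throw new Error("shape and chunks must have the same rank");
  }
  return shape.map((size, axis) => Math.ceil(size / chunks[axis]));
}

function product(values: number[]): number {
  return values.reduce((acc, v) => acc * v, 1);
}

// Values copied from the .zarray above; axis order assumed to be (c, z, y, x).
const shape = [3, 19, 19440, 20480];
const chunks = [1, 1, 2160, 2560];

const grid = chunkGrid(shape, chunks); // [3, 19, 9, 8]
const totalChunks = product(grid); // 4104

// dtype "<u2" is 2 bytes per pixel, so one uncompressed chunk holds:
const bytesPerChunk = product(chunks) * 2; // 11059200 bytes (about 10.5 MiB)
```

An uncompressed chunk of about 11 MB would be consistent with the roughly 5 MB per request seen in the network tab, given the blosc/lz4 compression.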

tcompa commented 2 months ago

> To have a concrete example (within the UZH network & with Fractal authentication): We can load https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr/B/09/0
>
> But https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr/B/09 does not appear to load.

This seems to be a resolution issue.

If we use a smaller dataset, vizarr does load the well:

https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/examples/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03

tcompa commented 2 months ago

the question then becomes whether vizarr can load a well at a given resolution level

jluethi commented 2 months ago

> 400 requests, each one having more than 5MB.

Ah, it looks like vizarr doesn't have multi-resolution support for wells then. It also doesn't have this on the plate level. But on the plate level, it defaults to loading the lowest resolution. Apparently it defaults to the highest resolution on the well level... That's an interesting choice.

On the image level, it dynamically loads the best resolution for the current zoom level.

tcompa commented 2 months ago

> But on the plate level, it defaults to loading the lowest resolution. Apparently it defaults to the highest resolution on the well level

Yes, this is confirmed by looking at https://github.com/hms-dbmi/vizarr/blob/main/src/ome.ts:

// in loadPlate:
  // Lowest resolution is the 'path' of the last 'dataset' from the first multiscales
  const { datasets } = imgAttrs.multiscales[0];
  const resolution = datasets[datasets.length - 1].path;

// in loadWell
  utils.assert(utils.isMultiscales(imgAttrs), "Path for image is not valid.");
  let resolution = imgAttrs.multiscales[0].datasets[0].path;

> it looks like vizarr doesn't have multi-resolution support for wells then.

This is not fully clear yet - more on this later.

jluethi commented 2 months ago

Why would it request 400 chunks?

We have an array of shape [3, 19, 19440, 20480]. The second dimension is Z. We only load a single Z plane, but all the channels (3) & all the xy (=> 9x8 chunks).

Therefore, I'd expect 9x8x3 chunks to get loaded => 216 chunks

zonia3000 commented 2 months ago

> Why would it request 400 chunks?

When I took the screenshot, the application hadn't finished loading the page yet. It was still creating more requests.

jluethi commented 2 months ago

I let it run for a while on my end with the network console open. It did run 445 requests. But even when all of them finished after 1.1 min, it did not display anything

Screenshot 2024-09-19 at 16 04 59

jluethi commented 2 months ago

Could be related to these warnings in the console though: WebGL: INVALID_VALUE: texImage2D: width or height out of range

jluethi commented 2 months ago

My hypothesis: Similar to the napari limit for max image size, vizarr has a comparable limit imposed by WebGL. Instead of downsampling the image (like napari does), it just doesn't show anything in that case.

Thus, what happens in the well case is:

  1. It loads the full resolution chunks. That takes a while (~1 min on my network)
  2. When it has loaded the whole well at full resolution, it cannot display it because the resulting image is too large => it shows black

The texImage2D warning does not show up for smaller wells like the one from Tommaso above
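
This hypothesis can be checked with simple arithmetic once the device's texture limit is known. In a browser, that limit is reported by gl.getParameter(gl.MAX_TEXTURE_SIZE); the sketch below takes it as a plain number so that it runs outside a browser too. The value 16384 in the example is only an assumed, commonly reported limit and was not measured in this thread. The helper names are hypothetical, and the sketch assumes that the whole well is uploaded as a single texture and that each pyramid level halves width and height, neither of which has been confirmed for vizarr.

```typescript
// True if a width x height image fits in a single WebGL texture.
function fitsInTexture(width: number, height: number, maxTextureSize: number): boolean {
  return width <= maxTextureSize && height <= maxTextureSize;
}

// Smallest number of 2x downsampling steps after which the image fits.
function levelsNeeded(width: number, height: number, maxTextureSize: number): number {
  if (maxTextureSize < 1) {
    throw new Error("maxTextureSize must be at least 1");
  }
  let level = 0;
  while (!fitsInTexture(width, height, maxTextureSize)) {
    width = Math.ceil(width / 2);
    height = Math.ceil(height / 2);
    level += 1;
  }
  return level;
}

// Assumed limit; query gl.MAX_TEXTURE_SIZE in the browser for the real value.
const assumedLimit = 16384;

// Full-resolution well from the .zarray above: 20480 wide, 19440 high.
const fullResFits = fitsInTexture(20480, 19440, assumedLimit); // false
const firstFittingLevel = levelsNeeded(20480, 19440, assumedLimit); // 1
```

Under these assumptions the full-resolution well exceeds the limit in both dimensions, which would match the texImage2D warning, while the first downsampled level would already fit.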

tcompa commented 2 months ago

Re: number of requests

I confirm that there exist 216 chunk files for the Z plane with index 0 on disk - for the B/09 well. (somewhere above I forgot to consider that we are not loading all Z planes at once)

But note that the number of requests is higher than the number of chunks, as not every single request is to load a chunk. For instance, an example with 6 chunks to load leads to 18 requests: image

tcompa commented 2 months ago

> It loads the full resolution chunks

This is what we should understand better. It clearly does so (as in resolution = imgAttrs.multiscales[0].datasets[0].path), but it's unclear whether this is a minor bug that we could fix by setting it to something else, or whether it is done for some other reason.

jluethi commented 2 months ago

> But note that the number of requests is higher of the number of chunks, as not every single request is to load a chunk.

Yes. But that example contains 6 requests for actual chunks, the other 12 are for .zarray and other stuff

jluethi commented 2 months ago

> This is what we should understand better. It clearly does so (as in resolution = imgAttrs.multiscales[0].datasets[0].path), but it's unclear whether it's a minor bug and we can set it to something else or for some other reason.

I suspect it's just a current design limitation, i.e. the viewer was made to work on example data by loading the lowest resolution on the plate level and the highest resolution per well.

I need to explore how it handles the case of multiple images per well. Stitching the images in a well into one big image is not super typical. We do it, FAIM at FMI does it, the Allen Cell people do it. But many public datasets still store each well as many separate images of approximately 2000x2000 => in that case, maybe it's fine to load at the highest resolution, even if it's not very performant?

tcompa commented 2 months ago

> > But note that the number of requests is higher of the number of chunks, as not every single request is to load a chunk.
>
> Yes. But that example contains 6 requests for actual chunks, the other 12 are for .zarray and other stuff

For the screenshot in https://github.com/fractal-analytics-platform/fractal-vizarr-viewer/issues/24#issuecomment-2361087061, do you know how many requests out of your 445 are loading images?

If it's 445 image-loading requests, this is unexpected (although it would be consistent with the 2000M memory use reported there, since 2000M/400~5M).

jluethi commented 2 months ago

It looks like there are about 400-ish requests to zarr chunks. The list has the same initial overhead plus 2 js files at the end, but the rest are zarr chunks.

Screenshot 2024-09-19 at 16 23 08 Screenshot 2024-09-19 at 16 24 58

I count 13 non-chunk requests and 432 chunk requests, as if the 216 chunks are all loaded twice somehow. But that's not something that appears on the smaller dataset.

For reference, I use https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/09 (the MIP version) to avoid confusion with the Z planes.
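
One way to verify the "loaded twice" suspicion is to export the request URLs from the browser's network tab and count how often each chunk path occurs. The sketch below is a hypothetical helper, not part of vizarr. It assumes zarr v2 metadata files are named .zarray, .zattrs, .zgroup or .zmetadata and treats every other URL as a chunk request, so unrelated requests (such as the js files mentioned above) would need to be filtered out first. The example URLs are made up.

```typescript
// File names treated as zarr v2 metadata rather than chunks.
const METADATA_FILES = [".zarray", ".zattrs", ".zgroup", ".zmetadata"];

function isMetadataRequest(url: string): boolean {
  const path = url.split("?")[0];
  const lastSegment = path.substring(path.lastIndexOf("/") + 1);
  return METADATA_FILES.indexOf(lastSegment) !== -1;
}

interface RequestSummary {
  metadataRequests: number;
  chunkRequests: number;
  uniqueChunks: number;
  repeatedChunks: string[]; // chunk URLs requested more than once
}

function summarizeRequests(urls: string[]): RequestSummary {
  const counts: Record<string, number> = {};
  let metadataRequests = 0;
  let chunkRequests = 0;
  for (const url of urls) {
    if (isMetadataRequest(url)) {
      metadataRequests += 1;
    } else {
      counts[url] = (counts[url] || 0) + 1;
      chunkRequests += 1;
    }
  }
  const unique = Object.keys(counts);
  return {
    metadataRequests,
    chunkRequests,
    uniqueChunks: unique.length,
    repeatedChunks: unique.filter((u) => counts[u] > 1),
  };
}

// Made-up example: one metadata file, and one chunk requested twice.
const summary = summarizeRequests([
  "https://example.org/plate.zarr/B/09/0/0/.zarray",
  "https://example.org/plate.zarr/B/09/0/0/0/0/0/0",
  "https://example.org/plate.zarr/B/09/0/0/0/0/0/1",
  "https://example.org/plate.zarr/B/09/0/0/0/0/0/0",
]);
```

If the 432 chunk requests reduce to 216 unique paths that each occur twice, the double loading is real. If all 432 paths are distinct, something else is being fetched, for example a second resolution level.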

tcompa commented 2 months ago

> As if the 216 chunks are all loaded twice somehow.

That's definitely something to understand better!


Overall, we have quite a bit of additional information now, so we can look for a public-dataset example and start a discussion over at vizarr.

jluethi commented 2 months ago

Will be great to look into it with different public datasets indeed! Unfortunately, our larger test dataset here isn't public yet.

To summarize issues we've highlighted here:

  1. Well loading always loads images at full resolution. This does not scale well for large wells, especially in scenarios where people save multiple fields of view of a well as a single image (see discussion in https://github.com/ome/ngff/pull/137)
  2. Apparently the vizarr viewer has some image width & height limit. Would be great to understand that one better
  3. When the viewer can't load an image due to width/height issues, would be useful to display the error
  4. Our large dataset appears to have some issues with potentially loading chunks multiple times? To be verified

Things that remain to be tested: How does the vizarr viewer handle wells with multiple images? (for us e.g. the multiplexing cases)

jluethi commented 2 months ago

OK, and for the record, this is what happens when a well has multiple images: vizarr tiles those images, i.e. places them next to each other.

For our multiplexing case, that's not what we want. Those images are different acquisitions, i.e. should be shown overlapped. But that's another topic.

Depending on how that tiling is done, maybe large wells (with many FOVs saved as separate OME-Zarr images) still run into the issue with well image size. Or maybe it's fine if they are loaded from separate images? Hard to test without knowing the width/height limits

Screenshot 2024-09-19 at 16 40 21

zonia3000 commented 1 month ago

Related: https://github.com/hms-dbmi/vizarr/issues/76

jluethi commented 1 month ago

Most relevant part being:

> But yes, this could certainly be improved to pick a suitable resolution based on the size and number of the Well images.

=> I think we see the limits of this design choice :)
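
A possible shape for such an improvement, sketched under stated assumptions: given the per-level sizes of one well image and the grid in which the well's images are tiled, choose the highest-resolution level at which the tiled well still fits a size budget. Everything here is hypothetical: the names LevelInfo and pickWellResolution, the maxSide budget, and the idea of using the WebGL texture limit as that budget. It is not vizarr's code, and the level sizes in the example assume 2x downsampling from the 19440 x 20480 array discussed above.

```typescript
interface LevelInfo {
  path: string; // dataset path from multiscales[0].datasets, e.g. "0"
  height: number; // y size of one image at this level, in pixels
  width: number; // x size of one image at this level, in pixels
}

// levels must be ordered from highest to lowest resolution.
// rows and cols describe how the well's images are tiled next to each other.
function pickWellResolution(
  levels: LevelInfo[],
  rows: number,
  cols: number,
  maxSide: number,
): string {
  if (levels.length === 0) {
    throw new Error("no resolution levels");
  }
  for (const level of levels) {
    if (level.height * rows <= maxSide && level.width * cols <= maxSide) {
      return level.path;
    }
  }
  // Nothing fits: fall back to the lowest resolution, as loadPlate does.
  return levels[levels.length - 1].path;
}

// Assumed pyramid: each level halves the previous one.
const levels: LevelInfo[] = [
  { path: "0", height: 19440, width: 20480 },
  { path: "1", height: 9720, width: 10240 },
  { path: "2", height: 4860, width: 5120 },
];

const singleImage = pickWellResolution(levels, 1, 1, 16384); // "1"
const twoByTwo = pickWellResolution(levels, 2, 2, 16384); // "2"
```

With a rule like this, wells made of many small images (around 2000x2000) would keep loading at full resolution, while a single stitched image per well would drop to a coarser level.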