Open tcompa opened 2 months ago
To have a concrete example (within the UZH network & with Fractal authentication): We can load https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr/B/09/0
But https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr/B/09 does not appear to load.
It looks like it is trying to load something but is overloaded. For the second link I see more than 400 requests, each larger than 5 MB.
Do you expect that it loads so many items?
5 MB is the size of a single image (roughly), i.e. a single zarr chunk
the full-resolution array has this .zarray:
{
"chunks": [
1,
1,
2160,
2560
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dimension_separator": "/",
"dtype": "<u2",
"fill_value": 0,
"filters": null,
"order": "C",
"shape": [
3,
19,
19440,
20480
],
"zarr_format": 2
}
which suggests about 7000 chunks. The actual number of chunks on disk is 4104, likely because some chunks are empty
loading 4000 files of up to 5M each won't really work, I guess
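The chunk counts discussed here can be cross-checked directly from the `shape` and `chunks` fields of the `.zarray`. The following is a minimal sketch; the helper names `chunkGrid` and `countChunks` are made up for illustration and are not part of vizarr or any Zarr library. With the values above it gives 3 × 19 × 9 × 8 = 4104 chunks for the full array, which coincides with the on-disk count (so the estimate of about 7000 looks too high), and 3 × 9 × 8 = 216 chunks when only one Z plane is considered.

```typescript
// Number of chunks along each axis: ceil(shape / chunk size).
function chunkGrid(shape: number[], chunks: number[]): number[] {
  if (shape.length !== chunks.length) {
    throw new Error("shape and chunks must have the same number of dimensions");
  }
  return shape.map((size, axis) => Math.ceil(size / chunks[axis]));
}

// Total number of chunks. Axes listed in `fixedAxes` contribute a single index
// (for example, when only one Z plane is viewed).
function countChunks(grid: number[], fixedAxes: number[] = []): number {
  return grid.reduce(
    (total, n, axis) => total * (fixedAxes.indexOf(axis) >= 0 ? 1 : n),
    1
  );
}

// Values copied from the .zarray above. The axis order (channel, Z, Y, X)
// is an assumption based on the discussion in this thread.
const arrayShape = [3, 19, 19440, 20480];
const chunkShape = [1, 1, 2160, 2560];

const grid = chunkGrid(arrayShape, chunkShape); // [3, 19, 9, 8]
const totalChunks = countChunks(grid); // 4104
const singlePlaneChunks = countChunks(grid, [1]); // 216 (one Z plane, all channels)
```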
This seems to be a resolution issue.
If we use a smaller dataset, vizarr does load the well:
the question then becomes whether vizarr can load a well at a given resolution level
400 requests, each one having more than 5MB.
Ah, it looks like vizarr doesn't have multi-resolution support for wells then. It also doesn't have this on the plate level. But on the plate level, it defaults to loading the lowest resolution. Apparently it defaults to the highest resolution on the well level... That's an interesting choice.
On the image level, it dynamically loads the best resolution for the current zoom level
But on the plate level, it defaults to loading the lowest resolution. Apparently it defaults to the highest resolution on the well level
Yes, this is confirmed by looking at https://github.com/hms-dbmi/vizarr/blob/main/src/ome.ts:
// in loadPlate:
// Lowest resolution is the 'path' of the last 'dataset' from the first multiscales
const { datasets } = imgAttrs.multiscales[0];
const resolution = datasets[datasets.length - 1].path;
// in loadWell
utils.assert(utils.isMultiscales(imgAttrs), "Path for image is not valid.");
let resolution = imgAttrs.multiscales[0].datasets[0].path;
it looks like vizarr doesn't have multi-resolution support for wells then.
This is not fully clear yet - more on this later.
Why would it request 400 chunks?
We have an array of shape [3, 19, 19440, 20480]. The second dimension is Z. We only load a single Z plane, but all the channels (3) & all the xy (=> 9x8 chunks).
Therefore, I'd expect 9 × 8 × 3 chunks to get loaded => 216 chunks
Why would it request 400 chunks?
When I took the screenshot, the application hadn't finished loading the page yet. It was still creating more requests.
I let it run for a while on my end with the network console open. It made 445 requests. But even when all of them had finished after 1.1 min, it did not display anything
Could be related to these warnings in the console though:
WebGL: INVALID_VALUE: texImage2D: width or height out of range
My hypothesis: Similar to the napari limit for max image size, vizarr has a limit like this through WebGL. Instead of downsampling the image (like napari), it just doesn't show anything in that case.
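If this hypothesis holds, a quick way to reason about it is to compare the well's pixel extent with the maximum texture size. The sketch below is only an illustration: `fitsInTexture` and `requiredDownsampling` are hypothetical helpers, not vizarr functions, and the limit of 16384 is an assumed example value. The real limit depends on the device and browser and can be read with `gl.getParameter(gl.MAX_TEXTURE_SIZE)` on a WebGL context.

```typescript
// Hypothetical helper (not part of vizarr): does a 2D image fit in one texture?
function fitsInTexture(width: number, height: number, maxTextureSize: number): boolean {
  return width <= maxTextureSize && height <= maxTextureSize;
}

// Smallest power-of-two downsampling factor for which the image fits.
function requiredDownsampling(width: number, height: number, maxTextureSize: number): number {
  if (maxTextureSize < 1) {
    throw new Error("maxTextureSize must be at least 1");
  }
  let factor = 1;
  while (
    !fitsInTexture(Math.ceil(width / factor), Math.ceil(height / factor), maxTextureSize)
  ) {
    factor *= 2;
  }
  return factor;
}

// ASSUMPTION: 16384 is only an example limit, the real value is device dependent.
const assumedMaxTextureSize = 16384;
const wellWidth = 20480; // X extent from the .zarray above
const wellHeight = 19440; // Y extent from the .zarray above

const fitsAtFullRes = fitsInTexture(wellWidth, wellHeight, assumedMaxTextureSize); // false
const neededFactor = requiredDownsampling(wellWidth, wellHeight, assumedMaxTextureSize); // 2
```

Under the assumed limit, the full-resolution well would not fit, while a level downsampled by a factor of 2 would.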
Thus, what happens on the well case is:
The texImage2D warning does not show up for smaller wells like the one from Tommaso above
Re: number of requests
I confirm that there exist 216 chunk files for the Z plane with index 0 on disk - for the B/09 well. (somewhere above I forgot to consider that we are not loading all Z planes at once)
But note that the number of requests is higher than the number of chunks, as not every single request is to load a chunk. For instance, an example with 6 chunks to load leads to 18 requests:
It loads the full resolution chunks
This is what we should understand better. It clearly does so (as in resolution = imgAttrs.multiscales[0].datasets[0].path), but it's unclear whether it's a minor bug that we could fix by setting it to something else, or whether it is done this way for some other reason.
But note that the number of requests is higher than the number of chunks, as not every single request is to load a chunk.
Yes. But that example contains 6 requests for actual chunks, the other 12 are for .zarray and other stuff
This is what we should understand better. It clearly does so (as in resolution = imgAttrs.multiscales[0].datasets[0].path), but it's unclear whether it's a minor bug that we could fix by setting it to something else, or whether it is done this way for some other reason.
I suspect it's just a current design limitation, e.g. the viewer was made to work on example data by loading the lowest resolution for plates and the highest resolution for wells.
I need to explore how it handles the multiple-images-per-well case. But stitching the images in a well into one big image is not super typical. We do it, FAIM at FMI does it, the Allen Cell people do it. But many public datasets still save each well as many images of approximately 2000x2000 => in that case, maybe it's fine to load at highest res, even if it's not very performant?
For the screenshot in https://github.com/fractal-analytics-platform/fractal-vizarr-viewer/issues/24#issuecomment-2361087061, do you know how many requests out of your 445 are loading images?
If it's 445 image-loading requests, this is unexpected (although it would be consistent with the 2000M memory use reported there, since 2000M/400~5M).
It looks like there are about 400-ish requests to zarr chunks. The list has the same initial overhead plus 2 js files at the end, but the rest are zarr chunks.
I count 13 non-chunk requests and 432 chunk requests. As if the 216 chunks are all loaded twice somehow. But that's not something that appears on the smaller dataset
For reference, I use https://fractal-bvc.mls.uzh.ch/vizarr/?source=https://fractal-bvc.mls.uzh.ch/vizarr/data/shares/prbvc.biovision.uzh/joel_testing/20240723_23_well_plate/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/09 (the MIP version) to avoid confusion with the Z planes
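To make this kind of count reproducible, the request URLs (for example, copied from the browser's network panel) could be split into chunk and non-chunk requests, with repeated chunk URLs flagged. The sketch below is a rough heuristic rather than a definitive classifier: it assumes Zarr v2 keys whose last path segment is purely numeric, and `isChunkRequest` and `summarizeRequests` are hypothetical names. The `example.org` URLs are placeholders, not real data.

```typescript
// Heuristic (an assumption of this sketch): a Zarr v2 chunk key ends in a
// purely numeric segment such as ".../0/0/3/4" or ".../0.0.3.4", whereas
// metadata requests end in ".zarray", ".zattrs" or ".zgroup".
function isChunkRequest(url: string): boolean {
  const path = url.split("?")[0];
  const lastSegment = path.substring(path.lastIndexOf("/") + 1);
  return /^\d+(\.\d+)*$/.test(lastSegment);
}

interface RequestSummary {
  chunkRequests: number; // all requests classified as chunk loads
  otherRequests: number; // metadata, scripts, and anything else
  uniqueChunks: number; // distinct chunk URLs
  repeatedChunks: string[]; // chunk URLs requested more than once
}

function summarizeRequests(urls: string[]): RequestSummary {
  const counts: { [url: string]: number } = {};
  let otherRequests = 0;
  for (const url of urls) {
    if (isChunkRequest(url)) {
      counts[url] = (counts[url] || 0) + 1;
    } else {
      otherRequests += 1;
    }
  }
  const unique = Object.keys(counts);
  return {
    chunkRequests: urls.length - otherRequests,
    otherRequests: otherRequests,
    uniqueChunks: unique.length,
    repeatedChunks: unique.filter((url) => counts[url] > 1),
  };
}

// Placeholder URLs for illustration only.
const exampleUrls = [
  "https://example.org/plate.zarr/B/09/.zattrs",
  "https://example.org/plate.zarr/B/09/0/0/.zarray",
  "https://example.org/plate.zarr/B/09/0/0/0/0/0/0",
  "https://example.org/plate.zarr/B/09/0/0/0/0/0/1",
  "https://example.org/plate.zarr/B/09/0/0/0/0/0/0", // same chunk requested again
];
const requestSummary = summarizeRequests(exampleUrls);
```

If every chunk really is loaded twice, `chunkRequests` would be twice `uniqueChunks` and `repeatedChunks` would list all of them.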
As if the 216 chunks are all loaded twice somehow.
That's definitely something to understand better!
Overall, we have quite some additional information now, so that we can look for a public-dataset example and start some discussion over at vizarr.
Will be great to look into it with different public datasets indeed! Unfortunately, our larger test dataset here isn't public yet.
To summarize issues we've highlighted here:
Things that remain to be tested: How does the vizarr viewer handle wells with multiple images? (for us e.g. the multiplexing cases)
OK, and for the record, this is what happens when a well has multiple images: vizarr tiles those images, i.e. places them next to each other.
For our multiplexing case, that's not what we want. Those images are different acquisitions, i.e. should be shown overlapped. But that's another topic.
Depending on how that tiling is done, maybe large wells (with many FOVs saved as separate OME-Zarr images) still run into the issue with well image size. Or maybe it's fine if they are loaded from separate images? Hard to test without knowing the width/height limits
Most relevant part being:
But yes, this could certainly be improved to pick a suitable resolution based on the size and number of the Well images.
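One way such a choice could look is sketched below, purely as an illustration of the idea and not as vizarr's implementation. `pickWellResolution` and `LevelInfo` are hypothetical names, the per-level sizes assume a factor-of-2 pyramid (the thread does not state the actual pyramid), and the size limit of 16384 pixels per side is an assumed example value.

```typescript
interface LevelInfo {
  path: string; // dataset path within the image, e.g. "0", "1", ...
  width: number; // X extent of one image at this level, in pixels
  height: number; // Y extent of one image at this level, in pixels
}

// Hypothetical helper, not vizarr's API: pick the finest level at which the
// tiled well (rows x cols images) stays within maxSize pixels per side.
// `levels` is expected finest-first, as in the multiscales `datasets` list.
function pickWellResolution(
  levels: LevelInfo[],
  rows: number,
  cols: number,
  maxSize: number
): string {
  if (levels.length === 0) {
    throw new Error("no resolution levels given");
  }
  for (const level of levels) {
    if (level.width * cols <= maxSize && level.height * rows <= maxSize) {
      return level.path;
    }
  }
  // Nothing fits: fall back to the coarsest level, as loadPlate does.
  return levels[levels.length - 1].path;
}

// ASSUMPTION: a factor-of-2 pyramid for the stitched B/09 well image.
const stitchedLevels: LevelInfo[] = [
  { path: "0", width: 20480, height: 19440 },
  { path: "1", width: 10240, height: 9720 },
  { path: "2", width: 5120, height: 4860 },
  { path: "3", width: 2560, height: 2430 },
];

// One stitched image per well, assumed limit of 16384 pixels per side.
const stitchedChoice = pickWellResolution(stitchedLevels, 1, 1, 16384); // "1"
```

The same function also covers wells stored as many separate images: passing the grid dimensions as `rows` and `cols` makes the choice depend on both the size and the number of images.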
=> I think we see the limits of this design choice :)
We currently open ome-zarr images, and it works. When we go one level up (ome-zarr well), it does not work. We should reproduce this issue locally and identify where the problem comes from, and we should open the issue on vizarr once we are able to observe it on some publicly-available dataset.