napari / napari

napari: a fast, interactive, multi-dimensional image viewer for python
https://napari.org
BSD 3-Clause "New" or "Revised" License

Even lazier visualization of lazy ndarrays #318

Status: Open · opened by d-v-b 4 years ago

d-v-b commented 4 years ago

🚀 Feature

Lazily construct planar views of data

Motivation

The current handling of lazy ndarrays in napari is lazy across planes, but eager within a plane. When displaying data that is chunked in x and y axes, Napari blocks until the entire plane is available as a numpy array. For very large datasets this is less than ideal.

An alternative approach is to display each chunk as it arrives. Once this mechanism is in place, it becomes possible to do fancy stuff like defer loading / rendering chunks conditioned on the current field of view (i.e., don't load stuff the user isn't looking at), etc.

More concretely, I'm working with isotropic EM data where typical dataset dimensions are [thousands, thousands, thousands], and the data are isotropically chunked. Loading a single plane at full resolution can take a long time, but we avoid this problem by using bigdataviewer-derived tools for looking at the data. These tools render each chunk of the data independently, so images start out looking patchy and get filled in over time, and the program basically never blocks.

(Less concretely, I'm imagining using napari for visualizing stuff like lazily defined synthetic data or simulations that might have insane dimensions. Chunked rendering of single planes / views would be essential for this kind of thing)

sofroniewn commented 4 years ago

We've recently added some support for stuff like this in the Pyramid layer - see the examples/add_pyramid.py example - which will leverage a multiresolution list of arrays (could be from zarr files) to grab just the "tile" you need given the resolution and field of view of the camera. Right now it's only been tested for 2D, but it should be pretty easy to extend to nD (though still only looking at slices). You'll have to have the multiresolution pyramid precomputed. Is that ok?

Here's an example zooming in on a 100,000 x 200,000 pathology slide: [animated screenshot: image_pyramid9]

sofroniewn commented 4 years ago

I've extended our pyramids layer so it will work now with data that is [thousands, thousands, thousands] if you can precompute the image pyramids. Let me know if this is helpful or if you need something else

d-v-b commented 4 years ago

Thanks! I'll see how well this works for our data (some of our datasets have a precomputed pyramid). But even for data with a pyramid, it might still be helpful to render planar views in a lazy way.

I suspect that many very large datasets will not have a pyramid precomputed, and for these cases, lazy rendering of chunks within a planar view is the only way to have a non-blocking interface.

For example, try this:

import napari
import dask.array as da
napari.view(da.random.randint(0, 255, (100, 2000, 5000), dtype='uint8'))

On my 2013 macbook air, this blocks for ~15s for the view call and for ~2s every time I adjust the dimension slider. If the (massive) images were rendered chunk-by-chunk, then napari could be responsive the entire time, at the expense of showing me a patchy image.
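As a rough way to separate viewer overhead from the dask computation itself, one can time materializing a single plane of the same synthetic array outside napari (a sketch; timings will of course vary by machine):

```python
import time
import dask.array as da

# Same synthetic stack as the snippet above: 100 planes of 2000 x 5000 uint8.
arr = da.random.randint(0, 255, (100, 2000, 5000), dtype='uint8')

t0 = time.perf_counter()
plane = arr[0].compute()  # napari eagerly materializes a full plane per slider step
elapsed = time.perf_counter() - t0

print(plane.shape, f'{elapsed:.3f}s')
```

Whatever fraction of the stall remains after subtracting this is viewer-side work rather than the lazy computation.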

A funny benefit of the tiled rendering that I've noticed from my relatively recent experience using BigDataViewer -- I feel like I'm much more patient with the software because I can see it doing stuff. The tiled rendering acts like a progress bar and gives me a sense of anticipation as my big image gets put together.

sofroniewn commented 4 years ago

How long does view block for a simple 512x512 image? Unfortunately I suspect most of those first 15s are creating / styling the viewer.

For the ~2s that occurs every time the dims slider updates, I feel your pain - I just ran your example and the same happens to me. That comes from how we're using vispy in the Image layer, where we just send the data straight over to the graphics card, which can fit a single slice of up to 16,000 x 16,000. I'd have to give some thought to how we'd make this non-blocking so we can remain responsive. Maybe @royerloic has some ideas.

sofroniewn commented 4 years ago

I'll add that with the pyramid pre-computed this shouldn't be as much of a problem, since you can set a limit on how big each tile is - say 1600 x 1600, which is pretty fast to send over to the graphics card. But I do know that there is still some lag for some of our examples, and we should work to make that more performant too.

d-v-b commented 4 years ago

For the 512 x 512 image, napari blocks for ~1-2s, and scrolling seems to have ~100ms latency, which is totally usable.

At the end of the day it might be that my current problems are niche and I should just be working with pyramids anyway, but I thought raising the issue couldn't hurt :)

sofroniewn commented 4 years ago

Definitely an issue worth having on our radar, the post is much appreciated. If the precomputed image pyramids don't work out let us know too

constantinpape commented 4 years ago

I have recently started using napari for similar use-cases as @d-v-b, and I also think that lazy rendering of slices would significantly improve the user experience for large datasets. I brought up this and other points in #471, and @sofroniewn and I decided to move the discussion of this particular feature here.

Even when pre-computing pyramid levels, the load times when scrolling to new slices can be very noticeable. One of the main reasons for this is loading data from the underlying chunked file storage (h5py or zarr/n5): for datasets of the same size, the reactivity is much better when the dataset is kept in memory than when it is backed by chunked file storage, which confirms that loading the data from file is the bottleneck. This makes sense, because chunks usually have a rather large extent in the z-direction, which is necessary for efficient block-wise processing with 3d algorithms. Hence, the amount of data that needs to be loaded from file is much larger than the actual slice to be displayed.
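A back-of-the-envelope sketch of that read amplification (the 1024^3 volume and 64^3 chunks are illustrative numbers, not from any particular dataset):

```python
import dask.array as da

# Hypothetical volume: 1024^3 voxels with isotropic 64^3 chunks,
# a typical layout for block-wise 3d processing.
vol = da.zeros((1024, 1024, 1024), chunks=(64, 64, 64), dtype='uint8')

# A single z-slice only needs 1024 * 1024 voxels...
slice_bytes = 1024 * 1024 * vol.dtype.itemsize

# ...but every chunk it intersects spans 64 planes, so the storage layer
# must read the full chunks before the plane can be assembled.
touched_chunks = (1024 // 64) ** 2
read_bytes = touched_chunks * 64**3 * vol.dtype.itemsize

print(read_bytes // slice_bytes)  # 64x more data read than displayed
```

So for isotropic chunks the amplification factor is simply the chunk extent in z, which is why slice scrolling on file-backed data feels so much heavier than the on-screen pixel count suggests.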

If the tiles were rendered in a lazy (i.e. non-blocking) fashion, the viewer should behave much more smoothly in these use-cases. I would be interested in implementing this feature, with the caveat that my time until the end of August will be limited, because I need to finish up a project. @sofroniewn let me know what you think.

sofroniewn commented 4 years ago

@constantinpape this all sounds good - there's no rush on this feature from our end, so early September is fine, and starting with some smaller stuff / working on this in a test environment might be good too.

there are at least 3 things we'll want to understand

  1. what are the latencies right now around just updating image data. if you look at #465 it seems like our main path has become slower lately - possibly due to things like generating the thumbnail or some of the slicing - and bypassing some of these makes it faster.

  2. how much can be improved with caching. I was able to get things working quite nicely with the dask cache - see some plots on this https://github.com/napari/napari/issues/103#issuecomment-480640037 - but the zarr caching didn't work so well for me.

  3. what are the limitations around the size of the 2D array to be sent to the graphics card - right now scrolling with everything in memory will still be slow if the 2D array is too large. would we choose to break that bit into tiles ourselves too even if it wasn't naturally tiled?
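On point 3, a minimal sketch of what breaking an oversized slice into fixed-size tiles ourselves could look like (the `iter_tiles` helper is illustrative, not napari API; the 1600-pixel tile size echoes the limit mentioned earlier in the thread):

```python
import numpy as np

def iter_tiles(img, tile=1600):
    """Yield (row, col, tile_array) covering a 2D image with fixed-size tiles.

    Edge tiles are smaller; each tile is small enough to upload to the
    graphics card without a noticeable stall.
    """
    h, w = img.shape
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            yield r, c, img[r:r + tile, c:c + tile]

# A slice too large to send to the GPU in one piece, split into 9 tiles.
img = np.zeros((4000, 3500), dtype='uint8')
tiles = list(iter_tiles(img))
print(len(tiles))  # 3 rows x 3 cols = 9
```

Each tile could then be uploaded (and rendered) independently, which is exactly what would let napari stay responsive while the remaining tiles stream in.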

constantinpape commented 4 years ago
1. what are the latencies right now around just updating image data. if you look at #465 it seems like our main path has become slower lately - possibly due to things like generating the thumbnail or some of the slicing - and bypassing some of these makes it faster.

Do you have any benchmarks for the viewer performance?

2. how much can be improved with caching. I was able to get things working quite nicely with the dask cache - see some plots on this [#103 (comment)](https://github.com/napari/napari/issues/103#issuecomment-480640037) - but the zarr caching didn't work so well for me.

Thanks for the pointer. I will have a look at it. Also, I am trying out some caching things myself and I will let you know if anything interesting comes up.

3. what are the limitations around the size of the 2D array to be sent to the graphics card - right now scrolling with everything in memory will still be slow if the 2D array is too large. would we choose to break that bit into tiles ourselves too even if it wasn't naturally tiled?

Definitely sounds like something worth exploring. Also sounds like a good first issue to work on. Again, for these things some meaningful benchmarks would be great.

sofroniewn commented 4 years ago

No viewer benchmarks yet - we should definitely incorporate some into our CI. It's something we've talked about as we've been adding features, to ensure we don't have too much performance regression, but we haven't gotten around to it. I think vispy has implemented something to do this on their end, which we should check out. Definitely worth doing before making any serious push on performance.

royerloic commented 4 years ago

I agree that we need smarter and better lazy rendering; caching and tiled rendering would be great to have more generally. But we first need to understand where the bottlenecks are.

I have been using napari to visualise light-sheet data with dask, and indeed it is pretty slow in the absence of caching... and for some reason moving some sliders is slower than others. Very strange. Once we get a better understanding of the bottlenecks we should be able to improve the situation.

For caching, one issue is that we can't tell if a cache entry is invalid, so the assumption is that we are looking at an array that is considered immutable -- although it might not be. That's probably not a big deal, but I can foresee some scenarios in which the array is undergoing 'on-the-fly' processing that one would want to 'see'.
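As a toy illustration of that immutability assumption (all names here are made up, not napari or dask code): a cache keyed on array identity happily returns stale data once the array is mutated in place.

```python
import numpy as np

cache = {}

def get_plane(arr, z):
    """Return plane z of arr, caching by array identity (assumes immutability)."""
    key = (id(arr), z)
    if key not in cache:
        cache[key] = arr[z].copy()
    return cache[key]

a = np.zeros((4, 4, 4))
first = get_plane(a, 0)   # populates the cache with all-zero data
a[0] = 1                  # 'on-the-fly' processing mutates the array in place...
stale = get_plane(a, 0)
print(stale.max())        # ...but the cache still serves the old plane: 0.0
```

Content-based keys would fix this but cost a hash of the data per lookup, which is why immutability is the pragmatic assumption.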

Exciting to have @d-v-b and @constantinpape pitch in and join the napari party on this topic :-)

constantinpape commented 4 years ago

Ok, I will start looking into tiling once I have some more time. Any pointers to the general relevant area of the codebase are very welcome ;).

No viewer benchmarks yet - we should definitely incorporate some into our CI. It's something we've talked about as we've been adding features, to ensure we don't have too much performance regression, but we haven't gotten around to it.

Is there an issue about this yet? Should I make one?

sofroniewn commented 4 years ago

@constantinpape there's no issue about benchmarks - do you want to make one and we can start discussing them there. @bryantChhun may have some experience too. I think that's probably the first PR to make and then we can work on the tiling.

For the tiling one of the key lines in the codebase will be where we slice the data array and then pass it over to vispy for the rendering here - https://github.com/napari/napari/blob/38f70f98249c57fc9739e4436941d9916971e9b1/napari/layers/image/image.py#L282

constantinpape commented 4 years ago

@constantinpape there's no issue about benchmarks - do you want to make one and we can start discussing them there.

See #476.

For the tiling one of the key lines in the codebase will be where we slice the data array and then pass it over to vispy for the rendering here -

Thanks for the pointer!

jni commented 4 years ago

Just to chime in here again: I actually just ran into this issue while trying the following example on a pyramid view, where one of the layers is a delayed processing result:

import numpy as np
import dask.array as da
from dask.cache import Cache
from dask_image.ndfilters import gaussian_filter
from napari import Viewer, gui_qt

cache = Cache(2e9)  # Leverage two gigabytes of memory
cache.register()

pyramid = [da.random.random((2**17 // 2**i, 2**18 // 2**i), chunks=(256, 256))
           for i in range(10)]  # integer division keeps the shapes valid
pyramid_sigma = [gaussian_filter(arr, 5 / 2**i)
                 for i, arr in enumerate(pyramid)]

with gui_qt():
    # create an empty viewer
    viewer = Viewer()

    # add the pyramid
    layer = viewer.add_image(pyramid, name='slide', is_pyramid=True,
                             contrast_limits=[0, 1])
    layer_gauss = viewer.add_image(pyramid_sigma, name='blurred',
                                   is_pyramid=True,
                                   contrast_limits=[0, 1])

I think this sort of thing should work in general. Right now though, if you zoom right in, panning works until you find a block boundary, and then the viewer blocks until the whole thing is loaded. It would be much nicer if you could continue panning and get a black background as the blocks are loaded. I think this might actually not be too terrible to work out.

As a side note, after enough panning I got this error:

WARNING: could not determine DPI
WARNING:vispy:could not determine DPI
Killed

which is concerning to say the least! Is the fallback to read the whole array in?

eburling commented 4 years ago

This is reminiscent of #578. Love the idea of smooth panning and incremental loading, especially if loading of invisible layers is ignored.

jakirkham commented 4 years ago

We may have already discussed this somewhere (maybe offline?), but have you explored Dask Array's coarsen? That might be useful for creating pyramids on the fly.
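For reference, a minimal sketch of building pyramid levels on the fly with `da.coarsen` (the shapes, the 2x2 mean reduction, and the stopping size are arbitrary choices for illustration):

```python
import numpy as np
import dask.array as da

# Each level averages 2x2 blocks of the previous one, lazily - nothing
# is computed until a level's tiles are actually requested.
base = da.random.random((1024, 2048), chunks=(256, 256))
pyramid = [base]
while min(pyramid[-1].shape) > 256:
    pyramid.append(da.coarsen(np.mean, pyramid[-1], {0: 2, 1: 2}))

print([level.shape for level in pyramid])
```

The catch, as noted in the reply below, is that rendering a coarse level still pulls every underlying tile of the base array through the reduction, so for file-backed data this can be slower than reading a precomputed pyramid.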

sofroniewn commented 4 years ago

@jakirkham I should try again - the first time I tried, I still had to load too many tiles from the bottom layer to ever build the top layer, and it was too slow.

liu-ziyang commented 2 years ago

Hi there, I would like to close this issue to consolidate our recent efforts on the async implementation plan. Thanks! (If you would like to know more about this plan, we are aggregating it here as a starting point: https://github.com/orgs/napari/projects/16)

imagesc-bot commented 2 weeks ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/napari-crashes-with-lazy-segmentation/96429/1