silx-kit / h5web

React components for data visualization and exploration
https://h5web.panosc.eu/
MIT License

Don't use debounce on slice slider (update as fast as possible) #1578

Open bmaranville opened 4 months ago

bmaranville commented 4 months ago

Is your feature request related to a problem?

When using the slicing slider (in DimensionMapper) for heatmaps, the plot is not updated smoothly when the slider is moved, even when using the h5wasm provider (which should allow very fast updates).

The debounce that is applied is preventing updates to the plot.

Requested solution or feature

Instead of debouncing the slider signal, I think a different approach is merited here and would provide huge gains in responsiveness:

  1. add a separate, independent requestAnimationFrame loop that checks whether the heatmap needs to be redrawn
    • if the "needs redraw" flag is true and the "drawing" flag is false, set "drawing" = true, then:
      • set "needs redraw" = false
      • start a redraw with the current slider value (the redraw operation should do the data slicing, which could be slow if coming from a server)
      • when the fetch + redraw is complete, set "drawing" = false and continue the animation loop
    • if "needs redraw" is false, do nothing and continue the animation loop.
  2. when the slider is moved, set the "needs redraw" flag to true.

The advantage of this system over debounce is that it updates as fast as the (data slice + plot) operation can be carried out, rather than arbitrarily delaying execution of the slice by the debounce timers. There is natural rate-limiting of slice requests from e.g. H5Grove, because no more than one request is issued at a time. Providers with very fast slice turnarounds (e.g. h5wasm) will be able to update nearly instantly (smooth visual scrolling through datasets!).
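A minimal sketch of this loop in plain TypeScript, where `redrawSlice` is a hypothetical async function standing in for the provider's slice + render step (not actual h5web code):

```ts
// Hypothetical async redraw: fetches the slice at `index` and renders it.
declare function redrawSlice(index: number): Promise<void>;

let needsRedraw = false;
let drawing = false;
let sliderValue = 0;

// Step 2: the slider handler only records the new value and sets the flag.
function onSliderMove(value: number): void {
  sliderValue = value;
  needsRedraw = true;
}

// Step 1: the independent requestAnimationFrame loop.
function loop(): void {
  if (needsRedraw && !drawing) {
    drawing = true;
    needsRedraw = false;
    redrawSlice(sliderValue) // may be slow if the data comes from a server
      .finally(() => {
        drawing = false;
      });
  }
  requestAnimationFrame(loop); // continue the animation loop either way
}

requestAnimationFrame(loop);
```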


Additional context

I have implemented this type of buffering on many occasions for smooth updates of visualizations from user inputs. I'm happy to help with this.

I couldn't find a nice requestAnimationFrame equivalent to the react-hookz/web debounce tools being used, so it might have to be coded by hand rather than using a library.

bmaranville commented 4 months ago

Here is a demo with debouncing turned off:

https://github.com/silx-kit/h5web/assets/686570/4d791ea8-c83f-49c7-a1e5-a16caccbb209

loichuder commented 4 months ago

Great call @bmaranville, I wholly agree.

The debounce indeed only makes sense when using providers that make requests for each slice such as H5Grove or HSDS. It only gets in the way of responsiveness when using the H5Wasm provider.

axelboc commented 4 months ago

I agree that the debouncing is not ideal. Our main goal at the time was to avoid spamming h5grove with concurrent requests while keeping the slider responsive. I think this was before h5wasm came along.

If I'm not mistaken, the solution you suggest, while logical for h5wasm, has the major downside that it would slow down slicing with h5grove: when the slider moves from index 0 to 1 and then to index 2, the slice for index 1 would have to finish fetching before the slice for index 2 could start being fetched.

Instead, I think the debouncing needs to happen at the fetching level, within the providers themselves, perhaps with automatic cancellation of stale requests as well. We ruled this out at the time because it seemed to require significant refactoring, including perhaps replacing our current suspense-based fetching library, react-suspense-fetch.

I've been meaning to switch to a more modern suspense fetching library for a long time. This sort of problem is typically what those libraries are for, so it seems a shame to reinvent the wheel.

I would also like to take better advantage of React's concurrent features (useTransition, useDeferredValue, etc.), for instance to avoid showing the Suspense loading fallback while fetching new slices. After all, it is React's job to keep the UI responsive, so we shouldn't call requestAnimationFrame ourselves if we can avoid it.
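For illustration, a rough sketch of how useDeferredValue can keep the previous slice on screen while the next one suspends; `HeatmapForSlice` is a hypothetical suspending component, not part of h5web's actual API:

```tsx
import { useDeferredValue } from 'react';

// Hypothetical component that suspends while its slice is being fetched.
declare function HeatmapForSlice(props: { index: number }): JSX.Element;

function SlicedHeatmap({ sliceIndex }: { sliceIndex: number }) {
  // While the render for the new index suspends, React keeps showing
  // the previous slice instead of the Suspense loading fallback.
  const deferredIndex = useDeferredValue(sliceIndex);
  const isStale = deferredIndex !== sliceIndex;

  return (
    <div style={{ opacity: isStale ? 0.6 : 1 }}>
      <HeatmapForSlice index={deferredIndex} />
    </div>
  );
}
```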

Obviously, this refactoring of the providers and fetching library is not going to happen next week, so in the meantime, if you can think of another workaround that accelerates h5wasm without slowing down h5grove, please don't hesitate to share it!

bmaranville commented 4 months ago

The debouncing algorithm is designed only to prevent an event from being fired too frequently - there is no affordance in the algorithm for trying to make it as responsive as possible.

I would argue that even for h5grove, you would be better off not using debouncing. In the scenario proposed above, where a user has moved the slider from 0 to 1 to 2, there would be no benefit to the user from starting a fetch of slice 2 if the fetch/draw of slice 1 was not complete - particularly if they then move the slider to position 3 (or 15) within the same motion. It is not a good idea to stack slice fetches, even if there is some throttling done.

Unless you are going to do predictive fetching (which I am not recommending at all), I believe the way to keep the UI as lag-free as possible is to retrieve slices and plot them as soon as you can after the previous render is complete. If the plotting action in the client is taking a long time (which doesn't seem to be the case), one could make the case that fetching as soon as the previous fetch is complete makes more sense than waiting until (fetch and draw) is complete.

If there is a React animation system that exists for doing this kind of update optimization, I agree it would be easier to use that! But this type of animation doesn't mesh well with a system in which the rendering (which in this case includes fetching or calling a slice function) is decoupled, as a reactive element, from the user input. You really need feedback on when the rendering is done to support a smooth update.

bmaranville commented 4 months ago

If you don't want to use requestAnimationFrame in a loop (which might not be a good fit for the architecture of h5web), what about this alternative?

  1. When the slider is moved, create a callback function that is passed all the way down to the plot renderer; every time the plot finishes rendering, it calls the callback.
  2. When the callback is called, the state of the slider component is compared to the state at the time the last slice was initiated; if it differs, a new slice/draw is started.
  3. (This then fires a new callback, and this repeat of steps 1 and 2 continues until the user stops moving the slider.)

This type of logic might be more compatible with a deeply nested reactive system, and doesn't require an animation loop.
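A rough sketch of that logic, with all names hypothetical (`startSliceAndDraw` stands in for the provider's slice + render path):

```ts
// Hypothetical function that slices + draws, then invokes the callback
// passed down to the plot renderer once rendering is complete.
declare function startSliceAndDraw(index: number, onDone: () => void): void;

let currentSliderIndex = 0; // updated by the slider's onChange
let initiatedIndex = -1;    // slider state when the last slice/draw started
let busy = false;           // at most one slice/draw in flight

function onSliderMove(index: number): void {
  currentSliderIndex = index;
  maybeStart(); // no-op if a slice/draw is already in flight
}

// The callback fired by the plot renderer when rendering finishes.
function onRenderComplete(): void {
  busy = false;
  // Compare current slider state to the state when the last slice started;
  // if it differs, start a new slice/draw (repeats until the slider stops).
  if (currentSliderIndex !== initiatedIndex) {
    maybeStart();
  }
}

function maybeStart(): void {
  if (busy) return;
  busy = true;
  initiatedIndex = currentSliderIndex;
  startSliceAndDraw(initiatedIndex, onRenderComplete);
}
```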

axelboc commented 4 months ago

> In the scenario proposed above, where a user has moved the slider from 0 to 1 to 2, there would be no benefit to the user from starting a fetch of slice 2 if the fetch/draw of slice 1 was not complete - particularly if they then move the slider to position 3 (or 15) within the same motion. It is not a good idea to stack slice fetches, even if there is some throttling done.

You seem to assume that, when a user goes from slice 0 to slice N, they want to see a video-like visualization of all slices between 0 and N. This is not necessarily the case.

At the ESRF, many beamlines generate HDF5 files containing stacks of tens or even hundreds of 16M-pixel images. One use case we often encounter is being able to quickly check one in every 10/100/... images to find where interesting data starts to appear. Given the size of the data, and that maybe it is being accessed remotely through a slow connection, it would not be practical to fetch 100 slices (and even worse, wait for them all to finish fetching) before the last one could be visualized... not to mention that it would be a huge waste of network and server resources.

Even with h5wasm, this scenario would be painful UX-wise, as the visualization would lag behind the slider due to the sheer size of the data and the computations required (domain computation, typed array conversions, etc.). Even worse: given that we currently don't have any mechanism to free up cached resources, the computer's memory would quickly clog up and the browser would soon start to struggle (assuming a future WORKERFS-based implementation of the h5wasm provider that doesn't require the entire file to be loaded into memory).

That being said, I hear you: there is a need for a video-like behaviour for reasonably-sized datasets. Debouncing the slicing state is a solution that does not account for this use case and is more focused on keeping the viewer reliable regardless of dataset size and provider performance.

At some point, we actually considered adding a way for users to ask explicitly for an entire dataset to be fetched in order to get instantaneous slicing: the idea was to then disable debouncing on the slider as soon as all the slices are available in memory. Perhaps this is the way to go? We'd just have to think of a good UI.

The advantage of this solution is that it becomes the user's responsibility to decide whether the dataset can be fetched in a reasonable amount of time for them and held in memory on their machine — and if they change their mind, they can always cancel the request.

bmaranville commented 4 months ago

I agree with you! Sorry if that isn't clear from my earlier comments. I don't think you should fetch every slice as you move the slider. The algorithm I'm suggesting does not do that.

Here is an example: a user has some data with 100 big images, and let's say it takes 200 ms to get a slice from the backend they are using. The user moves the slider smoothly from slice 0 to slice 100 in 2 seconds (50 ticks moved per second, so 20 ms per tick). At 200 ms per fetch, roughly 10 of the 100 slices get fetched and shown during the motion, i.e. about one in every ten.

If the backend is slower than that, it will fetch fewer frames during the motion; if it is faster, it will fetch more. It never fetches any frame other than the one it is about to show. At the end of the motion, it always fetches and shows the frame the user stopped at. The largest possible delay to the user seeing the frame they stopped on is the time it takes to fetch and render the one frame preceding the end of the motion.

axelboc commented 4 months ago

Right, I get you now, but there's still something I don't quite understand. Let's imagine we're dealing with much slower load times, say 10s per slice. You're on slice 0 and it finished fetching/rendering.

Now you click and hold the slider thumb, move it to slice 2 (in a negligible amount of time) and hold it there. As per your example, this triggers a fetch for slice 1, waits 10s for slice 1 to finish loading, renders slice 1, then looks at the slider value, triggers a fetch for slice 2, waits for slice 2 to finish loading, and finally renders slice 2. Result: slice 2 appears after 20s, when it could have appeared after 10s + debounce time with the current implementation.

It's the "At the end of the motion it always fetches and shows the frame the user stopped at." that I don't understand. If you disallow concurrent fetches and don't debounce the interaction, how can you guarantee that the slice the user eventually stops on gets fetched as quickly as possible?

axelboc commented 4 months ago

Right, it's what you say in your last sentence: "The largest possible delay to the user seeing the frame they stopped on is the time it takes to fetch and render the one frame preceding the end of the motion."

Unfortunately, to me, this is not an acceptable compromise when dealing with very large images and network-based providers.

It doesn't mean that we can't find a middle ground somewhere, though.

bmaranville commented 4 months ago

Ok, how about one more addition to the algorithm: The stop event (onAfterChange?) on the slider cancels any previous (fetch+display) and starts a new one at the stop value. This would eliminate the extra frame lag on the client.

It does add complexity in that the providers' slice functions would have to be modified so they can be aborted (unless that is already done).

You could also just start a new fetch on the stop event, and ignore the result of the previous fetch instead of canceling it, but this would mean up to 2 fetches could be active on the network at the same time, if that is a resource concern.
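For illustration, a sketch of the cancellation variant using the standard AbortController API; `fetchSlice` and `render` are hypothetical stand-ins for the provider and plot code:

```ts
// Hypothetical provider call that forwards the signal to fetch().
declare function fetchSlice(index: number, signal: AbortSignal): Promise<Float32Array>;
declare function render(data: Float32Array): void;

let controller: AbortController | null = null;

// Called from the slider's stop event (onAfterChange).
async function onSliderStop(finalIndex: number): Promise<void> {
  controller?.abort(); // cancel any in-flight fetch for an intermediate slice
  controller = new AbortController();
  try {
    const data = await fetchSlice(finalIndex, controller.signal);
    render(data);
  } catch (err) {
    if ((err as Error).name !== 'AbortError') {
      throw err; // aborted fetches are expected; rethrow anything else
    }
  }
}
```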

axelboc commented 4 months ago

Yeah, the only downside is that you have to actually release the mouse pointer for onAfterChange to be triggered. To be honest, I think we need a combination of all the above:

  1. trigger as many consecutive fetches as the provider can handle while slicing;
  2. trigger a fetch right away when onAfterChange is called;
  3. trigger a fetch when the user holds the thumb on a given slice for some time (say 250ms).

For 2 and 3, we could maybe cancel any other ongoing fetch, though I'm not sure it would make a difference; we may actually prefer to see the result of this ongoing fetch until the final fetch completes.
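A rough sketch of how those three triggers could combine; the interval values and `requestSlice` are assumptions for illustration, not h5web code:

```ts
declare function requestSlice(index: number): void; // hypothetical fetch trigger

const THROTTLE_MS = 100; // assumed throttle interval for trigger 1
const HOLD_MS = 250;     // trigger 3: thumb held on one slice

let lastFetchTime = 0;
let holdTimer: ReturnType<typeof setTimeout> | undefined;

// Fired continuously while the user drags the slider.
function onChange(index: number): void {
  const now = Date.now();
  if (now - lastFetchTime >= THROTTLE_MS) {
    lastFetchTime = now;
    requestSlice(index); // trigger 1: throttled fetches while slicing
  }
  clearTimeout(holdTimer);
  holdTimer = setTimeout(() => requestSlice(index), HOLD_MS); // trigger 3
}

// Fired when the user releases the slider thumb.
function onAfterChange(index: number): void {
  clearTimeout(holdTimer);
  requestSlice(index); // trigger 2: fetch right away on release
}
```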

axelboc commented 4 months ago

Attempt #1583

Results

Small dataset

Try, for instance, the /entry_0000/3_time_average/results/intensity_error dataset in the water_224.h5 demo file with the Line visualization, with h5grove or h5wasm.

Very satisfactory results with both providers. Slicing is very smooth; the visualization updates very quickly throughout the slicing motion. You can barely tell that the change events are throttled, yet the throttling does limit the number of requests sent to h5grove. Requests do get backed up from time to time with h5grove, but given the small size of each slice and the responsiveness of the server, they do not stay backed up for long; regardless, the more slices are cached, the smoother the slicing becomes.

Very large dataset

You can find an open dataset for testing with h5wasm on the ESRF Data Portal. Log in as anonymous, expand the first dataset in the list (magnifier icon on the left), select the "Files" tab, and download the first HDF5 file.

For testing with h5grove, put VITE_H5GROVE_URL="https://bosquet.silx.org/h5grove" in your .env.local file, and add /h5grove?file=_Stru15-240228_016-04_1_1_data_000001.h5 to your localhost URL.

The UI lags heavily when slicing through the stack of very large images (100 x 4362 x 4148), especially with h5wasm. This also happens with h5grove, but since each slice is so slow to fetch, it lags less often. This is of course because the main thread gets blocked every time a slice gets rendered (due to domain computation, typed array conversion, etc.). There's nothing React's concurrent rendering can do about it. When the dimMapping state was debounced, the problem was mitigated, since the visualisation would re-render only once the user was likely to have finished interacting with the slider.

Obviously, with h5grove, the throttling is not the perfect solution either, since requests still get backed up and can slow down the response of the final slice. In the PR, you'll see that I've experimented with cancelling previous ongoing requests before setting dimMappingState. Unfortunately, this solves the problem only partially, since the only thing that gets cancelled is the data transfer; the server still handles the requests, reads the slices from the file, etc. That being said, since the UI lags quite a bit due to the size of the images, far fewer requests actually get sent than when slicing through a small dataset, so it's not that big of a slowdown in practice.


So this was a first attempt with React concurrent mode and some basic throttling. Not quite like the callback solution you had in mind, but I think it demonstrates issues that would likely also appear with such a solution. I'd be very grateful to be proven wrong, though, so if you manage to prototype something that behaves better than this attempt with large datasets, I'd love to see it!

Main positive takeaway for me so far is that concurrent mode is awesome and really improves the UX of the slicing, even with the current debounce. Not having the visualization disappear every time the slider moves would make a huge difference already.

bmaranville commented 4 months ago

Trying it out now... thanks for the detailed directions for getting files from the ESRF portal!

axelboc commented 2 months ago

Latest progress:

axelboc commented 1 month ago

I've merged #1656, but do report back when you get the time, @bmaranville:

axelboc commented 1 month ago

I've reverted #1656 in #1657, as I'd like to try one more approach before settling on viewerConfig.slicingTiming. Sorry for the back-and-forth and all the meanderings... I just want to get this right. :sweat_smile: