bmaranville opened this issue 8 months ago
Here is a demo with debouncing turned off:
https://github.com/silx-kit/h5web/assets/686570/4d791ea8-c83f-49c7-a1e5-a16caccbb209
Great call @bmaranville, I wholly agree.
The debounce indeed only makes sense when using providers that make requests for each slice such as H5Grove or HSDS. It only gets in the way of responsiveness when using the H5Wasm provider.
I agree that the debouncing is not ideal. Our main goal at the time was to avoid spamming h5grove with concurrent requests while keeping the slider responsive. I think this was before h5wasm came along.
If I'm not mistaken, the solution you suggest, while logical for h5wasm, has the major downside that it would slow down slicing with h5grove: when the slider moves from index 0 to 1 and then to index 2, the slice for index 1 would have to finish fetching before the slice for index 2 could start being fetched.
Instead, I think the debouncing needs to happen at the fetching level, within the providers themselves, perhaps with some automatic cancellation of stale requests as well. We ruled this out at the time because it seemed to require significant refactoring, including maybe replacing our current suspense-based fetching library, `react-suspense-fetch`.
I've been meaning to switch to a more modern suspense fetching library for a long time. This sort of problem is typically what those libraries are for, so it seems a shame to reinvent the wheel.
I would also like to take better advantage of React's concurrent features (`useTransition`, `useDeferredValue`, etc.), for instance to avoid showing the `Suspense` loading fallback while fetching new slices. After all, it is React's job to keep the UI responsive, so we shouldn't call `requestAnimationFrame` ourselves if we can avoid it.
Obviously, this refactoring of the providers and fetching library is not going to happen next week, so in the meantime, if you can think of another workaround that accelerates h5wasm without slowing down h5grove, please don't hesitate to share it!
The debouncing algorithm is designed only to prevent an event from being fired too frequently; there is no affordance in the algorithm for trying to make it as responsive as possible.
I would argue that even for h5grove, you would be better off not using debouncing. In the scenario proposed above, where a user has moved the slider from 0 to 1 to 2, there would be no benefit to the user from starting a fetch of slice 2 if the fetch/draw of slice 1 was not complete - particularly if they then move the slider to position 3 (or 15) within the same motion. It is not a good idea to stack slice fetches, even if there is some throttling done.
Unless you are going to do predictive fetching (which I am not recommending at all), I believe the way to keep the UI as lag-free as possible is to retrieve slices and plot them as soon as you can after the previous render is complete. If the plotting action in the client is taking a long time (which doesn't seem to be the case), one could make the case that fetching as soon as the previous fetch is complete makes more sense than waiting until (fetch and draw) is complete.
If there is a React animation system that exists for doing this kind of update optimization, I agree it would be easier to use that! But this type of animation does not fit well with a system in which the rendering (which in this case includes fetching or calling a slice function) is decoupled, as a reactive element, from the user input. You really need feedback on when the rendering is done to support a smooth update.
If you don't want to use requestAnimationFrame in a loop (which might not be a good fit for the architecture of h5web), what about this alternative?
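Roughly, in TypeScript (all names here are illustrative, not actual h5web APIs; `fetchAndRenderSlice` is a hypothetical stand-in for the whole fetch + plot operation):

```ts
// Hypothetical stand-in for the full (fetch + plot) of one slice.
declare function fetchAndRenderSlice(index: number): Promise<void>;

let busy = false;    // true while a (fetch + plot) is in flight
let latestIndex = 0; // most recent slider value

async function onSliderChange(index: number): Promise<void> {
  latestIndex = index;
  if (busy) {
    return; // a slice is already in flight; the loop below will catch up
  }

  busy = true;
  try {
    let shown: number | undefined;
    // Chain the operations: only start the next (fetch + plot) once the
    // previous one is done, always at the latest slider value.
    while (shown !== latestIndex) {
      shown = latestIndex;
      await fetchAndRenderSlice(shown);
    }
  } finally {
    busy = false;
  }
}
```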
This type of logic might be more compatible with a deeply nested reactive system, and doesn't require an animation loop.
> In the scenario proposed above, where a user has moved the slider from 0 to 1 to 2, there would be no benefit to the user from starting a fetch of slice 2 if the fetch/draw of slice 1 was not complete - particularly if they then move the slider to position 3 (or 15) within the same motion. It is not a good idea to stack slice fetches, even if there is some throttling done.
You seem to assume that, when a user goes from slice 0 to slice N, they want to see a video-like visualization of all slices between 0 and N. This is not necessarily the case.
At the ESRF, many beamlines generate HDF5 files containing stacks of tens or even hundreds of 16M-pixel images. One use case we often encounter is being able to quickly check one in every 10/100/... images to find where interesting data starts to appear. Given the size of the data, and that maybe it is being accessed remotely through a slow connection, it would not be practical to fetch 100 slices (and even worse, wait for them all to finish fetching) before the last one could be visualized... not to mention that it would be a huge waste of network and server resources.
Even with h5wasm, this scenario would be painful UX-wise, as the visualization would lag behind the slider due to the sheer size of the data and the computations required (domain computation, typed array conversions, etc.). Even worse: given that we currently don't have any mechanism to free up cached resources, the computer's memory would quickly clog up and the browser would quickly start to struggle (assuming a future WORKERFS-based implementation of the h5wasm provider that doesn't require the entire file to be loaded into memory).
That being said, I hear you: there is a need for a video-like behaviour for reasonably-sized datasets. Debouncing the slicing state is a solution that does not account for this use case and is more focused on keeping the viewer reliable regardless of dataset size and provider performance.
At some point, we actually considered adding a way for users to ask explicitly for an entire dataset to be fetched in order to get instantaneous slicing: the idea was to then disable debouncing on the slider as soon as all the slices are available in memory. Perhaps this is the way to go? We'd just have to think of a good UI.
The advantage of this solution is that it becomes the user's responsibility to decide whether the dataset can be fetched in a reasonable amount of time for them and held in memory on their machine — and if they change their mind, they can always cancel the request.
I agree with you! Sorry if that isn't clear from my earlier comments. I don't think you should fetch every slice as you move the slider. The algorithm I'm suggesting does not do that.
Here is an example: the user has some data with 100 big images, and let's say it takes 200 ms to get a slice from the backend they are using. The user moves the slider smoothly from slice 0 to slice 100 in 2 seconds (50 ticks per second, so 20 ms per tick). With this algorithm, each new fetch starts as soon as the previous (fetch + draw) completes, at whatever index the slider has reached by then; at 200 ms per fetch, roughly every tenth slice gets shown during the motion.
If the backend is slower than that, it will fetch fewer frames during the motion; if it is faster, it will fetch more. Never does it try to fetch more frames than the one it is about to show. At the end of the motion it always fetches and shows the frame the user stopped at. The largest possible delay to the user seeing the frame they stopped on is the time it takes to fetch and render the one frame preceding the end of the motion.
Right, I get you now, but there's still something I don't quite understand. Let's imagine we're dealing with much slower load times, say 10s per slice. You're on slice 0 and it finished fetching/rendering.
Now you click and hold the slider thumb, move it to slice 2 (in a negligible amount of time) and hold it there. As per your example, this triggers a fetch for slice 1, waits 10 s for slice 1 to finish loading, renders slice 1, then looks at the slider value, triggers a fetch for slice 2, waits for slice 2 to finish loading, and finally renders slice 2. Result: slice 2 appears after 20 s, when it could have appeared after 10 s + debounce time with the current implementation.
It's the "At the end of the motion it always fetches and shows the frame the user stopped at." that I don't understand. If you disallow concurrent fetches and don't debounce the interaction, how can you guarantee that the slice the user eventually stops on gets fetched as quickly as possible?
Right, it's what you say in your last sentence: "The largest possible delay to the user seeing the frame they stopped on is the time it takes to fetch and render the one frame preceding the end of the motion."
Unfortunately, to me, this is not an acceptable compromise when dealing with very large images and network-based providers.
It doesn't mean that we can't find a middleground somewhere, though.
Ok, how about one more addition to the algorithm:
The stop event (`onAfterChange`?) on the slider cancels any previous (fetch + display) and starts a new one at the stop value. This would eliminate the extra frame lag on the client.
It does add complexity in that the providers' slice functions would have to be modified so they can be aborted (unless that is already done).
You could also just start a new fetch on the stop event, and ignore the result of the previous fetch instead of canceling it, but this would mean up to 2 fetches could be active on the network at the same time, if that is a resource concern.
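For the cancellation part, here's a minimal sketch with `AbortController`, assuming the slice request goes through `fetch()` (the actual providers may use a different HTTP client, so this is just to show the shape of it):

```ts
// Assumption: slices are fetched over HTTP with fetch(); h5web's real
// providers may differ. Aborting cancels the data transfer client-side.
let controller: AbortController | undefined;

async function fetchFinalSlice(url: string): Promise<ArrayBuffer> {
  controller?.abort(); // cancel any previous in-flight slice request
  controller = new AbortController();

  const response = await fetch(url, { signal: controller.signal });
  if (!response.ok) {
    throw new Error(`Slice request failed: ${response.status}`);
  }
  return response.arrayBuffer();
}
```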
Yeah, only downside is, you have to actually release the mouse pointer for `onAfterChange` to be triggered. To be honest, I think we need a combination of all the above:

1. throttle the slider's change events during the motion, rather than debounce them;
2. fetch the slice at the current slider position as soon as the previous fetch completes;
3. fetch the final slice as soon as `onAfterChange` is called.

For 2 and 3, maybe cancel any other ongoing fetch, though I'm not sure it would make a difference and we may actually prefer to see the result of this ongoing fetch until the final fetch completes.
Here's what I've been experimenting with in a PR:

- I've replaced the debounce on the `dimMapping` state with a throttle.
- I use `useDeferredValue` on the `dimMapping` state to opt into React's concurrent rendering. This means that the suspense fallback (`ValueLoader`) no longer appears while slicing. Instead, I use the new `isStale` boolean to add "(loading)" to the title of the line and heatmap plots to indicate when they are waiting to be updated.

Try, for instance, the `/entry_0000/3_time_average/results/intensity_error` dataset in the `water_224.h5` demo file with the Line visualization, with h5grove or h5wasm.
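For reference, the `useDeferredValue` pattern above looks roughly like this (a simplified sketch; the component and prop names are placeholders, not h5web's actual code):

```tsx
import { useDeferredValue } from 'react';

// Placeholder for a plot component that suspends while fetching its slice.
declare function Plot(props: { dimMapping: number[]; title: string }): JSX.Element;

function SlicedVis(props: { dimMapping: number[] }) {
  // While the new slice suspends, React keeps showing the plot rendered
  // with the previous (deferred) value instead of the Suspense fallback.
  const deferredDimMapping = useDeferredValue(props.dimMapping);
  const isStale = deferredDimMapping !== props.dimMapping;

  return (
    <Plot
      dimMapping={deferredDimMapping}
      title={isStale ? 'intensity_error (loading)' : 'intensity_error'}
    />
  );
}
```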
Very satisfactory results with both providers. Slicing is very smooth; the visualization updates very quickly throughout the slicing motion. You can barely tell that the change events are throttled, yet the throttling does limit the number of requests sent to h5grove. Requests do get backed up from time to time with h5grove, but given the small size of each slice and the responsiveness of the server, they do not stay backed up for long; regardless, the more slices are cached, the smoother the slicing becomes.
You can find an open dataset for testing with h5wasm on the ESRF Data Portal. Log in as anonymous, expand the first dataset in the list (magnifier icon on the left), select the "Files" tab, and download the first HDF5 file.
For testing with h5grove, put `VITE_H5GROVE_URL="https://bosquet.silx.org/h5grove"` in your `.env.local` file, and add `/h5grove?file=_Stru15-240228_016-04_1_1_data_000001.h5` to your localhost URL.
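Concretely, that means something like this (the port below is just Vite's default; yours may differ):

```
# .env.local
VITE_H5GROVE_URL="https://bosquet.silx.org/h5grove"

# then open:
# http://localhost:5173/h5grove?file=_Stru15-240228_016-04_1_1_data_000001.h5
```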
The UI lags heavily when slicing through the stack of very large images (100 x 4362 x 4148), especially with h5wasm. This also happens with h5grove, but since each slice is so slow to fetch, it lags less often. This is of course because the main thread gets blocked every time a slice gets rendered (due to domain computation, typed array conversion, etc.). There's nothing React's concurrent rendering can do about it. When the `dimMapping` state was debounced, the problem was mitigated, since the visualisation would re-render only once the user was likely to have finished interacting with the slider.
Obviously, with h5grove, the throttling is not the perfect solution either, since requests still get backed up and can slow down the response of the final slice. In the PR, you'll see that I've experimented with cancelling previous ongoing requests before setting `dimMappingState`. Unfortunately, this solves the problem only partially, since the only thing that gets cancelled is the data transfer; the server still handles the requests, reads the slices from the file, etc. That being said, since the UI lags quite a bit due to the size of the images, far fewer requests get sent than when slicing through a small dataset, so it's not actually that big of a slowdown.
So this was a first attempt with React concurrent mode and some basic throttling. Not quite like the callback solution you had in mind, but I think it demonstrates issues that would likely also appear with such a solution. I'd be very grateful to be proven wrong, though, so if you manage to prototype something that behaves better than this attempt with large datasets, I'd love to see it!
The main positive takeaway for me so far is that concurrent mode is awesome and really improves the UX of the slicing, even with the current debounce. Not having the visualization disappear every time the slider moves would make a huge difference already.
Trying it out now... thanks for the detailed directions for getting files from the ESRF portal!
Latest progress:

- I've tried using `useDeferredValue` (i.e. concurrent mode) to not show the suspense loading animation when slicing, but couldn't come up with a satisfactory loading UI to replace it. I've tried adding a progress bar in various places, but it always felt too disconnected from the visualization itself. The suspense loader really gives me the reassurance that the visualization I'm seeing corresponds to the dimension mapper state. Back when we started developing H5Web, we had an animated veil on top of the visualization while loading; this would probably work better than a progress bar, but it would not feel as smooth, which is the point of this issue. All in all, this makes me think that concurrent mode is not part of the solution, and that it's actually more of a prefetching problem. EDIT 1: I'll also add that not showing the suspense fallback means users can't cancel long fetches; we would have to find an alternative UI for that as well. EDIT 2: I did find a loading UI I liked better than a progress bar: a background pattern on the whole vis area, behind the canvas (only obscured by the plot/heatmap itself since the canvas is transparent); with a subtle fade in/out animation, it looks quite nice. The problem of cancelling/retrying remains, though.
- As for implementing prefetching with `react-suspense-fetch`, I personally think it could work nicely.

I've merged #1656, but do report back when you get the time, @bmaranville: is being able to tweak the `slicingTiming` value enough to get the behaviour you want?

I've reverted #1656 in #1657, as I'd like to try one more approach before settling on `viewerConfig.slicingTiming`. Sorry for the back-and-forth and all the meanderings... I just want to get this right. :sweat_smile:
Is your feature request related to a problem?
When using the slicing slider (in `DimensionMapper`) for heatmaps, the plot does not update smoothly as the slider is moved, even with the h5wasm provider (which should allow very fast updates). The debounce that is applied prevents these updates.
Requested solution or feature
Instead of debouncing the slider signal, I think a different approach is merited here and would provide huge gains in responsiveness: start a (fetch + plot) for the slice at the current slider value as soon as the slider moves; while that operation is in flight, ignore further change events (only remembering the latest value); when it completes, if the slider value has changed in the meantime, immediately start a new (fetch + plot) at the new value.
The advantage of this system over debounce is that it updates as fast as the (data slice + plot) operation can be carried out, rather than arbitrarily delaying execution of the slice by the debounce timers. There is natural rate-limiting of slice requests to e.g. H5Grove because no more than one request will be issued at a time. Providers with very fast slice turnarounds (e.g. h5wasm) will be able to update nearly instantly (smooth visual scrolling through datasets!)
Alternatives you've considered
Additional context
I have implemented this type of buffering on many occasions for smooth updates of visualizations from user inputs. I'm happy to help with this.
I couldn't find a nice `requestAnimationFrame` equivalent to the `react-hookz/web` debounce tools being used, so it might have to be coded by hand rather than using a library.
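If it does come to hand-coding it, a minimal sketch might look like this (TypeScript; untested, and `rafThrottle` is just an illustrative name, not a library function):

```ts
// Invoke `callback` at most once per animation frame, with the latest
// arguments received during that frame.
function rafThrottle<T extends unknown[]>(
  callback: (...args: T) => void,
): (...args: T) => void {
  let frameId: number | null = null;
  let latestArgs: T;

  return (...args: T) => {
    latestArgs = args;
    if (frameId === null) {
      frameId = requestAnimationFrame(() => {
        frameId = null;
        callback(...latestArgs);
      });
    }
  };
}

// Usage sketch: wrap the slider's change handler.
const onSliderChange = rafThrottle((index: number) => {
  console.debug('slider moved to', index); // update dimMapping state here
});
```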