Closed: alexeyraspopov closed this pull request 2 years ago.
Back when I was working on #547, I started thinking about an alternative solution for batching that would potentially eliminate the root cause of bugs in `renderQueue` and simplify the batching overall. In this PR I'm applying a solution I've used before for some SVG shenanigans. The way it is implemented, we could potentially use it for other tasks beyond canvas rendering if necessary.
One of the reasons to make this refactoring now is that `VisualizationLayer` is a class component, and switching to this new implementation will make it easier to convert the component to a function.

## Context
The context of this PR is `<VisualizationLayer />`'s approach to rendering data on canvas. This approach allows rendering a considerably larger number of datapoints, given that there is no need to add new DOM nodes. Even though the cost of rendering a single data point is smaller for canvas, we need to take into account the possible amount of data that needs to be processed. Canvas rendering itself is a synchronous process, which means we can easily get into a state where rendering all datapoints on canvas takes more time than a single frame. Worst case scenario, the webpage can freeze because the main thread works without yielding back to the browser engine.

This is where batching comes into play. In order to ensure that the page does not freeze during render, let's slice the dataset into chunks and render them one by one, making sure the main thread yields back to the browser in between those chunks. Eventually we'll get all the data rendered without affecting UX. The worst case scenario is that the rendering may take a couple of frames, which still most likely won't be "visible" to the user, or at least won't be a deal breaker. Either way, this is still cheaper than rendering the same amount of data in SVG.
## Problem
The current solution for batching in `VisualizationLayer` is implemented as the `renderQueue` class. The class is quite flexible and dictates a particular workflow, yet the component exercises just a single use case of it. The class is pretty straightforward when it comes to batching, but it has several significant flaws. One of them was already fixed in #547. Another one is the fact that the class slices the original dataset, allocating memory chunks for every single batch of work. Even if this doesn't allocate much memory overall, the inevitable garbage collection cycles may slow down the rendering process, introducing skipped frames and extending the time to complete. One more issue I found in the process is that `VisualizationLayer` does not attempt to cancel an existing `renderQueue` when unmounting. Even though this case is rare to hit, the fact that it is possible is something I'd like to fix.

## batchWork()
This PR introduces a new internal utility, `batchWork()`. The function receives a routine that needs to be batched, and it expects the routine to return a boolean value that defines whether batching needs to continue. `batchWork()` controls how often the work needs to be done. If the routine is fast, it can be invoked several times during a single frame; otherwise the utility uses `requestAnimationFrame()` to schedule the following batch of work. Here's the algorithm in pseudo-code:
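Roughly, in TypeScript (the exact signature and option names are assumed from the description, not copied from the source; the `setTimeout` fallback is only there so the sketch also runs outside a browser):

```typescript
// Sketch of the batchWork() utility described above; the real
// implementation in semiotic may differ in details.
const scheduleFrame: (callback: () => void) => void =
  typeof requestAnimationFrame === "function"
    ? (callback) => requestAnimationFrame(callback)
    : (callback) => setTimeout(callback, 16); // non-browser fallback

function batchWork(
  work: () => boolean,
  options: { timeFrameMs?: number; signal?: AbortSignal } = {}
): Promise<void> {
  const { timeFrameMs = 30, signal } = options;
  return new Promise<void>((resolve, reject) => {
    const step = () => {
      // Bail out if the caller aborted between batches.
      if (signal?.aborted) {
        reject(new Error("batchWork aborted"));
        return;
      }
      const deadline = Date.now() + timeFrameMs;
      try {
        // Invoke the routine repeatedly until it reports completion
        // or the time frame is exhausted.
        while (Date.now() < deadline) {
          if (!work()) {
            resolve();
            return;
          }
        }
      } catch (error) {
        reject(error);
        return;
      }
      // Yield back to the browser, continue on the next frame.
      scheduleFrame(step);
    };
    step();
  });
}
```

The caller keeps its own progress (an index, a cursor) in a closure and returns `false` once there is nothing left to do; aborting the signal makes the returned promise reject before the next batch runs.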
The utility does not hold any additional state, so the work routine must keep its own state in a closure. The utility returns a promise that resolves when the work is done and rejects when any call of the routine throws an exception.
The utility receives an additional option, `timeFrameMs`, that configures how often the work should yield. The default time frame is 30ms, which is around 2 frames in the best conditions. `VisualizationLayer` uses the default value, but we can tune it after additional testing.

Since there is just no way to predict how long the work can take, and something else can run in between the batches, we need to be ready to cancel work in progress in some particular cases (e.g. the data viz component being unmounted).
`batchWork()` receives an additional option, `signal` ([`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal)), that is used to verify that the work wasn't aborted before running the next batch of work. `AbortController` has decent browser support, even broader than `ResizeObserver`.
## VisualizationLayer changes

The implemented utility is now used in `VisualizationLayer` for rendering data points on canvas. There is one more trick required to make it work, though. By itself, `batchWork()` is not aware of the notion of datasets, queues, or anything related to the amount of work; it only exists to control time. The visualization component, knowing the amount of data that needs to be rendered, has to split the work into smaller chunks to ensure the batching helps. Otherwise we won't get any benefit from the batching utility if we do something like `allData.forEach(datum => render(datum))`. The way `renderQueue` does it, it simply cuts small slices from the dataset, 1k items each, and renders them separately. In `VisualizationLayer.tsx` I've implemented `batchCollectionWork()` that makes use of `batchWork()` to run the rendering function while iterating over the target dataset. The way it does it doesn't require allocating memory for slices; it simply moves a pointer.

https://github.com/nteract/semiotic/blob/f56dbde5382cfd7dd1c381d3b96dce5beb182a53/src/components/VisualizationLayer.tsx#L567-L579
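The pointer-moving idea can be sketched as follows. The `batchCollectionWork()` shape here is an assumption based on the description, and the compact `batchWork()` stub (re-sketched from the description in this PR) is included only so the snippet is self-contained:

```typescript
// Stand-in for the batchWork() utility this PR describes, compacted so the
// sketch below runs on its own; the real implementation lives in semiotic.
function batchWork(
  work: () => boolean,
  options: { timeFrameMs?: number; signal?: AbortSignal } = {}
): Promise<void> {
  const { timeFrameMs = 30, signal } = options;
  const schedule: (cb: () => void) => void =
    typeof requestAnimationFrame === "function"
      ? (cb) => requestAnimationFrame(cb)
      : (cb) => setTimeout(cb, 16); // non-browser fallback
  return new Promise<void>((resolve, reject) => {
    const step = () => {
      if (signal?.aborted) return reject(new Error("aborted"));
      const deadline = Date.now() + timeFrameMs;
      try {
        while (Date.now() < deadline) {
          if (!work()) return resolve();
        }
      } catch (error) {
        return reject(error);
      }
      schedule(step);
    };
    step();
  });
}

// Hypothetical shape of batchCollectionWork(): iterate the dataset by
// moving a pointer instead of allocating a slice for every batch.
function batchCollectionWork<T>(
  items: T[],
  renderItem: (item: T) => void,
  options?: { timeFrameMs?: number; signal?: AbortSignal }
): Promise<void> {
  let index = 0; // the pointer; no per-batch arrays are created
  return batchWork(() => {
    if (index < items.length) {
      renderItem(items[index]);
      index += 1;
    }
    return index < items.length;
  }, options);
}
```

Since the pointer lives in a closure, no intermediate arrays are created between batches, which avoids the garbage collection pressure that `renderQueue`'s slicing approach introduces.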
Besides switching from `renderQueue` to `batchWork`, I also fixed the use of `disableProgressiveRendering` (https://github.com/nteract/semiotic/commit/36e61f294bfc38b1ab8cb7788c9720ef93ccf724): previously, the sync rendering didn't really happen, as there was no call to actually render.

`VisualizationLayer` now also includes an `AbortController` that is used along with `batchWork()`. The use of it in a class component may seem awkward, but it will get better when `VisualizationLayer` is converted to a function component: