leeoniya / uPlot

šŸ“ˆ A small, fast chart for time series, lines, areas, ohlc & bars

Add api for adding data to a chart without doing a full copy #890

Closed (haakonsmith closed this 6 months ago)

haakonsmith commented 6 months ago

Hey, so I've been using uPlot to build a streaming graph, and it's been impressively fast and really awesome to use.

However, I've looked at your examples and all of them use setData, resetting the entire chart each frame. From looking at the code I realised that setData copies the entire data array. My use case had about 60 series on the chart, each sampled at roughly 15 Hz and displaying at most the last 20 seconds of data. That's 15 Hz Ɨ 20 s = 300 points per series, which with 60 series is 18,000 JavaScript numbers. At 64 bits (8 bytes) per number, that means I was trying to copy about 8 Ɨ 18,000 = 144,000 bytes ā‰ˆ 140 KB every time the chart updated (I think my math is correct šŸ˜…). This works well enough on fast computers, but testing on a computer with a RAM speed of 2660 MT/s resulted in the performance charts below in Chrome:
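(Restating that arithmetic as code, purely as a sanity check; nothing here beyond the numbers above:)

    // back-of-envelope check of the copy size per update
    const seriesCount = 60;
    const pointsPerSeries = 15 * 20;                   // 15 Hz * 20 s = 300
    const totalPoints = seriesCount * pointsPerSeries; // 18,000 numbers
    const bytesPerCopy = totalPoints * 8;              // 144,000 B, ~140 KB per update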

[screenshots: Chrome performance trace (trace_outline)]

In Task Manager it was showing hard faults:

[screenshot: Task Manager showing hard faults]

All of this leads me to believe that I was outrunning that computer's RAM speed. I fixed the problem by doing a number of things to prevent garbage being generated (gotta love JavaScript), but one of the biggest culprits seemed to be plot.setData.

I've read in various other issues that the reason you do a full copy in plot.setData is that you set some defaults and mutate the data array, and I've also read some of your reasoning for not adding an addData method that would avoid copying the whole array. I tried forking the project and adding one, but couldn't get it to work. So I went for a simple solution of just adding a flag to setData to avoid copying:

    function setData(_data, _resetScales, _doCopy = true) {
        // when _doCopy is false, adopt the caller's arrays by reference
        // instead of making a defensive copy
        data = _data == null
            ? []
            : _doCopy
                ? copy(_data, fastIsObj)
                : _data;
        // ...rest of setData unchanged

You can see this in a fork I made (sorry about the big git diff; I ran Prettier): https://github.com/haakonsmith/uPlot/tree/add-option-for-zero-copy-setdata

Anyway, all that is to say I think having an addData API seems like the best option, but if that doesn't work maybe we could add the doCopy flag and just put a bunch of warnings around it or something.
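For illustration, here is a minimal sketch of how the flag could be used for streaming, assuming plot is an existing uPlot instance and onSample is a hypothetical per-sample callback; the third argument exists only in the fork above, not in released uPlot:

    const WINDOW = 300; // 15 Hz * 20 s

    // data[0] holds x values; data[1..60] hold the series, preallocated once
    const data = [[], ...Array.from({ length: 60 }, () => [])];

    function onSample(t, values) {
        data[0].push(t);
        values.forEach((v, i) => data[i + 1].push(v));

        if (data[0].length > WINDOW)
            data.forEach(arr => arr.shift()); // drop the oldest point in place

        plot.setData(data, true, false); // third arg: skip the internal copy
    }

The shift() calls are O(window) per sample, but at 300 points per series that cost is negligible next to the copy being avoided.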

leeoniya commented 6 months ago

thanks for the details.

are you certain that your modifications helped?

i'm rather skeptical that copy() is the bottleneck here, since it is a shallow copy and is extremely cheap. it does not actually copy each of the 18k values, but rather creates new arrays/pointers that reuse the same memory block (since JS is copy-on-write).

i took a 20s profile of 6 series Ɨ 600 points updating at 60hz, and copy() was just 14ms of it. the data rate is pretty comparable (3,600 points at 60hz vs 18,000 points at 15hz).

did you try everything in https://github.com/leeoniya/uPlot#unclog-your-rendering-pipeline ?

[screenshot: profile showing copy() at 14ms]

haakonsmith commented 6 months ago

I'm pretty sure that changing the copy code did improve performance. Keep in mind it wasn't a CPU-bound task; what I think I was observing was a memory-speed-bound task. However, I'll try to make a reproducible example, test it on the computer that was experiencing the issue, and get back to you.

leeoniya commented 6 months ago

so, it looks like you're right and i'm wrong. the code below uses 13MB before the slices and 129MB after.

    <!doctype html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <title>1M</title>
      </head>
      <body>
        <script>
          setTimeout(() => {
            // fill a 1M-element array with random doubles (~8MB of data)
            let len = 1e6;
            let arr = Array(len);

            for (let i = 0; i < len; i++) {
              arr[i] = Math.random() * 1000;
            }

            let copies = [];

            // take 15 slices; if slice() shared the backing memory,
            // heap usage would stay flat, but it grows from ~13MB to ~129MB
            setTimeout(() => {
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());
              copies.push(arr.slice());

              console.log(copies.length);
            }, 1000);
          }, 1000);
        </script>
      </body>
    </html>
leeoniya commented 6 months ago

pushed a commit to stop copying data unconditionally, since the copy is only needed with ordinal x scales, and even then the data only needs to be shallow-sliced, not deep-copied.
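To illustrate the distinction (a generic sketch of the idea, not the actual commit; uPlot's internal helpers differ), with data in uPlot's [xs, ys1, ys2, ...] layout:

    // deep copy: clones the outer array AND every inner series array,
    // touching every stored value
    const deep = data.map(series => series.slice());

    // shallow slice: clones only the outer array; the inner series
    // arrays are shared by reference, so cost is independent of point count
    const shallow = data.slice();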

before: [profiler screenshot]

after: [profiler screenshot]

haakonsmith commented 6 months ago

Just tested this fix on my app, and it's given performance wins across the board when dealing with large data sets! Cheers mate!