swimlane / ngx-charts

:bar_chart: Declarative Charting Framework for Angular
https://swimlane.github.io/ngx-charts/
MIT License
4.3k stars 1.15k forks source link

Path simplification to improve performance for lots of data #471

Open dchacke opened 7 years ago

dchacke commented 7 years ago

I'm submitting a ... (check one with "x")

[ ] bug report => search github for a similar issue or PR before submitting
[x] feature request
[ ] support request => Please do not submit support request here

I'd like to have the option to simplify lines. (This could be extended to all shapes.)

Reproduction of the problem As data is appended to a chart, the path attribute of particular DOM elements gets longer and longer. This results in heavy load on the browser which slows down the page more and more over time.

What is the motivation / use case for changing the behavior? Path simplification is efficient and there is virtually no reduction of fidelity, i.e. lines can be drawn with much fewer data points, but look just the same. It would improve performance of this library as the amount of data points to be drawn increases.

I am currently on:

There are some libraries that solve path simplification given an array of coordinates:

marjan-georgiev commented 7 years ago

That's an interesting idea.

Just to note, the rendering glitch that can be seen on the punkr bellow is not a performance one. We use a dash-array animation to animate drawing of the line, but that line is very long so the dash-array spacing simply needs to be increased. https://plnkr.co/edit/hTPaexbMyYbX8xJC38Ja?p=preview

My idea to handle this was to measure the length of the line before the animation starts and set the dash-array offset accordingly.

Interestingly enough, I think this idea of shortening the line will take care of that issue.

dchacke commented 7 years ago

I was not aware of the issue you mentioned.

To be clear, this feature should be optional. One may not want to simplify a line that has few data points, or needs to be high fidelity, or both.

The epsilon passed in to reduce the points should also be adjustable (just like on http://mourner.github.io/simplify-js/).

The one thing preventing me from using your library at the moment is really only performance. Everything else looks great. My use case is having 10 or more charts open per page, each of them appending data once a second. That really pushes your browser to its limits.

If you add path simplification and couple this with support for toggling off all animations, you are really ahead of the competition. I am not aware of another charting library that can do all of these things, and I'm afraid I have searched desperately.

dchacke commented 7 years ago

There's also this: https://github.com/tomgp/simplify-line/

But it's old.

marjan-georgiev commented 7 years ago

Could you please give me more details on which charts you use, what your data looks like (how many series, and how many data entries per series)? A plunkr with the setup you have would be of huge value.

We just released 6.0.0 which has a big performance improvement for line and area charts. I'd like to find out what is causing the issues in your case.

dchacke commented 7 years ago

Here's an example of a line chart: https://plnkr.co/edit/OtUGfNA4CB4TuQRcYPDg?p=preview

These are just two straight lines with 3600 data points each. It may be Plunkr, but my browser is becoming very slow under the load. (I don't blame it, it's a lot of DOM elements). Hovering over the chart doesn't really work well anymore, and initial chart load time has gone up, too.

If you were to try adding new data to these lines every second (my use case in real life), I'm afraid it would get even more janky.

Two series with 3600 data points each is still a somewhat benign use case. I could see myself having up to 5 series with twice as many data points per chart. Oh, and ten of these charts on one page that I need to be able to drag around smoothly as they're drawing :)

The point is, with path simplification, the number of coordinates drawn is significantly less than those passed in. In most cases, the data can keep growing, while the number of coordinates drawn stays more or less the same!

In this particular Plunkr, path simplification would probably reduce each line to just two data points - one in the lower left, one in the upper right. The result might look something like this: https://plnkr.co/edit/OQRT8uWUG0jrsrQ5Ur3d?p=preview

This is an extreme example for the purposes of illustration - the epsilon could be adjusted also, giving the user power as to how many points should be drawn for both lines. In addition, if the user were to resize the chart's container, making the chart bigger, it would also show more data points than for a smaller chart.

More complex shapes obviously couldn't be reduced as much, but I hope this gets the point across. There wouldn't really be a limit anymore to how many series and how many data points you can draw.

dchacke commented 7 years ago

I made some edits to my last post, so don't trust your email on this.

marjan-georgiev commented 7 years ago

That's helpful, thanks!

It is not due to DOM elements. The number of DOM elements created depends on the number of series, not data points.

But the number of data points affects several parts of the chart:

  1. Initial data preparation and scale creation. This has a complexity of O(n). The chart needs to calculate the domains of both x and y scales, and it needs to iterate all data points for that, at least once. This increases linearly with the number of data points.
  2. Creation of the paths. D3 generates the paths for the lines, and areas by iterating through all data points. This also has a complexity of O(n).
  3. Displaying tooltip on mouse over. This is done by finding the nearest data point in the dataset. I am using binary search to do this, so the complexity is O(log n). This is not that expensive, but the algorithm is run on every mouse move event, which can be triggered a lot. We can try throttling it.

Path simplification will probably help with #1 and #2, but we'll still need to have all points for #3, so they are displayed in the tooltip.

marjan-georgiev commented 7 years ago

Also, have you tried simplifying the data prior to passing it to the chart? I think that will generate the best results.

dchacke commented 7 years ago

Yes, you are correct, I misspoke - the path attributes on the lines end up getting huge. I have verified that using path simplification from simplify-js they stay pretty small.

I could sample the data perhaps, and it's something I might do for now. Not to stress the point too much, but it might decrease the fidelity more so than path simplification would. I would also lose the ability to increase fidelity as the chart grows in size.

Regarding point 3, you could keep all points somewhere, but you might end up displaying tooltips at points where there is no line. I would personally prefer hovering only where points are drawn (ie results from path simplification), but I suppose this could also be optional.