leeoniya / uPlot

📈 A small, fast chart for time series, lines, areas, ohlc & bars
MIT License

Financial time series axis #51

Closed benmccann closed 4 years ago

benmccann commented 4 years ago

Adding an issue to track based off the discussion at https://github.com/chartjs/Chart.js/pull/6695#discussion_r343698429

Most financial applications such as Yahoo Finance, Trading View, etc. have each data point equally spaced apart. I originally didn't notice this wasn't done on the gold high/low example because there were lots of points. But it's more noticeable when there are fewer points or the points are drawn as candlesticks, ohlc, or bars because you end up with gaps for weekends.

The actual logic is really simple. Each point is just placed on the x-axis at index / numIndices. (The Chart.js code for this is stupidly overcomplicated and a terrible example to follow, so I wouldn't even look at it as it would only confuse things.)
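A minimal sketch of that placement rule, for illustration only. The helper name `evenX` and the pixel width are hypothetical, and the sketch divides by `numPoints - 1` rather than `numIndices` so that the last point lands on the right edge, which is an assumption about the intended layout.

```typescript
// Hypothetical helper: x position (in pixels) of the point at `index`
// when `numPoints` points are spread evenly across `plotWidth` pixels.
// Timestamps are ignored entirely, so weekends take up no space.
function evenX(index: number, numPoints: number, plotWidth: number): number {
  if (numPoints <= 1) return 0; // a single point sits at the left edge
  return (index / (numPoints - 1)) * plotWidth;
}

// Five trading days (e.g. Thu, Fri, Mon, Tue, Wed) on a 400px-wide plot.
const positions = [0, 1, 2, 3, 4].map(i => evenX(i, 5, 400));
// positions is [0, 100, 200, 300, 400]: Fri and Mon are as close as Mon and Tue
```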

leeoniya commented 4 years ago

i was hoping to squeak by with something like http://dygraphs.com/gallery/#g/highlighted-weekends (https://github.com/leeoniya/uPlot/issues/27). but if it's as easy as it seems, maybe it can be baked in.

the main issue with the cheap solution is that when you have actual missing data (gaps), the chart makes it look misleadingly continuous. i'm not generally a huge fan of adding features which are easily abused to make easily-misinterpreted charts, which is why i'm staying away from stacked series [1].

EDIT: i guess if you had real missing data you'd still have a gap datapoint, so it wouldn't get squashed...

i'll think about it, though.

[1] https://github.com/leeoniya/uPlot#non-features

benmccann commented 4 years ago

In the finance case, I'm not sure if I've ever run into a case with missing data (outside of weekends and holidays, which I explicitly wanted to skip). I'm not exactly sure what that would mean. (I guess that nasdaq's whole computer system crashed?)

leeoniya commented 4 years ago

https://www.theverge.com/2017/7/3/15917950/nasdaq-nyse-stock-market-data-error

🤣

benmccann commented 4 years ago

Wrong prices I have seen many, many times! It's surprisingly way more common than you would ever imagine!

leeoniya commented 4 years ago

so this may turn out to be somewhat involved and perf-heavy.

the ticks are currently determined by the data window's range (max - min). if i have a range of 7 days, i know how many secs/pixel i can draw, this is used in combination with space to determine tick granularity.

if i'm skipping/not rendering some of that range, then i'm getting more pixels back by having less data. now i have to scan the data to know how much to deduct from the full range to get the correct tick granularity.
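The concern above can be sketched roughly as follows. This is not uPlot's actual tick code: the increment list, the function name `pickIncr`, and the pixel numbers are all assumptions chosen for illustration. It shows how deducting a skipped weekend from the range changes the chosen tick granularity.

```typescript
// Illustrative candidate tick increments, in seconds (1m, 5m, 15m, 1h, 6h, 12h, 1d, 7d).
const INCRS: number[] = [60, 300, 900, 3600, 21600, 43200, 86400, 604800];

// Hypothetical helper: pick the smallest increment whose ticks would be
// at least `minSpacePx` pixels apart for the given range and plot width.
function pickIncr(rangeSecs: number, widthPx: number, minSpacePx: number): number {
  const secsPerPx = rangeSecs / widthPx;
  const minSecsBetweenTicks = secsPerPx * minSpacePx;
  for (const incr of INCRS) {
    if (incr >= minSecsBetweenTicks) return incr;
  }
  return INCRS[INCRS.length - 1];
}

const weekSecs = 7 * 86400;
const weekendSecs = 2 * 86400;

// Using the full window range (max - min):
const incrFullRange = pickIncr(weekSecs, 1000, 40); // 43200 (12h ticks)

// Using the range that is actually rendered once the weekend is skipped.
// Knowing `weekendSecs` is the part that requires scanning the data.
const incrEffective = pickIncr(weekSecs - weekendSecs, 1000, 40); // 21600 (6h ticks)
```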

just thinking out loud, maybe i'm wrong.

benmccann commented 4 years ago

If I understand, the min / max are specified as timestamps, so you have to scan the timestamps to find the corresponding indexes and that's what's expensive? Is the min / max just used for zooming?

leeoniya commented 4 years ago

let's say the x scale has a min,max value display range of 0,10.

for a linear scale i know that the position of value 5 in the data along the scale (and canvas x) is at 50% ((5-0)/(10-0)).

now if i'm skipping 3-4 out of this range (which can only be determined by scanning the data), i have to account for this in the positional math of where 5 is now. it won't be at 50% but at 50% - accumulated skipped values prior to 5. right? you effectively go from absolutely positioned values to incrementally positioned values.
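A rough sketch of the two kinds of positional math being contrasted here. The names `linearPct`, `squashedPct`, and `Skip` are hypothetical, and the renormalization by the reduced range is an assumption about how the squashed scale would fill the full width; it is not taken from uPlot.

```typescript
// A skipped (not rendered) interval of the scale, e.g. a weekend.
type Skip = { from: number; to: number };

// Absolute positioning: depends only on the value and the scale's min/max.
function linearPct(val: number, min: number, max: number): number {
  return (val - min) / (max - min);
}

// Incremental positioning: every skip that lies before `val` shifts it left,
// and the total skipped amount shrinks the effective range.
function squashedPct(val: number, min: number, max: number, skips: Skip[]): number {
  let skippedBefore = 0;
  let skippedTotal = 0;
  for (const s of skips) {
    const from = Math.max(s.from, min);
    const to = Math.min(s.to, max);
    if (to <= from) continue; // skip lies outside the visible range
    skippedTotal += to - from;
    skippedBefore += Math.max(0, Math.min(val, to) - from);
  }
  return (val - min - skippedBefore) / (max - min - skippedTotal);
}

const absolute = linearPct(5, 0, 10);                            // 0.5
const squashed = squashedPct(5, 0, 10, [{ from: 3, to: 4 }]);    // (5 - 1) / (10 - 1), about 0.444
```

The loop over `skips` is the extra book-keeping the comment refers to: the position of a value can no longer be computed from min and max alone.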

this book-keeping has to be done in a bunch of places, like scale snapping, etc.

EDIT: sorry, this doesn't really answer your question.

if i have a view range of 0-10, i cannot use that full range for knowing how dense my ticks can be, because for all i know 90% of it can be skipped/squashed. maybe it's not an issue, but it will prevent higher granularity ticks when zoomed in, since it will still think the range is larger than it actually ends up being.

leeoniya commented 4 years ago

i made a separate branch for experimentation.

https://github.com/leeoniya/uPlot/commit/d3b39854760fa283460fe466a112df517fa3e28d is a userland impl of an even distribution. it basically switches the x-scale to numeric and swaps the timestamps for evenly-spaced ints. it then re-implements a basic variant of the date formatters.

you do lose some date-related affordances using this [somewhat elegant] hack, but it mostly does the job.
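The general shape of such a userland approach might look like the sketch below. It does not reproduce the linked commit and calls no uPlot API; `timestamps`, `xs`, and `formatTick` are hypothetical names, and the sample dates are made up for illustration.

```typescript
// Sample daily timestamps in seconds (UTC): Fri 2019-11-08, Mon 2019-11-11, Tue 2019-11-12.
const timestamps: number[] = [1573171200, 1573430400, 1573516800];

// Swap the timestamps for evenly spaced integers; these become the x values
// handed to a plain numeric scale.
const xs: number[] = timestamps.map((_, i) => i);

// Minimal stand-in for a date formatter: map a numeric tick back to the
// original timestamp and render it as YYYY-MM-DD. Ticks that do not land on
// an existing index get an empty label.
function formatTick(tick: number): string {
  const ts = timestamps[tick];
  if (ts === undefined) return "";
  return new Date(ts * 1000).toISOString().slice(0, 10);
}

const labels = xs.map(formatTick); // ["2019-11-08", "2019-11-11", "2019-11-12"]
```

Because the scale only sees 0, 1, 2, the weekend between the first two points disappears, at the cost of re-implementing tick labelling by hand.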

benmccann commented 4 years ago

What's scale snapping?

Can you get the min / max as indices instead of timestamps? I'm wondering if it's possible to do lastIndex - firstIndex to get the number of indices in the range instead of scanning to count them
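One way to get from timestamp bounds to indices without a linear scan is a binary search over the sorted timestamps, sketched below. The function names are hypothetical and this is not taken from uPlot's source; it assumes the timestamps are sorted ascending.

```typescript
// Binary search: first index at which `pred` becomes true.
// Assumes `pred` is false for a prefix of `sorted` and true for the rest.
function firstIndexWhere(sorted: number[], pred: (t: number) => boolean): number {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (pred(sorted[mid])) hi = mid;
    else lo = mid + 1;
  }
  return lo;
}

// Number of points whose timestamp lies within [min, max], found in
// O(log n) via two searches and an index subtraction.
function countInRange(sortedTs: number[], min: number, max: number): number {
  const first = firstIndexWhere(sortedTs, t => t >= min);
  const end = firstIndexWhere(sortedTs, t => t > max); // one past the last match
  return end - first;
}

const visible = countInRange([10, 20, 30, 40, 50], 15, 40); // 3 (the points 20, 30, 40)
```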

leeoniya commented 4 years ago

it's essentially a padding function. by default it applies only to the y-axes.

if the min,max of all your series is determined to be 7.65456,103.222, then the scale & viewport will be snapped/rounded to some close increment like 7.5 and 105. the x-axis has no padding so that it can zoom to the precise data range, and it would be odd to snap/pad out dates before or after you actually have available data or outside of what you explicitly selected to zoom.
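A minimal sketch of that kind of snapping, using the numbers from the comment. The function name `snapRange` and the increment of 2.5 are assumptions picked so the result matches the example; how uPlot actually chooses its increment is not shown here.

```typescript
// Hypothetical helper: round min down and max up to the nearest multiple of `incr`.
function snapRange(min: number, max: number, incr: number): [number, number] {
  return [Math.floor(min / incr) * incr, Math.ceil(max / incr) * incr];
}

const snapped = snapRange(7.65456, 103.222, 2.5); // [7.5, 105]
```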

before i spew more crap from my head, lemme try to get it working based on index positions and see what falls out.

leeoniya commented 4 years ago

ok, so i got this mostly working. the code changes were pretty minimal.

https://github.com/leeoniya/uPlot/tree/even-x-distr-2

there's one not-so-great scale <-> data interdependency that had to be added here:

https://github.com/leeoniya/uPlot/commit/46d64d919d0252bd47ed7c009080d254a2ee61e3#diff-fbc43a77b4e8fe64398aa79145272a6bR656

this adds an implicit 1:1 associative assumption in case of scale.skip == true that did not exist before between scales and data. it's a code smell, but i can probably live with it.

there's still some resulting wonkiness around grid ticks. at some high zoom levels it's not granular enough, while it's more granular at lower zoom levels 😕

it also feels pretty odd to have an x-grid that looks evenly spaced, but dates are not. i guess this is expected though.

leeoniya commented 4 years ago

https://github.com/leeoniya/uPlot/commit/b14e1530648f56bf6c78a068ece148596ffa28a4 fixes the wonkiness but gives up logical temporal partitioning. i'm not sure there's a cheap way to make them non-mutually-exclusive.

leeoniya commented 4 years ago

ok, this is now merged into master as scale.distr = 2 (even).

the default is 1 (linear), and perhaps i'll add 3 (log) in the future as part of #29.
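As a config fragment, enabling this might look roughly like the following. Only the `distr` values 1 and 2 come from the comment above; the surrounding option shape (`scales.x`) and the local `ScaleOpts` type are assumptions made to keep the sketch self-contained, and may not match every uPlot version.

```typescript
// Local stand-in type; not uPlot's own typings.
type ScaleOpts = { distr?: 1 | 2 };

// Assumed option shape: per-scale settings keyed by scale name.
const opts: { scales: { x: ScaleOpts } } = {
  scales: {
    x: {
      distr: 2, // 2 = even spacing by index; the default is 1 (linear)
    },
  },
};
```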

benmccann commented 4 years ago

Wow. Amazing! You're going to get me to switch from Chart.js soon enough :-)

leeoniya commented 4 years ago

i could always use a second capable set of eyes, brain and hands!