leeoniya / uPlot

📈 A small, fast chart for time series, lines, areas, ohlc & bars
MIT License

nano second plots #111

Closed ryantxu closed 4 years ago

ryantxu commented 4 years ago

One blocker with most JavaScript plotting libraries is that when time series are forced to use Date(), everything gets weird at resolutions finer than 1ms.

Influxdata supports nanoseconds by modifying dygraphs to use nano-date when required.

Do you see any issues trying to use nano-date for the x axis? Any pointers for trying a proof-of-concept?

leeoniya commented 4 years ago

as long as nano-date is compatible with the native Date (including Intl's {timezone} constructor option, which uPlot uses for DST and timezone handling), then it should be easy to make a few tweaks so an alternate date constructor can be set via opts.

in reality though, i don't see a point in using Date for that kind of precision (or even for periods < 1s) - i mention this in the docs [1]. you can switch to treating x as numbers via scales: {x: {time: false}}, though not on the fly.
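For readers following along, here is a minimal sketch of what a numeric x scale config might look like. Only the `time: false` flag on the x scale comes from the comment above; the sizes, labels, and sample data are placeholders made up for illustration.

```javascript
// Sketch: options for a chart whose x values are plain numbers, not timestamps.
// Sizes, labels and data are illustrative placeholders.
const opts = {
  width: 800,
  height: 300,
  scales: {
    x: { time: false }, // treat x as numbers; no Date-based ticks or formatting
  },
  series: [
    { label: "offset (s)" },           // x series
    { label: "value", stroke: "red" }, // y series
  ],
};

// Columnar data: [xValues, yValues]; x holds sub-second offsets as numbers.
const data = [
  [0.0001, 0.0002, 0.0003],
  [10, 12, 11],
];

// In a browser with uPlot loaded, the chart would be created with:
// const plot = new uPlot(opts, data, document.body);
```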

it might be interesting to auto-switch internally to treating x as numbers (besides the first tick) if the zoomed range of the graph is < 1s.

[1] https://github.com/leeoniya/uPlot#data-format

leeoniya commented 4 years ago

nano-date appears to require strings passed to its constructors, i'm not sure how uPlot can handle this, since it cannot ingest Date objects, and the internals operate directly on the numbers, only creating Date objects as necessary for formatting or finding temporal/calendar breakpoints. for nano-date to be usable, uPlot would have to accept Date objects and would be unable to perform math on them, which it does extensively.

there might be some path to handling calendar-honoring nanosecond timestamps if they were provided as native BigInts [1], then i can split the calendar-relevant portion out, do the math on that, and do something else with the rest for display.
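A rough sketch of that split, assuming the timestamps arrive as BigInt nanoseconds since the Unix epoch. `splitNs` is a hypothetical helper for illustration, not part of uPlot.

```javascript
// Sketch: split a nanosecond epoch timestamp (BigInt) into a millisecond part
// that Date can represent and a sub-millisecond remainder kept as a small Number.
// splitNs is a hypothetical helper, not a uPlot function.
function splitNs(ns) {
  const NS_PER_MS = 1000000n;
  const ms = ns / NS_PER_MS;    // BigInt division truncates toward zero
  const remNs = ns % NS_PER_MS; // 0..999999 for non-negative input
  return { ms: Number(ms), remNs: Number(remNs) };
}

const { ms, remNs } = splitNs(1580451322560123456n);
const date = new Date(ms); // calendar math happens on the ms part only
```

Here `ms` is small enough to be an exact Number, so Date handles the calendar portion, while `remNs` carries the remaining nanoseconds for display.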

it's one of those things that is both technically challenging, and not entirely necessary. no one cares about the date on the nanosecond scale. maybe GPS satellites that need to sync with cesium clocks. every other use case just needs to show some base reference for where the chart begins (e.g. 2020-01-31 06:15:22.560) and then the decimal tick offsets from there. any kind of date formatting is pointless at this resolution.

it's possible to customize the displayed tick labels so that the first one shows the starting point and the rest are just numeric offsets from that base. i will work on adding an option to allow dynamic time: true toggling.
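One way such labels could be produced is sketched below. It assumes a numeric x scale whose values are second offsets from a known base time, and that the axis accepts a `values` callback receiving the chart and the tick positions. `makeOffsetLabels` and its parameters are hypothetical names.

```javascript
// Hypothetical helper: builds an axis `values` callback for a numeric x scale
// whose values are second offsets from a known base time (epoch milliseconds).
function makeOffsetLabels(baseMs, decimals) {
  // "2020-01-31T06:15:22.560Z" -> "2020-01-31 06:15:22.560" (UTC)
  const baseLabel = new Date(baseMs).toISOString().replace("T", " ").replace("Z", "");
  // Assumed callback shape: (self, splits, ...); only splits is used here.
  return (self, splits) =>
    splits.map((v, i) => {
      const offset = "+" + v.toFixed(decimals);
      // First tick carries the base reference, the rest are bare offsets.
      return i === 0 ? baseLabel + " " + offset : offset;
    });
}

const labels = makeOffsetLabels(1580451322560, 9)(null, [0, 0.000000001, 0.000000002]);
// Would be wired up roughly as: axes: [{ values: makeOffsetLabels(baseMs, 9) }, {}]
```

The first label is much longer than the others, so in practice it may need extra padding or a separate row.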

if you have some sample data at nanosecond precision, you can attach it here so we can work out a PoC.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt

leeoniya commented 4 years ago

no matter how i twist my brain, i cannot imagine a scenario where the calendar date would be a relevant part of a chart (maybe besides the chart title) at scales smaller than 1ms. if you avoid counting nanoseconds since 1970, then js Number has more than enough precision to show everything as offsets from some more logical prior temporal marker. it's great that influxdb stores its time values in ns timestamps, but i cannot find a single ns (or even ms) resolution chart on the internet that even mentions a calendar date. i googled "nanosecond scale plot" in image search to no avail.
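The precision point can be checked directly. The sketch below uses an arbitrary sample timestamp and picks the start of the containing second as the base, which is just one possible "prior temporal marker".

```javascript
// Nanoseconds since 1970 exceed Number.MAX_SAFE_INTEGER (2^53 - 1), so adjacent
// nanosecond timestamps can collapse to the same double.
const a = 1580451322560123456n;
const b = a + 1n;
const collapsed = Number(a) === Number(b); // the 1ns step is lost for this pair

// Offsets from a nearby base (here: the start of the containing second,
// an arbitrary choice for illustration) are small integers and stay exact.
const NS_PER_S = 1000000000n;
const base = (a / NS_PER_S) * NS_PER_S;
const offA = Number(a - base);
const offB = Number(b - base);
```

The subtraction is done in BigInt before converting, so the offsets keep the full nanosecond step even though the raw timestamps do not.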

if the whole purpose of this exercise is simply to ingest raw influxdb data and avoid preprocessing, then this is not something i am interested in.

if you can show me a real case (screenshot) when you need to see calendar dates in the tick values or hover legend at ns resolution, then we can ponder further.

as mentioned previously, i'll see what can be done about toggling time: true on-the-fly without recreating the chart.

ryantxu commented 4 years ago

It seems the best approach for this might be something like - when nano seconds are needed:

I have no problem pre-processing... just want a consistent way to have labels on the time axis.

leeoniya commented 4 years ago

> mark the "start" time

what would this look like? similar to what's done now for ms?

 :51.400               :51.600               :51.800
9/8 2:38pm

so this?

    .123456789        .123456790        .123456791
2020-01-01 06:25:53am

          7.89e-7                     7.90e-7                     7.91e-7
2020-01-01 06:25:53.123456am

:53.123456789         7.90e-7        7.91e-7
01/01 6:25am

:53.123456789   .123456790   .123456791
01/01 6:25am

  .123456789      .123456790      .123456791
01/01 6:25:53am

there's a high risk that the starting label will be long enough to collide with something, or overlap an adjacent tick, or extend beyond the chart bounds.

a mockup of what you're imagining would be helpful.

ryantxu commented 4 years ago

> what would this look like? similar to what's done now for ms?

yes :)

Something like: [animated GIF: 31689749-4acb850a-b345-11e7-8cb4-7287345836cf]

I don't think it needs full date resolution -- just that the values are relative to the clock, not an arbitrary start point. If it crosses a second/min line ideally we can indicate that.

leeoniya commented 4 years ago

this demo doesn't tell me much :(

if those are seconds in your gif, then what you're showing already works and provides proper temporal breakpoints. the above demo has no reference ticks, so 39.100 can just be non-temporal numbers for all i know (from some 0 baseline). it looks like only the tooltip contains any kind of contextual assistance.

i'm interested in what you're expecting at nanosecond resolution. would the ticks be as above (39.123456789)? would uPlot's reference ticks go away, then?

ryantxu commented 4 years ago

> then what you're showing already works and provides proper temporal breakpoints.

how? that really is all we need. We can get full date resolution in a tooltip (outside uPlot)

> would uPlot's reference ticks go away, then?

is that the 2nd line of time at the bottom? I don't think it is necessary

leeoniya commented 4 years ago

i've pushed some changes that improve the maximum tick resolution on numeric x scales. you can now zoom to 12 decimal resolution (given smallish integers, of course). you can try zooming [1].

so my initial thought was to allow manually switching from time: true to time: false via setScale('x', {time: false, min: 0.1234, max: 0.1235}) or something. by disabling built-in zoom and relying on setSelect, it would be possible to fetch ns timestamps, strip the integer part and call setScale() & setData() with the decimal portion.

turned out there are a couple issues here. even if i were to pull out and re-initialize the time-dependent properties of series & axes along the x scale (search isTime in Line.js), there would be no ability to provide any custom numeric series.value and axis.values formatters - they would just have to fall back to uPlot's defaults, which would trigger immediate requests to customize those as well as part of the time: true/false switch. now we're talking essentially about a full chart re-init, or the ability to change any options dynamically at any time.

this would be a huge change with a lot of extra code and delicate complexity; it would be far more complex than simply initializing a second numeric chart (either eagerly or lazily) and using setSelect to switch visibility between the two based on the requested zoom range. uPlot is already the most memory efficient chart lib, so there should be no perf issues. yes, it's a bit of a hack, but i'm not prepared to uproot uPlot to accommodate this when a userland solution is both fast and simple.

i can make a jsfiddle sample of this if you'd like.

[1] https://leeoniya.github.io/uPlot/demos/axis-control.html

ryantxu commented 4 years ago

thank you -- the proposal makes sense, and for the cases where this is actually needed the "hack" seems totally reasonable.

I totally appreciate your effort to balance these edge cases against the need for a solid core.

> i can make a jsfiddle sample of this if you'd like.

yes please 🙏

leeoniya commented 4 years ago

so there's a bunch of imperfect mocking & data-gen going on in here, but i think it gets the idea across:

https://codepen.io/leeoniya/pen/eYmqGBy

the idea is basically once you intend to zoom to a range that will require sub-ms resolution ticks, you send the small-int/decimal part of the ns timestamps to the numeric plot and switch to it, otherwise send the ms-resolution float timestamps to the temporal plot and switch to it. the actual logic and switching conditions will depend heavily on the data itself and how it gets aggregated by the server, and when.

there's other stuff not finished there, like probably a missing setCursor() call on the switched-to plot, at the same coords as the switched-from plot.
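For anyone who can't open the pen, the routing decision described above might look roughly like this. It is a sketch of the logic only, not the code from the pen: `routeZoom`, the 1ms threshold, and the choice of base are assumptions, and the two charts and the server fetch are left out.

```javascript
// Sketch of the routing decision only; both charts and the data fetch are omitted.
// SUB_MS_RANGE_S is an assumed threshold; the real condition depends on the data.
const SUB_MS_RANGE_S = 0.001;

// nsTimestamps, minNs, maxNs: BigInt nanoseconds since the epoch.
// Returns which plot to show and the x values to hand to it.
function routeZoom(nsTimestamps, minNs, maxNs) {
  const rangeS = Number(maxNs - minNs) / 1e9;
  if (rangeS < SUB_MS_RANGE_S) {
    // numeric plot: second offsets from the whole second containing minNs
    const base = (minNs / 1000000000n) * 1000000000n;
    return { plot: "numeric", base, xs: nsTimestamps.map(t => Number(t - base) / 1e9) };
  }
  // temporal plot: float epoch seconds at ms resolution
  return { plot: "temporal", base: 0n, xs: nsTimestamps.map(t => Number(t / 1000000n) / 1e3) };
}

const ts = [1580451322560123456n, 1580451322560123556n];
const zoomedIn = routeZoom(ts, ts[0], ts[1]);
const zoomedOut = routeZoom(ts, 1580451322000000000n, 1580451332000000000n);
```

With two chart instances, the returned `xs` would presumably go to setData() on the chosen plot, with the other plot hidden.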