feature: multiple x axis to combine different time precisions

milahu commented 3 years ago

assume we have multiple time series with different time precisions / resolutions / divisions: some series are value per year, others are value per month, others are value per week

currently we must preprocess our data to match the highest value frequency (here: value per week) and repeat values with lower frequencies (annual data: same value for all 52 weeks) (please tell me im wrong ..)

here is a plot of annual and monthly values:

what is ugly here are the circles on the white annual line - there are too many of them

disabling the circles completely is a bad solution

possible workarounds:

- use nulls/gaps to encode missing values
- show only every N-th circle
- (others?)

.. but these still require merging the x values into one axis

possible solution: currently, `data[0]` holds the x values, and all other `data[i]` hold y values. we could introduce a mapping between arrays, mapping x values to y values, or more generally, mapping input values to output values

the default mapping would be

```js
datamap: [
  [0, 1], // f(d0) -> d1
  [0, 2], // f(d0) -> d2
  [0, 3], // f(d0) -> d3
  [0, 4], // f(d0) -> d4
  // ....
]
```

to combine annual and weekly x values, we could then use

```js
datamap: [
  [0, 1], // f(d0) -> d1
  [2, 3], // f(d2) -> d3
]
```

or plot functions with multiple inputs

```js
datamap: [
  [0, 1, 2], // f(d0, d1) -> d2
  [3, 4], // f(d3) -> d4
]
```

or we extend `opt.series[i]` like

```js
series: [
  {
    label: "T year",
    axis: 0, // x axis
  },
  {
    label: "T month",
    axis: 0, // x axis
  },
  {
    label: "N year",
    axis: 1, // y axis
    input: 0,
    // this is an output/value series
    // with series 0 (T year) as single input/key
  },
  {
    label: "N month",
    axis: 1, // y axis
    input: 1,
    // this is an output/value series
    // with series 1 (T month) as single input/key
  },
]
```

in the future we might need 3D plotting and MISO functions (multi input, single output) (not sure if MIMO makes much sense)
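To make the proposal a bit more concrete, here is a minimal sketch of how a single-input `datamap` could be resolved into one `[xs, ys]` table per series. Everything in it is hypothetical: `datamap` is only the option proposed in this comment and does not exist in uPlot, and `tablesFromDatamap` is a made-up helper name. Multi-input (MISO) entries are rejected rather than guessed at.

```js
// Hypothetical helper for the proposed (non-existent) `datamap` option.
// data:    array of plain value arrays, e.g. [tYear, nYear, tMonth, nMonth]
// datamap: array of [inputIndex, outputIndex] pairs
// returns: one [xs, ys] table per pair
function tablesFromDatamap(data, datamap) {
  return datamap.map((entry) => {
    if (entry.length !== 2) {
      // entries like [0, 1, 2] (two inputs, one output) are out of scope here
      throw new Error("only single-input entries are supported in this sketch");
    }
    const [xIdx, yIdx] = entry;
    if (data[xIdx].length !== data[yIdx].length) {
      throw new Error("input and output arrays must have the same length");
    }
    return [data[xIdx], data[yIdx]];
  });
}

// annual series in data[0]/data[1], monthly series in data[2]/data[3]
const exampleTables = tablesFromDatamap(
  [[2000, 2001], [10, 20], [2000.0, 2000.083, 2000.167], [1, 2, 3]],
  [[0, 1], [2, 3]],
);
```

Each resulting table keeps its own x values, which is the point of the proposal: nothing is merged into a shared axis until (and unless) the renderer needs it.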

@leeoniya please share your thoughts so i can make a better PR : )

leeoniya commented 3 years ago

> and repeat values with lower frequencies (annual data: same value for all 52 weeks) (please tell me im wrong ..)

you're wrong :)

you need to fill them with `null` and set `spanGaps: true` for those series.

there is now a utility function that can do this for you called `uPlot.join()`. see how it's used in https://github.com/leeoniya/uPlot/blob/master/demos/path-gap-clip.html#L120
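As an illustration of the null-filled layout described here, the following sketch aligns two tables with different x values by hand. It is not uPlot's implementation of `uPlot.join()`: `joinTables` is a hypothetical stand-in that assumes each table is `[xs, ys]` with unique x values, and it ignores any performance concerns.

```js
// Hypothetical stand-in, NOT the code behind uPlot.join().
// tables: array of [xs, ys] pairs; returns [mergedXs, ys1, ys2, ...]
function joinTables(tables) {
  // union of all x values, sorted ascending
  const xSet = new Set();
  for (const [xs] of tables) {
    for (const x of xs) xSet.add(x);
  }
  const mergedXs = [...xSet].sort((a, b) => a - b);

  // position of every x value on the merged axis
  const indexOfX = new Map(mergedXs.map((x, i) => [x, i]));

  const out = [mergedXs];
  for (const [xs, ys] of tables) {
    // null marks "this series has no sample at this x"
    const aligned = new Array(mergedXs.length).fill(null);
    xs.forEach((x, i) => {
      aligned[indexOfX.get(x)] = ys[i];
    });
    out.push(aligned);
  }
  return out;
}

// annual values at months 0 and 12, monthly values at months 0, 1, 2
const joined = joinTables([
  [[0, 12], [10, 20]],
  [[0, 1, 2], [1, 2, 3]],
]);

// series options so that lines are drawn across the nulls
const seriesOpts = [{}, { spanGaps: true }, { spanGaps: true }];
```

Here `joined[0]` is the shared x axis and the remaining arrays are the null-padded y values, so the annual series contributes only two real points instead of one repeated value per month.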

milahu commented 3 years ago

as i said ..

> possible workarounds: use nulls/gaps to encode missing values show only every N-th circle (others?) .. but these still require to merge x values into one axis

this still *feels* like a non-ideal solution (or is my optimization premature?)

the feature would allow a space-time tradeoff and, as a side effect, allow plotting MISO functions f(d1, d2) -> d3. and: it would make it easy to plot data with different x ranges

> there is now a utility function that can do this for you

sweet, but at runtime, i want to do as little work as possible. all my data is precompiled/cached to an optimal format, so i can dynamically add/remove data at little cost

one problem: we get more snap points. choice: snap to nearest value (multiple x) or show average value (one x)

leeoniya commented 3 years ago

the complexity of doing anything else will be significant. the overhead does become significant for aligning many completely unaligned datasets that are several thousand points each, but in what i think are typical cases, i've tried to make the join function as efficient as possible.

i have an unlisted synthetic demo that allows you to assess the alignment cost here: https://github.com/leeoniya/uPlot/blob/master/demos/align-data.html. i don't expect real-world cases to be random()-levels of unaligned, so it's a good stress test.

i'd be interested to see your actual datasets and how much it costs to align them.

milahu commented 3 years ago

> the complexity of doing anything else will be significant.

im happy to help .. or do you mean SIGNIFICANT?

> i'd be interested to see your actual datasets and how much it costs to align them.

simple, im plotting global population data over the last 500 years (or more), where datasets have different lengths and resolutions. workaround: use "now" as zero index and count back (i assume plotting stops at x == data[s].length)

~~edit: this would make #107 easier to solve = plot SIMO fn f(d1) -> (d2, d3, d4)~~ naah, only useful for MISO fns

leeoniya commented 3 years ago

> im happy to help .. or do you mean SIGNIFICANT?

you're welcome to help. but, yes, i think it will be very substantial as the underlying data format assumptions permeate many parts of the internals. it's not gonna be as simple as tweaking just the pathbuilder, for example. the probability of not breaking many things to get this done is basically zero, imo.

the ultimate question is, what gains do you expect in non-artificial cases, and can you prove that this will work robustly and generally. if someone tells me that they need better perf on a 10M pts scatter dataset, the simple answer is that this is not the right library to use - at some point, this just becomes true. so, it's important to evaluate real use-cases, costs and possible gains from this effort.

> simple, im plotting global population data over the last 500 years (or more)

this is an arbitrary number. is that 500 datapoints? `500 * 52` datapoints? `500 * 365 * 24 * 3600` datapoints? as i said, i would be interested to see how `uPlot.join()` performs on your dataset, and its details.

> edit: this would make #107 easier to solve

scatter/bubble is much easier (not easy) to solve with a different underlying data structure (as described there), which is the plan. if you'd like to help with that, i think it will be a much better use of time :) i don't plan to work on that for a few more months due to other work, so cannot promise a timely PR review either, unfortunately.

leeoniya commented 3 years ago

gonna close this since i don't think there is anything actionable here. feel free to follow up in this thread if you have perf issues with a specific use-case/dataset.

graphefruit commented 2 weeks ago

Hey @leeoniya, sorry to bring this topic up again. I came here explicitly for this.

My use case: I develop a coffee app (https://github.com/graphefruit/Beanconqueror) which connects to different Bluetooth devices. Each Bluetooth device sends its data with its own timestamps: a Bluetooth scale sends e.g. 10 values per second, a pressure sensor 30 values per second, a temperature sensor maybe 5 values. So the samples arrive at different Unix timestamps, but they should all be displayed on the same axis.

Currently I use Plotly, and I'd like to switch if possible because of some issues, but this is holding me back. Has this been addressed in the last few years and I just didn't find it?

Or do I need to update the old data that is already plotted in order to insert the new data?

Thanks so far & have a great cup of coffee,
Lars

milahu commented 2 weeks ago

> do I need to update the old datas which are already plotted, to insert the data?

you will have to preprocess your data, so all graphs have the same time resolution, in your case 30Hz

> A bluetooth scale e.g. 10 values per second, a pressure sensor 30 values per second, a temperature sensor maybe 5 values.

PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP @ 30Hz
S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S @ 10Hz
T     T     T     T     T     T     T     T     T @  5Hz

to get 30Hz, repeat all S values 3 times, all T values 6 times
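A minimal sketch of that repetition step, assuming the sample rates are exact integer multiples of each other and that all sensors start at the same instant (real Bluetooth timestamps usually will not line up this neatly). `repeatEach` is a hypothetical helper name.

```js
// Repeat every value n times, e.g. 10Hz -> 30Hz with n = 3.
// Assumes the target rate is an exact integer multiple of the source rate.
function repeatEach(values, n) {
  return values.flatMap((v) => new Array(n).fill(v));
}

// one fifth of a second of made-up readings
const pressure30 = [9.0, 9.1, 9.0, 9.2, 9.1, 9.3]; // already at 30Hz
const scale30 = repeatEach([18.2, 18.4], 3);       // 10Hz -> 30Hz
const temp30 = repeatEach([92.5], 6);              //  5Hz -> 30Hz
```

All three arrays now have six entries and can share one 30Hz x axis, at the cost of storing each slow reading several times.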

leeoniya commented 2 weeks ago

you can keep a separate data buffer for each device and use uPlot.join() just prior to calling u.setData(joinedBuffers)
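One way this could look, as a sketch rather than a recommended design: each device appends to its own `[xs, ys]` buffer, and the buffers are only aligned when the chart is updated. `createDeviceBuffers` and the device names are hypothetical. The join function is passed in so the snippet runs without uPlot loaded; in the browser you would pass `uPlot.join`, which takes an array of such tables.

```js
// Hypothetical helper: one [xs, ys] buffer per device, joined on demand.
// `join` is injected; pass uPlot.join when uPlot is available.
function createDeviceBuffers(join) {
  const buffers = new Map(); // device name -> [xs, ys]

  return {
    // append one sample; timestamps per device are assumed to be ascending
    push(device, timestamp, value) {
      if (!buffers.has(device)) buffers.set(device, [[], []]);
      const [xs, ys] = buffers.get(device);
      xs.push(timestamp);
      ys.push(value);
    },
    // align all buffers; series order follows the order of first push
    joined() {
      return join([...buffers.values()]);
    },
  };
}

// usage, assuming `u` is an existing uPlot instance:
// const bufs = createDeviceBuffers(uPlot.join);
// bufs.push("scale", 1700000000.0, 18.2);
// bufs.push("pressure", 1700000000.03, 9.1);
// u.setData(bufs.joined());
```

Joining on every incoming sample may be more work than needed at 30 values per second, so batching updates (for example once per animation frame) is probably worth considering.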

graphefruit commented 2 weeks ago

Thanks for the fast responses!

> you can keep a separate data buffer for each device and use uPlot.join() just prior to calling u.setData(joinedBuffers)

Is there any sample I can quickly have a look at?

leeoniya commented 2 weeks ago

you can search the demos folder here for `uPlot.join`.

e.g. https://github.com/leeoniya/uPlot/blob/master/demos/nearest-non-null.html

leeoniya / uPlot

feature: multiple x axis to combine different time precisions #405