Closed milahu closed 3 years ago
> and repeat values with lower frequencies (annual data: same value for all 52 weeks) (please tell me im wrong ..)
you're wrong :)
you need to fill them with `null` and set `spanGaps: true` for those series.
there is now a utility function that can do this for you, called `uPlot.join()`. see how it's used in https://github.com/leeoniya/uPlot/blob/master/demos/path-gap-clip.html#L120
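To make the null-filling idea concrete, here is a minimal sketch (not uPlot's internals; `fillNulls`, `weeklyXs`, and `annualPairs` are illustrative names) of what the sparse series looks like once it's aligned onto the dense x axis:

```javascript
// Sketch: align a sparse (annual) series onto a dense (weekly) x axis
// by filling weeks that have no value with null. With spanGaps: true,
// uPlot connects the line across those nulls.
// All names here are illustrative, not part of uPlot's API.

function fillNulls(denseXs, sparsePairs) {
  // sparsePairs: array of [x, y]; build a Map for O(1) lookup
  const byX = new Map(sparsePairs);
  return denseXs.map(x => (byX.has(x) ? byX.get(x) : null));
}

const weeklyXs = [0, 1, 2, 3, 4, 5, 6, 7];  // dense x values
const annualPairs = [[0, 10], [4, 20]];     // sparse [x, y] samples

const annualYs = fillNulls(weeklyXs, annualPairs);
// -> [10, null, null, null, 20, null, null, null]

// the series config would then look roughly like:
// { label: "annual", stroke: "white", spanGaps: true }
```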
as i said ..
> possible workarounds: use nulls/gaps to encode missing values; show only every N-th circle; (others?) .. but these still require merging x values into one axis
this still *feels* like a non-ideal solution (or is my optimization premature?)
the feature would allow a space-time tradeoff
and as a side effect, allow plotting MISO functions f(d1, d2) -> d3
and: easily plot data with different x ranges
> there is now a utility function that can do this for you
sweet, but at runtime i want to do as little work as possible. all my data is precompiled/cached in an optimal format so i can dynamically add/remove data at little cost
one problem: we get more snap points. choice: snap to the nearest value (multiple x) or show the average value (one x)
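The "snap to nearest value" option could look like the following sketch (`nearestNonNullIdx` is a hypothetical helper, not part of uPlot's API): scan outward from the hovered index until a non-null value is found.

```javascript
// Sketch: given a series containing nulls and a hovered index, return
// the index of the nearest non-null value (or -1 if all values are null).
// Illustrative only; uPlot's nearest-non-null demo achieves this via a
// cursor hook rather than this exact function.
function nearestNonNullIdx(ys, idx) {
  if (ys[idx] != null) return idx;
  for (let d = 1; d < ys.length; d++) {
    if (idx - d >= 0 && ys[idx - d] != null) return idx - d;
    if (idx + d < ys.length && ys[idx + d] != null) return idx + d;
  }
  return -1;
}
```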
the complexity of doing anything else will be significant. the overhead does become significant for aligning many completely unaligned datasets that are several thousand points each, but in what i think are typical cases, i've tried to make the join function as efficient as possible.
i have an unlisted synthetic demo that allows you to assess the alignment cost here: https://github.com/leeoniya/uPlot/blob/master/demos/align-data.html. i don't expect real-world cases to be random()-levels of unaligned, so it's a good stress test.
i'd be interested to see your actual datasets and how much it costs to align them.
> the complexity of doing anything else will be significant.
im happy to help .. or do you mean SIGNIFICANT?
> i'd be interested to see your actual datasets and how much it costs to align them.
simple, im plotting global population data over the last 500 years (or more)
where datasets have different lengths and resolutions
workaround: use "now" as zero index and count back
(i assume plotting stops at x == data[s].length)
edit: this would make #107 easier to solve = plot SIMO fn
naah, only useful for MISO fns f(d1) -> (d2, d3, d4)
> im happy to help .. or do you mean SIGNIFICANT?
you're welcome to help. but, yes, i think it will be very substantial as the underlying data format assumptions permeate many parts of the internals. it's not gonna be as simple as tweaking just the pathbuilder, for example. the probability of not breaking many things to get this done is basically zero, imo.
the ultimate question is, what gains do you expect in non-artificial cases, and can you prove that this will work robustly and generally? if someone tells me that they need better perf on a 10M pts scatter dataset, the simple answer is that this is not the right library to use - at some point, this just becomes true. so, it's important to evaluate real use-cases, costs and possible gains from this effort.
> simple, im plotting global population data over the last 500 years (or more)
this is an arbitrary number. is that 500 datapoints? 500 * 52 datapoints? 500 * 365 * 24 * 3600 datapoints? as i said, i would be interested to see how `uPlot.join()` performs on your dataset, and its details.
> edit: this would make #107 easier to solve
scatter/bubble is much easier (not easy) to solve with a different underlying data structure (as described there), which is the plan. if you'd like to help with that, i think it will be a much better use of time :) i don't plan to work on that for a few more months due to other work, so cannot promise a timely PR review either, unfortunately.
gonna close this since i don't think there is anything actionable here. feel free to follow up in this thread if you have perf issues with a specific use-case/dataset.
Hey @leeoniya, sorry to push this topic up again. I came here explicitly for this.
My use case: I develop a coffee app (https://github.com/graphefruit/Beanconqueror) which connects to different bluetooth devices. Each bluetooth device sends its data with its own timestamps: a bluetooth scale e.g. 10 values per second, a pressure sensor 30 values per second, a temperature sensor maybe 5 values. So this results in different unix timestamps, but all shall be displayed on the same axis.
Actually I use Plotly, but if possible I'd like to switch because of issues - this is holding me back, though. Has there been any progress on this in the last years that I just didn't find?
Or do I need to update the old data which is already plotted, to insert the new data?
Thanks so far & have a great cup of coffee, Lars
> do I need to update the old data which is already plotted, to insert the data?
you will have to preprocess your data so all graphs have the same time resolution, in your case 30Hz
> A bluetooth scale e.g. 10 values per second, a pressure sensor 30 values per second, a temperature sensor maybe 5 values.
```
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP @ 30Hz
S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S  S @ 10Hz
T     T     T     T     T     T     T     T     T @  5Hz
```
to get 30Hz, repeat all S values 3 times, all T values 6 times
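The repeat-upsampling described above can be sketched like this (`repeatEach` and the sample values are illustrative, not part of uPlot):

```javascript
// Sketch: upsample a series to a higher rate by repeating each value.
// E.g. a 10Hz scale series repeated 3x lines up with a 30Hz series.
function repeatEach(ys, factor) {
  const out = [];
  for (const y of ys) {
    for (let i = 0; i < factor; i++) out.push(y);
  }
  return out;
}

const scale10Hz = [101, 102, 103];
const scale30Hz = repeatEach(scale10Hz, 3);
// -> [101, 101, 101, 102, 102, 102, 103, 103, 103]
```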
you can keep a separate data buffer for each device and use `uPlot.join()` just prior to calling `u.setData(joinedBuffers)`
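A self-contained sketch of that flow, assuming per-device `[xs, ys]` buffers (`mergeTables` is a simplified stand-in for `uPlot.join()`; device names, rates, and values are made up):

```javascript
// Sketch: keep one [xs, ys] buffer per device and merge them onto a
// shared, sorted x axis before each setData() call. Devices without a
// sample at a given x get null there (use spanGaps: true to bridge).
function mergeTables(tables) {
  // collect the union of all x values, sorted ascending
  const xSet = new Set();
  for (const [xs] of tables)
    for (const x of xs) xSet.add(x);
  const xs = [...xSet].sort((a, b) => a - b);

  const joined = [xs];
  for (const [tXs, tYs] of tables) {
    const byX = new Map(tXs.map((x, i) => [x, tYs[i]]));
    joined.push(xs.map(x => (byX.has(x) ? byX.get(x) : null)));
  }
  return joined;
}

const scaleBuf    = [[0, 100, 200], [50.1, 50.3, 50.6]]; // ~10Hz, ms
const pressureBuf = [[0, 33, 66],   [9.0, 9.1, 9.2]];    // ~30Hz, ms

const joined = mergeTables([scaleBuf, pressureBuf]);
// joined[0]: [0, 33, 66, 100, 200]        (merged x axis)
// joined[1]: [50.1, null, null, 50.3, 50.6]
// joined[2]: [9.0, 9.1, 9.2, null, null]

// then: u.setData(joined);
```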
Thanks for the fast responses!
> you can keep a separate data buffer for each device and use `uPlot.join()` just prior to calling `u.setData(joinedBuffers)`
Is there any sample I can quickly have a look at?
you can search the demos folder here for `uPlot.join`.
e.g. https://github.com/leeoniya/uPlot/blob/master/demos/nearest-non-null.html
assume we have multiple time series with different time precisions / resolutions / divisions: some series are value per year, others are value per month, others are value per week
currently we must preprocess our data to match the highest value frequency (here: value per week) and repeat values with lower frequencies (annual data: same value for all 52 weeks) (please tell me im wrong ..)
here is a plot of annual and monthly values:
what is ugly here are the circles in the white annual line - they are too many
disabling the circles completely is a bad solution
possible workarounds: use nulls/gaps to encode missing values; show only every N-th circle; (others?) .. but these still require merging x values into one axis
possible solution: currently, `data[0]` holds the x values, and all other `data[i]` hold y values. we could introduce a mapping between arrays, mapping x values to y values, or more generally, map input values to output values.
@leeoniya please share your thoughts so i can make a better PR : )
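To make that proposal concrete, here is an entirely hypothetical sketch of a per-series format (not uPlot's actual data structure, and the population numbers are made up): each series carries its own x array instead of sharing `data[0]`.

```javascript
// Sketch of the proposed format: every series owns its own x values,
// so series with different lengths/resolutions need no pre-alignment.
// This is a hypothetical structure, not uPlot's API.
const proposed = [
  { x: [2000, 2001, 2002],           y: [6.1, 6.2, 6.3]  }, // annual
  { x: [2000, 2000.5, 2001, 2001.5], y: [10, 11, 12, 13] }, // semiannual
];

// a renderer would iterate each series independently, e.g.:
function pointCount(seriesList) {
  return seriesList.reduce((n, s) => n + s.x.length, 0);
}
// pointCount(proposed) -> 7
```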