Closed · firmai closed this issue 9 months ago
Hi @firmai,
I totally agree that we could make a larger effort to describe more of the inner workings of plotly-resampler (in the online docs) and our default data aggregation parameters.
However, I suggest, if time permits, to skim these papers on time series downsample algorithms (implemented in tsdownsample, which is plotly-resampler's default aggregation library from plotly-resampler>=0.9.0; see the release notes).

Q: Why would one not just downsample and create a graphic versus using the resampler? Is the resampler doing it dynamically, so that every time you filter the dataset it provides the n preselected samples, thus maintaining high fidelity all the way down?
A: It is exactly what you think it is! :) plotly-resampler uses user-graph-interaction callbacks to resample (i.e., perform time series data aggregation) the interacted regions (see the figure below, from the plotly-resampler paper).
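To make the dynamic behaviour concrete, here is a minimal, hypothetical sketch of what such an interaction callback does conceptually: on every zoom/pan event, only the visible slice of the raw series is re-aggregated down to at most n_out points. The function name and signature are illustrative, not the library's API, and plain per-bucket min/max is used here rather than plotly-resampler's actual MinMaxLTTB aggregator.

```python
import bisect


def view_aggregate(x, y, x_min, x_max, n_out=1000):
    """Re-aggregate only the visible region [x_min, x_max] down to
    at most n_out points (simple per-bucket min/max sketch)."""
    # Slice the data to the current view (assumes x is sorted).
    lo = bisect.bisect_left(x, x_min)
    hi = bisect.bisect_right(x, x_max)
    xs, ys = x[lo:hi], y[lo:hi]
    if len(xs) <= n_out:
        return xs, ys  # zoomed in far enough: show the raw data
    # Keep the min and max of each bucket so peaks survive downsampling.
    keep = set()
    n_buckets = n_out // 2
    size = len(xs) / n_buckets
    for b in range(n_buckets):
        s = int(b * size)
        e = min(max(int((b + 1) * size), s + 1), len(xs))
        seg = range(s, e)
        keep.add(min(seg, key=lambda i: ys[i]))
        keep.add(max(seg, key=lambda i: ys[i]))
    idx = sorted(keep)
    return [xs[i] for i in idx], [ys[i] for i in idx]
```

Note how, once the visible slice holds fewer than n_out raw samples, the raw data itself is returned — this is the "high fidelity all the way down" behaviour asked about.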
Q: Also, if one doesn't select the number of samples, is there a good approximate default that kicks in for n? From the documentation it is hard for me to understand what the default behaviour is: how many samples are being selected, and with which algorithm? Is it LTTB?
A: Excellent question! We investigated these variables in this paper (https://arxiv.org/pdf/2304.00900.pdf). The default aggregator is MinMaxLTTB, which you can think of as a parallelizable variant of LTTB. The default number of selected samples per aggregation, $n_{out}$, is set to 1000. However, an optimal $n_{out}$ depends highly on (i) the browser zoom level, (ii) the graph canvas width, and (iii) the line width. As determining these parameters (dynamically, since browser parameters can change over time) requires a lot of back-end and front-end logic, we have not (yet) put an automatic $n_{out}$ mode on our roadmap.
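For intuition, here is a minimal pure-Python sketch of the LTTB idea (the algorithm that MinMaxLTTB accelerates by first preselecting min/max candidates in parallel): the first and last points are always kept, the remainder is split into buckets, and each bucket keeps the point forming the largest triangle with the previously kept point and the next bucket's average. This is an illustrative sketch, not tsdownsample's optimized implementation.

```python
def lttb(x, y, n_out):
    """Largest-Triangle-Three-Buckets: return the indices of n_out points
    that preserve the visual shape of the (x, y) series."""
    n = len(x)
    if n_out >= n or n_out < 3:
        return list(range(n))
    # Always keep the first and last point; bucket the rest.
    selected = [0]
    bucket_size = (n - 2) / (n_out - 2)
    a = 0  # index of the previously selected point
    for i in range(n_out - 2):
        # Bounds of the current bucket.
        start = int(i * bucket_size) + 1
        end = int((i + 1) * bucket_size) + 1
        # Average point of the *next* bucket (the last point for the final one).
        nstart = end
        nend = min(int((i + 2) * bucket_size) + 1, n)
        avg_x = sum(x[nstart:nend]) / (nend - nstart)
        avg_y = sum(y[nstart:nend]) / (nend - nstart)
        # Keep the point forming the largest triangle with the previously
        # selected point and the next bucket's average.
        best, best_area = start, -1.0
        for j in range(start, end):
            area = abs((x[a] - avg_x) * (y[j] - y[a])
                       - (x[a] - x[j]) * (avg_y - y[a]))
            if area > best_area:
                best_area, best = area, j
        selected.append(best)
        a = best
    selected.append(n - 1)
    return selected
```

A consequence of the triangle criterion is that isolated spikes tend to be kept, which is why LTTB-style output remains visually faithful at small $n_{out}$.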
Additionally, one should also bear in mind that increasing $n_{out}$ will increase the network payload size and the front-end (re)rendering time, which may affect interactivity snappiness.
(See this README for more accessible information w.r.t. visual representativeness: https://github.com/predict-idlab/ts-datapoint-selection-vis/blob/main/details/vis_representativity.md)
I hope this clarifies some stuff! And of course, thank you for taking a great interest in our research/software!
Kind regards, Jonas
Just looking at the two projects https://github.com/predict-idlab/tsdownsample/ and https://github.com/predict-idlab/plotly-resampler/
The documentation doesn't make the following very clear: why would one not just downsample and create a graphic versus using the resampler? Is the resampler doing it dynamically, so that every time you filter the dataset it provides the n preselected samples, thus maintaining high fidelity all the way down?
Also, if one doesn't select the number of samples, is there a good approximate default that kicks in for n? From the documentation it is hard for me to understand what the default behaviour is: how many samples are being selected, and with which algorithm? Is it LTTB?
And thanks for the software!
Best, Derek