predict-idlab / plotly-resampler

Visualize large time series data with plotly.py
https://predict-idlab.github.io/plotly-resampler/latest
MIT License

Difference between tsdownsample and resampler #247

Closed firmai closed 9 months ago

firmai commented 1 year ago

Just looking at the two projects https://github.com/predict-idlab/tsdownsample/ and https://github.com/predict-idlab/plotly-resampler/

The documentation doesn't make the following very clear: why would one not just downsample once and create a static graphic, rather than use the resampler? Is the resampler doing it dynamically, so that every time you filter the dataset it provides the n preselected samples, thus maintaining high fidelity all the way down?

Also, if one doesn't select the number of samples, is there a good approximate default that kicks in for n? From the documentation it is hard for me to understand the default behaviour: how many samples are selected, and by which algorithm? Is it LTTB?

And thanks for the software!

Best, Derek

jonasvdd commented 1 year ago

Hi @firmai,

I totally agree that we could put more effort into describing the inner workings of plotly-resampler (in the online docs) and our default data aggregation parameters.

In the meantime, if time permits, I suggest skimming these papers:

Q: why would one not just downsample once and create a static graphic, rather than use the resampler? Is the resampler doing it dynamically, so that every time you filter the dataset it provides the n preselected samples, thus maintaining high fidelity all the way down?

A: It is exactly what you think it is! :) plotly-resampler uses user-graph-interaction callbacks to resample (i.e., perform time series data aggregation on) the interacted regions. (see the attached figure from the plotly-resampler paper)
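To make the dynamic part concrete, here is a minimal pure-Python sketch of the control flow behind such a callback (the names `on_zoom` and `minmax_aggregate` are illustrative, not plotly-resampler's actual API): on every zoom/relayout event, only the visible slice of the raw data is re-aggregated down to $n_{out}$ points, so fidelity increases as you zoom in.

```python
def minmax_aggregate(y, n_out):
    """Keep the min and max of each bin -- a stand-in for the real aggregators."""
    n_bins = max(n_out // 2, 1)
    bin_size = max(len(y) // n_bins, 1)
    out = []
    for i in range(0, len(y), bin_size):
        chunk = y[i:i + bin_size]
        out.append(min(chunk))
        out.append(max(chunk))
    return out[:n_out]

def on_zoom(y, view_start, view_end, n_out=1000):
    """Called on every relayout event: slice the raw data to the visible
    range, then aggregate just that slice down to n_out points."""
    visible = y[view_start:view_end]
    if len(visible) <= n_out:  # zoomed in far enough: show the raw data
        return visible
    return minmax_aggregate(visible, n_out)

raw = [i % 97 for i in range(1_000_000)]
full_view = on_zoom(raw, 0, len(raw))    # coarse overview: 1000 points
zoomed = on_zoom(raw, 500_000, 500_500)  # zoomed in: the raw 500 points
```

In the real library this slicing-then-aggregating happens server-side in a Dash callback, and the aggregation itself is delegated to the optimized tsdownsample routines; the sketch only shows the control flow.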

Q: Also, if one doesn't select the number of samples, is there a good approximate default that kicks in for n? From the documentation it is hard to understand the default behaviour: how many samples are selected, and by which algorithm? Is it LTTB?

A: excellent question! We investigated these variables in this paper (https://arxiv.org/pdf/2304.00900.pdf). The default aggregator is MinMaxLTTB; you can think of it as a parallelizable variant of LTTB. The default number of selected samples for each aggregation, $n_{out}$, is set to 1000. However, an optimal $n_{out}$ highly depends on (i) browser zoom level, (ii) graph canvas width, and (iii) line width. As determining these parameters (dynamically, since browser parameters can change over time) requires a lot of back-end and front-end logic, we have not (yet) put an automatic $n_{out}$ mode on our roadmap.
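For intuition on why MinMaxLTTB parallelizes well, here is a toy pure-Python sketch of the two-step idea (an illustration of the algorithm from the paper, not tsdownsample's optimized implementation; the `ratio=4` preselection factor mirrors the paper's default): first preselect min/max extrema per bin, which is embarrassingly parallel across bins, then run plain LTTB only on those few candidates.

```python
import math

def lttb(x, y, n_out):
    """Plain Largest-Triangle-Three-Buckets: returns the indices of the
    n_out selected points. The first and last points are always kept."""
    n = len(x)
    if n_out >= n:
        return list(range(n))
    selected = [0]
    bucket = (n - 2) / (n_out - 2)
    a = 0  # index of the previously selected point
    for i in range(n_out - 2):
        start = int(i * bucket) + 1
        end = int((i + 1) * bucket) + 1
        # average point of the *next* bucket (used as the third triangle vertex)
        nstart, nend = end, min(int((i + 2) * bucket) + 1, n)
        avg_x = sum(x[nstart:nend]) / (nend - nstart)
        avg_y = sum(y[nstart:nend]) / (nend - nstart)
        # pick the point in the current bucket that forms the largest triangle
        best, best_area = start, -1.0
        for j in range(start, end):
            area = abs((x[a] - avg_x) * (y[j] - y[a])
                       - (x[a] - x[j]) * (avg_y - y[a]))
            if area > best_area:
                best_area, best = area, j
        selected.append(best)
        a = best
    selected.append(n - 1)
    return selected

def minmax_lttb(x, y, n_out, ratio=4):
    """MinMaxLTTB sketch: preselect ~ratio * n_out min/max candidates per bin
    (each bin is independent, hence parallelizable), then LTTB the candidates."""
    n = len(x)
    if n <= ratio * n_out:
        return lttb(x, y, n_out)
    n_bins = ratio * n_out // 2
    bin_size = n // n_bins
    cand = {0, n - 1}
    for s in range(0, n, bin_size):
        chunk = range(s, min(s + bin_size, n))
        cand.add(min(chunk, key=lambda j: y[j]))
        cand.add(max(chunk, key=lambda j: y[j]))
    cand = sorted(cand)
    cx = [x[j] for j in cand]
    cy = [y[j] for j in cand]
    return [cand[j] for j in lttb(cx, cy, n_out)]

x = list(range(100_000))
y = [math.sin(v / 250.0) for v in x]
idx = minmax_lttb(x, y, n_out=1000)  # 1000 indices into the raw series
```

The min/max preselection guarantees that vertical extrema survive, while the LTTB pass on the small candidate set keeps the visual shape of the line.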

Additionally, one should also bear in mind that increasing $n_{out}$ will increase the network payload size and the front-end (re)rendering time, which may affect interactivity snappiness.
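As a rough back-of-the-envelope example of that payload scaling (assuming uncompressed float64 x and y values; actual payloads depend on serialization and compression):

```python
# The network payload grows linearly with n_out:
# each shown sample carries one x and one y value.
n_out = 1000
bytes_per_sample = 8 + 8  # float64 x + float64 y, uncompressed
payload_bytes = n_out * bytes_per_sample
print(payload_bytes)  # 16000 -> roughly 16 kB per trace per update
```

Doubling $n_{out}$ therefore doubles both the bytes sent to the browser and the number of points plotly has to (re)render.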

(see this README for more accessible info w.r.t. visual representativeness: https://github.com/predict-idlab/ts-datapoint-selection-vis/blob/main/details/vis_representativity.md)

I hope this clarifies some stuff! And of course, thank you for taking a great interest in our research/software!

Kind regards, Jonas