Closed gefaila closed 2 years ago
Note that the RDP algorithm is iterative and so it's unlikely that it can be written in Flux.
@gefaila Thanks for this issue, I took a quick look at the algorithm and the request makes a lot of sense. This will be possible to implement in Flux and will be most efficient if implemented in Flux as opposed to doing so in the UI. I'll move this issue to the flux repo and present it to the team.
Thanks @nathanielc , I'm really excited about this because it's crucial to a whole level of functionality where we integrate our InfluxDB with a dashboard that is very limited on the number of points it can display.
Do you have an ETA, weeks?, months? years? Q2 2022? :-)
Most likely months. We plan to start on this in about a month and then I expect it will take a few weeks to actually build. Obviously this timeline is subject to change, but this is the current plan.
Another request for something similar using this algorithm has come in https://pkg.go.dev/github.com/dgryski/go-lttb. Maybe we can investigate both? See https://github.com/influxdata/flux/issues/4787
Another request for something similar using this algorithm has come in https://pkg.go.dev/github.com/dgryski/go-lttb. Maybe we can investigate both? See #4787
I think customers would need both because they excel in slightly different ways:
Clearly, algorithms that can see all the data before compressing are going to have some advantages over algorithms that only know current and historical data. Our choice was RDP. Swinging door wouldn't do as good a job on retaining the actual features in the data we value.
The RDP
algorithm has been implemented see https://github.com/influxdata/flux/pull/5078 Closing this issue, the Largest Triangle Three Buckets
algo issue will remain open and we can add it alongside RDP once its done.
This is great ! I have been looking forward to having this for a while! Excellent. Thanks for taking a look and implementing it!
@gefaila Glad to hear it, let us know how it works out for you. I have been using it personally to great success as well.
Proposal: Provide a function to implement the RDP algorithm found here or the advanced derivatives listed here
Current behavior: The only way to reduce data points are with the standard aggregate functions (min, max, mean, mode, first, last). There are a number of significant problems for reducing timeseries data with these algorithms
Desired behavior:
Alternatives considered: all existing aggregate functions in InfluxDB
Use case: Any industrial IoT use case where data compression is as important as NOT losing key features of the data.