DrPsychick / ansible_influx_downsampling

Ansible role to setup downsampling with continuous queries on influxDB, optionally with backfilling and compaction of existing raw data.
GNU General Public License v3.0
17 stars 4 forks source link

Idea: Dynamically generate CQ/BF SELECT queries based on source database #14

Open tonyduckles opened 3 years ago

tonyduckles commented 3 years ago

[First, thanks writing this code/project! I had a similar use-case and problem: using Telegraf+InfluxDB stack to do monitoring and wanted to downsample data for different retention lengths. This project took care of most of the downsampling details for me, because you had already worked-out all the details. Cheers!]

For my Telegraf monitoring setup, I had a few more Telegraf plugins enabled than what you had pre-configured in the defaults/main.yml >> ansible_influx_queries list. I found it a bit cumbersome to have to manually build the mapping table of measurements -> SELECT query syntax.

After reading through the code, I had the idea that it might be neat to auto-generate the SELECT queries by dynamically building the list of measurements from the source Influx database. For example, there's already code in tasks/influxdb_measurement.yml to "Get fields from SOURCE". Based on the data-type of each measurement, we can dynamically generate the SELECT query. Example: for numeric-type fields, default to the mean aggregation-function; for string-type or boolean-type fields default to the mode aggregration-function. And then defaults/main.yml could have a sparsely-populated table of "override" agg-functions for specific measurements -- e.g. some of the measurements are counters and we want to use the max agg-function.

I worked up the above idea and got it working for my own personal use-case. Does this idea seem interesting to you (or anyone else monitoring this repository)? If so then I can work-up a pull-request and send that your way.

Cheers!

DrPsychick commented 3 years ago

Hey @tonyduckles , I'm really glad this project helped you! I've put in a lot of effort and was hoping it could be of use for others. So thanks for giving me that feedback!

I like the idea very much, also because that would potentially also allow to not miss some metrics in aggregation.

I'd love to get a PR and test/review it! I think the only requirement would be that manually added queries take precedence and/or the feature can be optionally enabled.

I'm not sure what a good default would be for this feature.

Thank you for your input!

tonyduckles commented 2 years ago

[Sorry, this fell over my radar. Swinging back to this now!]

FYI, I pushed my assorted work-in-progress changes to a "develop" branch in a fork of your repo: tonyduckles/develop.

There are some assorted (unrelated) changes lurking in that "develop" branch, but the relevant changes for this enhancement-idea-thread are all in this commit: Dynamically generate CQ/BF SELECT queries based on source database.

Some notes/thoughts:

Take a look through the commit (at your leisure). Let me know if you have any thoughts/comments/feedback. If this all seems reasonable, then I can spend some more time polishing up the code and submit a PR.

DrPsychick commented 2 years ago

looks good to me. You're doing some crazy magic in mm_select_fields 😉

I'd love to have that covered with a test if we manage to do that. That is: a) generated aggregation works b) overrides work (backwards compatibility).

Poke me: slack at drsick.net, if you want to join and chat there.