InfluxDB's default retention policy keeps data forever in 7-day shards, in raw format (data points every 10 or 30 seconds, depending on your input configuration). This is a good default, but once you have accumulated data and want to introduce downsampling without losing any of it, setting up all the queries by hand is a lot of work.
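Under the hood, downsampling in InfluxDB 1.x is built from extra databases/retention policies plus continuous queries. A minimal hand-rolled sketch of the kind of objects this role automates (the database, policy and duration names below are made up for illustration; the role derives the real ones from your configuration):

```sql
-- hypothetical target database holding 14 days of aggregated data
CREATE DATABASE "telegraf_14d"

-- retention policy on it: keep 14 days, in 1-day shards
CREATE RETENTION POLICY "rp_14d" ON "telegraf_14d"
  DURATION 14d REPLICATION 1 SHARD DURATION 1d DEFAULT
```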
So ... I have done this for you!
Two usage scenarios:
Honestly, the two use cases are not much different. The biggest difference is the time it takes to run through the playbook when backfilling is enabled. Of course, if you work on existing data, don't forget to make a proper backup first!
See the result in action:
This requires setting up multiple datasources in Grafana (one for each aggregation level) and a "Datasource" dashboard variable, like so:
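One way to get those datasources into Grafana is via datasource provisioning. A sketch, assuming two aggregation levels; all names, URLs and database names below are examples, not something this role creates:

```yaml
# Grafana datasource provisioning sketch: one datasource per aggregation level.
# Adjust names, url and database to your own setup.
apiVersion: 1
datasources:
  - name: influx-raw
    type: influxdb
    url: http://localhost:8086
    database: telegraf
  - name: influx-14d
    type: influxdb
    url: http://localhost:8086
    database: telegraf_14d
```

In the dashboard, add a variable of type "Datasource" (restricted to InfluxDB) and use it as the datasource of each panel, so you can switch between aggregation levels with one dropdown.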
As preparation you don't need much, except knowing exactly how you want to downsample your data, as you need to set up your configuration first. Once configured, you can run each level separately and/or repeatedly without any issue, as the role only creates/adds what is missing (except CQs, which are always recreated, as there is no way to alter them). This way you can iterate towards a final setup and add compaction as the last step, once you're happy with the result.
The easiest setup is to create a role in your own repository and add this:
roles/influx-setup/tasks/main.yml
```yaml
- name: "Include definition from influxdb_{{ vars_name }}.yml"
  include_vars: "influxdb_{{ vars_name }}.yml"
  when: vars_name is defined

# you can add your own tasks here that will be executed prior to the `ansible_influx_downsampling` role
```
roles/influx-setup/vars/influxdb_frank.yml
--> take one of the files from the examples/ directory as a base for your own configuration
Now in your playbook, include both roles:
influx-setup.yml
```yaml
- name: InfluxDB
  hosts: localhost
  roles:
    - { role: influx-setup, vars_name: "frank" }
    - { role: DrPsychick.ansible_influxdb_downsampling }
```
Now for the trickiest part. You need to know which aggregation levels you want and define them. Moreover, especially if you don't use standard Telegraf input plugins, you need to define the aggregation queries for your measurements.
InfluxDB has an awesome feature for dynamic aggregation (`SELECT MEAN(*) FROM :MEASUREMENT GROUP BY *`), but it automatically prepends `mean_` to every field in the measurement, which means you cannot reuse the dashboards built for your raw data.
The solution is simple, but requires work. You have to alias each output column back to the original column name, independent of the aggregation method you use.
Example:
```sql
SELECT MEAN(usage_user) FROM cpu;                                            # on RAW
SELECT MEAN(usage_user) AS usage_user INTO telegraf_14d.rp_14d.cpu FROM cpu; # in aggregate queries
```
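For reference, the role wraps such an aggregation into a continuous query. A hand-written equivalent might look like this (the CQ name, interval, and database/retention policy names are illustrative, not what the role generates):

```sql
-- hypothetical CQ: every 5 minutes, roll raw cpu data up into the 14d database
CREATE CONTINUOUS QUERY "cq_cpu_14d" ON "telegraf"
BEGIN
  SELECT MEAN(usage_user) AS usage_user
  INTO "telegraf_14d"."rp_14d"."cpu"
  FROM "cpu"
  GROUP BY time(5m), *
END
```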
The good thing about defining the aggregation method per column is that you can mix methods and choose what is best for each individual column. And once a query is defined, it can be reused for all aggregation levels. Additionally, you can choose to add more aggregations (min, mean and max) and build a dashboard that includes them.
```sql
SELECT LAST(string_value) AS string_value # for strings
SELECT MAX(users) AS users                # for metrics where you're interested in the MAX
...
```
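Combining these, a single query can keep several aggregates of one field under distinct names (field and target names here are illustrative):

```sql
-- keep min, mean and max of the same field in one downsampled measurement
SELECT MIN(usage_user)  AS usage_user_min
     , MEAN(usage_user) AS usage_user
     , MAX(usage_user)  AS usage_user_max
INTO telegraf_14d.rp_14d.cpu FROM cpu GROUP BY time(5m), *
```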
The configuration goes into your vars file, or you can choose to set up global vars for it.
If you use generic Telegraf input plugins or other typical sources, please add your queries to `defaults/main.yml` and send me a pull request, so others can profit from them too.
Query definition or override:
```yaml
my_ansible_influx_queries:
  # override default settings, use German performance counter names
  win_cpu: >
    SELECT mean("Benutzerzeit_(Percent)") AS "Benutzerzeit_(Percent)"
    , mean("DPC-Zeit_(Percent)") AS "DPC-Zeit_(Percent)"
    , mean("Interruptzeit_(Percent)") AS "Interruptzeit_(Percent)"
    , mean("Leerlaufzeit_(Percent)") AS "Leerlaufzeit_(Percent)"
    , mean("Privilegierte_Zeit_(Percent)") AS "Privilegierte_Zeit_(Percent)"
    , mean("Prozessorzeit_(Percent)") AS "Prozessorzeit_(Percent)"
```
A complete setup can be found in examples/full-5level-backfill-compact/
Set `recreate_cqs=yes` and all CQs will be recreated; the role then runs through all measurements and all levels of compaction (not touching existing data).

If you enable backfill:
If you enable backfill and compact:
Results of my full setup can be found in examples/full-5level-backfill-compact/
Future Version:
Version 0.3: Complete incl. automatic compaction, tests and good examples.
Version 0.2: Fully working and tested. No deleting of data. Stats + CQ update.
Version 0.1: