Icinga / icingaweb2-module-vspheredb

The easiest way to monitor a VMware vSphere environment.
https://icinga.com/docs/vsphere/latest
GNU General Public License v2.0

Add data retention period into server configuration #139

Open waja opened 4 years ago

waja commented 4 years ago

Expected Behavior

There should be a configurable data retention period for each vCenter. In most cases there is no need to keep the data indefinitely.

Current Behavior

Currently the vsphere module just keeps collecting data. With many events and/or vCenter systems, the database grows quickly. While it is nice to be able to look at the events, there is no need to keep them for very long for monitoring purposes. If anybody needs that information later, it can be extracted from the vCenter logs.

Possible Solution

I would suggest implementing a configurable data retention period per vCenter via a configuration option. Since we already have a background process running, it would be great to trigger a garbage collector at least once a day that deletes data older than the configured retention period.

Your Environment

necarnot commented 7 months ago

Hello,

Having used vspheredb for more than two years (FTW), I second this issue because I see my DB growing, mostly due to the event history, of which we keep far more than we need:

ncdu 1.15.1 ~ Use the arrow keys to navigate, press ? for help
--- /var/lib/mysql/vspheredb -------------------------
    2,4 GiB [##########]  vm_event_history.ibd
   36,0 MiB [          ]  alarm_history.ibd
   10,0 MiB [          ]  monitoring_rule_problem_history.ibd
    9,0 MiB [          ]  vm_hardware.ibd

Does anybody know how we can purge this event history? Does this require SQL queries or is there a way through the web GUI?

wp-perc commented 1 week ago

This is becoming kind of a priority for us and for all our customers. We would like to contribute with the necessary code.

The basic idea is to introduce a dedicated process/thread that will work in parallel, removing old data.

We are currently at schema version 58 (and 59 doesn't introduce anything of concern for this task). Here is a proposal:

  1. extend table vcenter_server with two more columns, events_retention and alarms_retention, to specify the retention in days for each type of object to delete
  2. update the web interface to allow setting these two values; both of them should be optional
  3. introduce a separate thread/process that will perform the cleanup
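
To make the "optional" part of the proposal concrete, here is a minimal Python sketch (the module itself is PHP, so this is illustration only, not proposed code). The `RetentionPolicy` class and the `cleanup_targets` helper are hypothetical names; only the column and table names come from the proposal. It assumes an unset retention means "keep forever":

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RetentionPolicy:
    """Retention settings for one vCenter (hypothetical in-memory form)."""
    vcenter_uuid: bytes
    events_retention: Optional[int] = None  # days; None = keep forever (assumption)
    alarms_retention: Optional[int] = None  # days; None = keep forever (assumption)


def cleanup_targets(policy: RetentionPolicy) -> List[Tuple[str, int]]:
    """Return (table, retention_days) pairs, skipping settings that are unset."""
    candidates = [
        ("vm_event_history", policy.events_retention),
        ("alarm_history", policy.alarms_retention),
    ]
    targets = []
    for table, days in candidates:
        if days is None:
            continue  # nothing configured: leave this table untouched
        if days <= 0:
            raise ValueError(f"retention for {table} must be a positive number of days")
        targets.append((table, days))
    return targets
```

With only events_retention set to 180, this yields a single target for vm_event_history and leaves alarm_history alone.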

Regarding the cleanup, I think we should only remove data from the vm_event_history and alarm_history tables. The queries should look something like the ones shown further down, repeated for every vCenter that has a retention configured.

The UUID of each vCenter should be obtained from the vcenter table joined with vcenter_server, along with the retention policies.

Here are some examples of the deletes:

DELETE FROM alarm_history WHERE ts_event_ms < UNIX_TIMESTAMP(NOW() - INTERVAL 180 DAY) * 1000 AND vcenter_uuid = UNHEX('36AFF29192244EB0A3628B57D198A7F9');
DELETE FROM vm_event_history WHERE ts_event_ms < UNIX_TIMESTAMP(NOW() - INTERVAL 180 DAY) * 1000 AND vcenter_uuid = UNHEX('36AFF29192244EB0A3628B57D198A7F9');
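
On a table of several GiB a single DELETE like the ones above may run for a long time, so one option is to delete in batches. The sketch below (Python, for illustration only) computes the cutoff in milliseconds on the application side and builds a parameterized statement with a LIMIT clause. The helper names, the default batch size and the `%s` placeholder style are assumptions, not part of the module:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Table names cannot be bound as parameters, so restrict them to a fixed set.
ALLOWED_TABLES = ("vm_event_history", "alarm_history")


def cutoff_ms(days: int, now: Optional[datetime] = None) -> int:
    """Milliseconds since the epoch for 'now minus N days' (compare with ts_event_ms)."""
    now = now or datetime.now(timezone.utc)
    return int((now - timedelta(days=days)).timestamp() * 1000)


def build_delete(table: str, vcenter_uuid: bytes, days: int,
                 batch_size: int = 10000,  # arbitrary default, tune as needed
                 now: Optional[datetime] = None) -> Tuple[str, tuple]:
    """Build one batched DELETE plus its parameters for a single vCenter."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    sql = (
        f"DELETE FROM {table} "
        "WHERE vcenter_uuid = %s AND ts_event_ms < %s "
        f"LIMIT {int(batch_size)}"
    )
    # The UUID is passed as raw 16 bytes, so no UNHEX() is needed in the query.
    return sql, (vcenter_uuid, cutoff_ms(days, now))
```

A cleanup worker would execute the returned statement repeatedly until fewer than `batch_size` rows are affected, which keeps each transaction short.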

If @Thomas-Gelf is fine with it, we will proceed with writing some code for the pull requests.