zabbix-community / helm-zabbix

Helm chart for Zabbix
https://artifacthub.io/packages/helm/zabbix-community/zabbix
Apache License 2.0

[zabbix-community/zabbix] Add cronjob to purge specifics tables #6

Closed david-moreau closed 2 years ago

david-moreau commented 2 years ago

Hi,

Would it be possible to add a cron job to purge the "alerts" table, similar to the ha_nodes_autoclean cron job, with the option to enable or disable it and to set the number of days to keep?
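
A minimal sketch of what such values could look like, mirroring the existing ha_nodes_autoclean pattern (the alertsAutoClean key and its fields are hypothetical, not part of the current chart):

alertsAutoClean:
  enabled: true          # switch the purge cron job on or off
  keepDays: 14           # delete alerts older than this many days
  schedule: "0 1 * * *"  # CronJob schedule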

For more context, we use a notification action that re-sends alerts every 5 minutes. An entry is created in the "alerts" table every 5 minutes for each unresolved alert.

David

aeciopires commented 2 years ago

Hi @david-moreau!

Thanks for reporting this feature request.

Would you like to open a Pull Request to implement it? We would appreciate your contribution.

Hi @sa-ChristianAnton!

What do you think about this feature? Is it possible? I don't know how to implement it.

david-moreau commented 2 years ago

Hi @aeciopires
I will try to make a pull request, but it will take me some time.

To implement this feature, we could use the same structure as cronjob-hanodes-autoclean and run this SQL query found in this repo:

DELETE FROM alerts WHERE (UNIX_TIMESTAMP(NOW()) - clock) > (@history_interval * 24 * 60 * 60);
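
Note that the query above uses MySQL's UNIX_TIMESTAMP(); for a PostgreSQL backend like the one shown later in this thread, a minimal equivalent, assuming a hypothetical 14-day retention in place of @history_interval, might look like:

DELETE FROM alerts WHERE (EXTRACT(EPOCH FROM NOW()) - clock) > (14 * 24 * 60 * 60);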

I don't know whether we also need to purge other tables such as 'events' or 'acknowledges'.

David

sa-ChristianAnton commented 2 years ago

Hi! I have thought about this and actually don't believe this should be a feature of this Helm chart. The reason: Zabbix server writing into the alerts table is completely normal behavior, and purging the table is more of a workaround for a very specific use case than something that is usually necessary, even in large production environments.

What I would love to see instead as a pull request is a more generic approach: a data structure "extraCronJobs" could be added to values.yaml, providing a "pluggable" way to add this kind of job. I would also like the idea of supplying an example values.yaml containing exactly the implementation described for your specific use case.

What do you think, is this a good approach?
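
A minimal sketch of what such an extraCronJobs entry could look like in values.yaml (the key name, its fields, and the psql invocation are hypothetical, not an existing chart API):

extraCronJobs:
  - name: alerts-autoclean
    schedule: "0 1 * * *"    # run once a day
    image: postgres:16       # any image providing psql
    command:
      - sh
      - -c
      # hypothetical connection string; purge alerts older than 14 days
      - psql "$DATABASE_URL" -c "DELETE FROM alerts WHERE (EXTRACT(EPOCH FROM NOW()) - clock) > (14 * 24 * 60 * 60);"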

sa-ChristianAnton commented 2 years ago

A little additional information regarding the alerts table: it should really not be necessary to clean it with a cron job. The alerts table has foreign-key constraints on the events table (the events table contains "problems"; the alerts table contains "notifications sent" and "scripts executed"). This means that when an events table entry that caused an alert is deleted, the corresponding alerts table entries are deleted as well, via ON DELETE CASCADE.

zabbix=# \d alerts
                                 Table "public.alerts"
    Column     |          Type           | Collation | Nullable |        Default
---------------+-------------------------+-----------+----------+-----------------------
 alertid       | bigint                  |           | not null |
 actionid      | bigint                  |           | not null |
 eventid       | bigint                  |           | not null |
 userid        | bigint                  |           |          |
 clock         | integer                 |           | not null | 0
 mediatypeid   | bigint                  |           |          |
 sendto        | character varying(1024) |           | not null | ''::character varying
 subject       | character varying(255)  |           | not null | ''::character varying
 message       | text                    |           | not null | ''::text
 status        | integer                 |           | not null | 0
 retries       | integer                 |           | not null | 0
 error         | character varying(2048) |           | not null | ''::character varying
 esc_step      | integer                 |           | not null | 0
 alerttype     | integer                 |           | not null | 0
 p_eventid     | bigint                  |           |          |
 acknowledgeid | bigint                  |           |          |
 parameters    | text                    |           | not null | '{}'::text
Indexes:
    "alerts_pkey" PRIMARY KEY, btree (alertid)
    "alerts_1" btree (actionid)
    "alerts_2" btree (clock)
    "alerts_3" btree (eventid)
    "alerts_4" btree (status)
    "alerts_5" btree (mediatypeid)
    "alerts_6" btree (userid)
    "alerts_7" btree (p_eventid)
    "alerts_8" btree (acknowledgeid)
Foreign-key constraints:
    "c_alerts_1" FOREIGN KEY (actionid) REFERENCES actions(actionid) ON DELETE CASCADE
    "c_alerts_2" FOREIGN KEY (eventid) REFERENCES events(eventid) ON DELETE CASCADE
    "c_alerts_3" FOREIGN KEY (userid) REFERENCES users(userid) ON DELETE CASCADE
    "c_alerts_4" FOREIGN KEY (mediatypeid) REFERENCES media_type(mediatypeid) ON DELETE CASCADE
    "c_alerts_5" FOREIGN KEY (p_eventid) REFERENCES events(eventid) ON DELETE CASCADE
    "c_alerts_6" FOREIGN KEY (acknowledgeid) REFERENCES acknowledges(acknowledgeid) ON DELETE CASCADE

Events table cleanup is part of the housekeeper and can be configured in Administration -> General -> Housekeeping (default: 365 days).
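
For a quick read-only check, assuming the default 365-day housekeeping window, one could count how many alerts rows are older than the window (their parent events should be removed by the housekeeper, cascading to alerts):

SELECT count(*) FROM alerts
WHERE clock < EXTRACT(EPOCH FROM NOW()) - 365 * 24 * 60 * 60;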

Just checked that in a live installation using Postgres.

david-moreau commented 2 years ago

Hi @sa-ChristianAnton ,

You are right, it's clearly a workaround, and adding extraCronJobs is a better idea.

I know about the dependency between the events table and the alerts table. Currently I have to keep the events for several months, but the re-sending of alerts every 5 minutes causes an alarming growth of the alerts table.

David

sa-ChristianAnton commented 2 years ago

Closing the issue as "not planned". Looking forward to pull requests.