Closed robbsie closed 6 years ago
So DbConnection::CleanUpHandler
seems to use unindexed queries, which is pretty bad.
Comments and general history grow pretty huge, especially on larger systems.
commenthistory
only has one Index on MySQL:
UNIQUE KEY instance_id (instance_id,object_id,comment_time,internal_comment_id)
Same for PgSQL:
CONSTRAINT UQ_commenthistory UNIQUE (instance_id,object_id,comment_time,internal_comment_id)
@N-o-X something for you? We should make sure that the other tables we cleanup also have an index on entry_time
We'll discuss this with @lippserd as there might be more schema fixes in terms of performance coming.
All who are involved, please make one step back and try to look at the whole picture. There is nothing wrong with trying to optimize the DB for millions of comment history entries. Just, it makes no sense. The original comment points to the main issue:
We've set commenthistory_age = 30d and got more than 7mio entry in this table. The real number of made comments is only a few hundred or something.
This should tell us that this table contains 99% garbage. No need for tuning here, please find out where the waste comes from.
Cheers, Thomas
@Thomas-Gelf valid point, technically 2 issues here...
We figured out that 6,9mio entrys (!) are quite the same and have been made within 3 hours:
mysql> select COUNT(*) from icinga_commenthistory where name="<servername>!Elasticsearch 2local!monitor.<our_hostname>.de-1501852997-2";
+----------+
| COUNT(*) |
+----------+
| 6963264 |
+----------+
1 row in set (0.00 sec)
We have now cleared the table but made a dump for a deeper look, if needed. Now there are only 417 valid comments left in this table.
We have no idea why the table was so huge. Maybe it has something to do with cluster logic, because the table was so huge on both cluster nodes.
I would believe this originates from an old bug in < 2.8 where the cluster downtime object creation plays "ping pong" (seen and fixed at a customer's site). Manual cleanup is the only possible method here. For the cleanup handlers - if they happen to be wrong, please open a separate issue.
Cheers, Michael
The auto deletion of old comments, which is triggered automatically every few seconds or even more often (don't know from where exactly, but we saw this permanently on our MySQL proc stat) is extremely slow with a large commenthistory.
The query looks like this:
DELETE FROM icinga_commenthistory WHERE instance_id = 1 AND entry_time < FROM_UNIXTIME(1499787760);
Reason seems to be a missing index on
entry_time
so this does always a full table scan which took ~30s on our commenthistory table with 7mio+ entrys.After adding an index to the
entry_time
row the problem should be gone:ALTER TABLE icinga_commenthistory ADD INDEX(entry_time);
The other point is, that the commenthistory table is incredible large! We've set
commenthistory_age = 30d
and got more than 7mio entry in this table. The real number of made comments is only a few hundred or something.It seems every Icinga2 reload fills the table with new data from the files stored in local disc, so we have all entrys many multiple times there. Is this normal and wanted behavior?
icinga2 --version
): r2.7.0-1icinga2 feature list
): api checker command compatlog ido-mysql influxdb livestatus mainlog notification