Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

Deletion of Downtimes after 19.01.2038 fails for IDO-MySQL #6064

Closed ekeih closed 4 years ago

ekeih commented 6 years ago

Expected Behavior

When you add or delete a downtime that starts or ends after 19.01.2038, it should also be added to or deleted from IDO-MySQL.

Current Behavior

When you schedule a downtime that ends after 19.01.2038, Icinga2 inserts the downtime with NULL as scheduled_end_time:

+----------------------+--------------------+
| scheduled_start_time | scheduled_end_time |
+----------------------+--------------------+
| 2018-02-02 13:28:44  | NULL               |
+----------------------+--------------------+

(In Icingaweb2 this downtime will appear with an empty expiration time.)

When you delete the downtime via Icingaweb2 or the API, the downtime is removed from Icinga2 itself and the following query is executed:

DELETE FROM icinga_scheduleddowntime WHERE entry_time = FROM_UNIXTIME(1517574524) AND instance_id = 1 AND name = 'myhost!master01.stage.icinga2.mycompany-1517574524-223' AND object_id = 73234 AND scheduled_end_time = FROM_UNIXTIME(2148294524) AND scheduled_start_time = FROM_UNIXTIME(1517574524)

The scheduled_end_time = FROM_UNIXTIME(2148294524) part gets evaluated by MySQL to scheduled_end_time = NULL:

mysql> SELECT FROM_UNIXTIME(2148294524);
+---------------------------+
| FROM_UNIXTIME(2148294524) |
+---------------------------+
| NULL                      |
+---------------------------+
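
For illustration, a minimal Python sketch of why this value falls outside the range: it assumes the limit is the signed 32-bit Unix timestamp maximum, which matches the 19.01.2038 date in the title. The variable names are only for this example.

```python
from datetime import datetime, timedelta, timezone

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1

# End time taken from the DELETE query above.
end_time = 2148294524

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Last second that fits into a signed 32-bit Unix timestamp.
limit = epoch + timedelta(seconds=INT32_MAX)
print(limit.isoformat())      # 2038-01-19T03:14:07+00:00

# The downtime end lies beyond that limit.
print(end_time > INT32_MAX)   # True
print((epoch + timedelta(seconds=end_time)).isoformat())
```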

Unfortunately, scheduled_end_time = NULL in a WHERE clause does not work as expected (https://dev.mysql.com/doc/refman/5.7/en/working-with-null.html): the comparison is never true, so the query matches an empty set and nothing is deleted.

To delete the downtime the query could use WHERE scheduled_end_time IS NULL.
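
The difference between the two comparisons can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for portability; both follow the same SQL rule that = NULL is never true), and the table is reduced to two columns with a made-up downtime name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the IDO table; the real schema has more columns.
conn.execute(
    "CREATE TABLE icinga_scheduleddowntime (name TEXT, scheduled_end_time TEXT)"
)
conn.execute(
    "INSERT INTO icinga_scheduleddowntime VALUES ('example-downtime', NULL)"
)


def remaining_rows():
    """Count the rows still present in the table."""
    cur = conn.execute("SELECT COUNT(*) FROM icinga_scheduleddowntime")
    return cur.fetchone()[0]


# '= NULL' evaluates to NULL (unknown), never to true, so no row matches.
conn.execute("DELETE FROM icinga_scheduleddowntime WHERE scheduled_end_time = NULL")
after_equals = remaining_rows()

# 'IS NULL' matches the row and deletes it.
conn.execute("DELETE FROM icinga_scheduleddowntime WHERE scheduled_end_time IS NULL")
after_is_null = remaining_rows()

print(after_equals, after_is_null)  # 1 0
```

MySQL additionally has the NULL-safe equality operator <=>, which might be another option for such a query; whether it fits the way IDO builds its statements is not something this sketch checks.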

When we restart Icinga2, the downtimes are removed from the database and the web interface is in a consistent state again. I assume Icinga2 executes different queries during the initial config dump.

Possible Solution

  1. MySQL should handle timestamps after 19.01.2038 correctly. Maybe this will happen in https://bugs.mysql.com/bug.php?id=12654
  2. As the MySQL bug will probably stay for a while, Icinga2 should handle the deletion of downtimes itself. The creation also needs to be fixed, as the timestamp gets lost during the insertion. (Store the timestamps in another way, do not allow timestamps after 2038, something else...)
  3. For users: Use PostgreSQL instead of MySQL or do not schedule downtimes that trigger the issue. Restart Icinga2 to remove the faulty downtimes.

ekeih commented 6 years ago

In hindsight a title like "Creation of downtimes after 19.01.2038 fails for IDO-MySQL" would be better as the described behavior is only a result of the wrong insertion of scheduled_end_time = NULL.

dnsmichi commented 6 years ago

To be honest: If you schedule a downtime which lasts more than 20 years, you can really remove the host from your monitoring instead. Or disable notifications entirely, and fake your SLA reports later on :-P

The issue with the 32 bit overflow is not new, hopefully MySQL will take care of it.

ekeih commented 6 years ago

I agree that the value of such a downtime is questionable.

It is still a bug that Icinga2 stores different information than the user passed. At least timestamps above 32bit should result in an error instead of a wrong database entry. This could be a workaround for a few years until MySQL is fixed.
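
A rough sketch of what such a check could look like, written in Python only for illustration. This is not Icinga2 code (Icinga2 is C++), and validate_downtime_window is a hypothetical helper name; it simply assumes the signed 32-bit maximum as the upper bound.

```python
# Upper bound assumed here: the signed 32-bit Unix timestamp maximum.
INT32_MAX = 2**31 - 1


def validate_downtime_window(start_time: int, end_time: int) -> None:
    """Hypothetical check: raise instead of storing a timestamp the database would lose."""
    for label, value in (("start_time", start_time), ("end_time", end_time)):
        if not 0 <= value <= INT32_MAX:
            raise ValueError(
                f"{label}={value} is outside the 32-bit timestamp range "
                f"(0..{INT32_MAX})"
            )
    if end_time < start_time:
        raise ValueError("end_time must not be before start_time")


# A one-hour downtime in 2018 passes silently.
validate_downtime_window(1517574524, 1517578124)

# The downtime from this report is rejected.
try:
    validate_downtime_window(1517574524, 2148294524)
except ValueError as err:
    print(err)
```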

dnsmichi commented 6 years ago

Apparently MySQL does it wrong, and your described bug would just be an application side workaround. Dunno how other applications solve this problem, do you have a possible patch at hand?

ekeih commented 6 years ago

Yes, it would be a workaround. But currently Icinga2 stores different information in the database than a user would expect. In my opinion this should not happen.

No, I do not have a patch at hand as I have no idea how to write C++ or how a possible patch would affect other parts of Icinga2.

In case you do not implement a workaround: Would you accept a PR for the documentation that explains the problem?

dnsmichi commented 5 years ago

We can add an entry for the troubleshooting docs, if necessary. Users will most likely find this issue with Google anyway.

ekeih commented 5 years ago

Feel free to assign the ticket to me. (I can't assign it to myself.)

Al2Klimov commented 5 years ago

I'm sorry @dnsmichi. I'm afraid MySQL won't "take care". I clearly remember the following sentence from a MySQL bug report:

Welcome to MySQL. This won't be fixed – it can be worked around.

In addition the linked issue seems to be a "feature request".

And yes, almost 20 years is a long time, but the longer we wait, the bigger the problem becomes. I'd either upgrade the IDO schema or introduce IcingaDB.

dnsmichi commented 5 years ago

I'm waiting until a new backend is introduced; no effort will be taken to fix this in IDO for now. This is just a reminder ticket to take care of it in the future.

dnsmichi commented 4 years ago

We'll never reach the support target close to 2038 or 2028 even with the IDO schema. As such, closing as wont-fix.