When any Icinga 2 host or service is acknowledged but the corresponding comment has gone missing, the Icinga 2 Event Stream Source fails to start working properly and logs the following:
2024-07-19T11:28:27.409Z INFO icinga2 Start listening on Icinga 2 Event Stream {"source_id": 1}
2024-07-19T11:28:27.409Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:28:32.550Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "1s"}
2024-07-19T11:28:32.550Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:28:35.921Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "2s"}
2024-07-19T11:28:35.921Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:28:40.419Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "4s"}
2024-07-19T11:28:40.419Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:28:46.673Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "8s"}
2024-07-19T11:28:46.673Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:28:57.109Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "16s"}
2024-07-19T11:28:57.109Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:29:15.270Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "32s"}
2024-07-19T11:29:15.270Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:29:49.188Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "1m4s"}
2024-07-19T11:29:49.188Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:30:56.037Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "2m8s"}
2024-07-19T11:30:56.037Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
2024-07-19T11:33:06.841Z WARN icinga2 Catch-up-phase was interrupted by an error, another attempt will be made {"source_id": 1, "error": "fetching acknowledgement comment for \"master-1!icinga-cluster\" failed, found no ACK Comments for \"comment.entry_type == 4 && comment.host_name == comment_host_name && comment.service_name == comment_service_name\" with map[comment_host_name:master-1 comment_service_name:icinga-cluster]", "delay": "3m0s"}
2024-07-19T11:33:06.841Z INFO icinga2 Worker enters catch-up-phase, start caching up on Event Stream events {"source_id": 1}
This situation can easily be created manually by acknowledging something and then deleting the corresponding comment in Icinga Web. Deleting the comment currently does not clear the acknowledgement (see also https://github.com/Icinga/icinga2/issues/8896). However, I'm quite certain that I didn't do this, so there may also be an "Icinga 2 forgets ack comments" bug here.
In any case, the current error handling for this situation is too aggressive: a single missing comment should not make the whole catch-up phase fail.