NagiosEnterprises / nagioscore

Nagios Core
GNU General Public License v2.0

Nagios [4.4.6]: Service seen as HARD / DOWN when the host is down, but recovers as SOFT / OK #880

Closed nicolaskarp closed 1 year ago

nicolaskarp commented 2 years ago

Hello everyone,

We have an issue with Nagios 4.4.6:

1/ When a host is down, a service checked at the same time is seen as HARD / DOWN (logged as CRITICAL;HARD;1) while the host itself is still in a SOFT state:

[Thu Sep  1 15:46:27 2022] HOST EVENT HANDLER: ech_mag_fw;DOWN;SOFT;1;handle_host_event
[Thu Sep  1 15:47:37 2022] HOST EVENT HANDLER: ech_mag_fw;DOWN;SOFT;2;handle_host_event
[Thu Sep  1 15:48:47 2022] HOST EVENT HANDLER: ech_mag_fw;DOWN;SOFT;3;handle_host_event
[Thu Sep  1 15:49:50 2022] HOST EVENT HANDLER: ech_mag_fw;UP;SOFT;1;handle_host_event

[Thu Sep  1 15:47:07 2022] wproc:   host=ech_mag_fw; service=check_perf_sla_internet_fo;
[Thu Sep  1 15:47:07 2022] Warning: Check of service 'check_perf_sla_internet_fo' on host 'ech_mag_fw' timed out after 60.005s!
[Thu Sep  1 15:47:07 2022] SERVICE ALERT: ech_mag_fw;check_perf_sla_internet_fo;CRITICAL;HARD;1;(Service check timed out after 60.01 seconds)
[Thu Sep  1 15:47:07 2022] SERVICE EVENT HANDLER: ech_mag_fw;check_perf_sla_internet_fo;CRITICAL;HARD;1;handle_service_event

2/ The service then recovers as SOFT / OK:

[Thu Sep  1 15:51:07 2022] SERVICE ALERT: ech_mag_fw;check_perf_sla_internet_fo;OK;SOFT;1;Packet Loss is OK : 0.000% < 3% : Latency is OK :  11.475 ms < 100 ms
[Thu Sep  1 15:51:07 2022] SERVICE EVENT HANDLER: ech_mag_fw;check_perf_sla_internet_fo;OK;SOFT;1;handle_service_event

The service never gets a "HARD / OK".
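
For anyone who wants to check their own logs for the same pattern, here is a minimal sketch in Python. It assumes the `SERVICE ALERT` line format shown in the excerpts above; the function name `find_soft_recoveries` and the `SAMPLE` list are only illustrative and are not part of Nagios. It reports every `OK;SOFT` alert that directly follows a non-OK `HARD` alert for the same service:

```python
import re

# Matches "SERVICE ALERT" lines in the format shown in the excerpts above:
# [timestamp] SERVICE ALERT: host;service;state;state_type;attempt;output
ALERT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)$"
)


def find_soft_recoveries(lines):
    """Return (host, service, timestamp) for each OK;SOFT alert that
    directly follows a non-OK HARD alert for the same service."""
    last = {}  # (host, service) -> (state, state_type) of the previous alert
    hits = []
    for line in lines:
        m = ALERT_RE.match(line.strip())
        if not m:
            continue  # not a SERVICE ALERT line (host events, wproc, ...)
        key = (m.group("host"), m.group("service"))
        prev = last.get(key)
        if (
            prev is not None
            and prev[0] != "OK"
            and prev[1] == "HARD"
            and m.group("state") == "OK"
            and m.group("state_type") == "SOFT"
        ):
            hits.append((key[0], key[1], m.group("ts")))
        last[key] = (m.group("state"), m.group("state_type"))
    return hits


# The two SERVICE ALERT lines from this report, used as sample input.
SAMPLE = [
    "[Thu Sep  1 15:47:07 2022] SERVICE ALERT: ech_mag_fw;check_perf_sla_internet_fo;"
    "CRITICAL;HARD;1;(Service check timed out after 60.01 seconds)",
    "[Thu Sep  1 15:51:07 2022] SERVICE ALERT: ech_mag_fw;check_perf_sla_internet_fo;"
    "OK;SOFT;1;Packet Loss is OK : 0.000% < 3% : Latency is OK :  11.475 ms < 100 ms",
]

for hit in find_soft_recoveries(SAMPLE):
    print(hit)
```

To run it against a real log, pass the open file instead of `SAMPLE`, e.g. `find_soft_recoveries(open(path))`, where `path` is wherever your installation writes its log. The sketch only looks at `SERVICE ALERT` lines, so it will not see state changes that were not logged as alerts.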

Thank you.

Nicolas

sawolf commented 2 years ago

Hi Nicolas, thanks for reaching out. This doesn't seem right to me either - I should be able to look into it soon.

sawolf commented 1 year ago

I believe that this is a duplicate of #759 - I'm closing this issue, feel free to continue any discussion over there.