Open dercol1 opened 1 year ago
Sometimes the internal clock begins to run one day ahead of the current date. This erroneous information is then forwarded to the icinga2 master server, which leaves the subsequent checks for the satellite completely stuck: they are not performed even when a check is requested manually.
"completely stuck" or stuck for one day (the period the clock was ahead)?
You are right, the "stuck" period ends once the clock skew has been corrected. Sorry for the delay in answering, I was notified only today. Thank you.
Does check now help?
I have a packaged icinga2 2.11.9 version in a product called neteye (4.28). In my configuration, a master icinga2 installation (with icinga2-web) talks to remote satellites. One satellite runs in an oVirt KVM farm. Because of the slow disks, the Linux kernel of the satellite sometimes registers a "CPU stuck" event, and sometimes the internal clock begins to run one day ahead of the current date. This erroneous information is then forwarded to the icinga2 master server, which leaves the subsequent checks for the satellite completely stuck: they are not performed even when a check is requested manually.
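To make the symptom concrete, here is a minimal sketch of how a timestamp coming from the satellite could be compared against the master's own clock. It is an illustration only and not how Icinga 2 works internally: the function names and the five-minute tolerance are assumptions of mine, and the dates are made up.

```python
from datetime import datetime, timedelta, timezone


def clock_skew(reported: datetime, reference: datetime) -> timedelta:
    """Return how far `reported` is ahead of `reference` (negative if behind)."""
    return reported - reference


def is_future_dated(
    reported: datetime,
    reference: datetime,
    tolerance: timedelta = timedelta(minutes=5),  # assumed threshold
) -> bool:
    """True when the reported timestamp is ahead of the reference by more than the tolerance."""
    return clock_skew(reported, reference) > tolerance


# Hypothetical example: the satellite clock runs one day ahead of the master.
master_now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
satellite_time = master_now + timedelta(days=1)

print(is_future_dated(satellite_time, master_now))
print(is_future_dated(master_now + timedelta(seconds=30), master_now))
```

With these made-up values the first call reports the one-day jump and the second call treats a 30-second difference as normal drift.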
The solution we found is to stop the icinga2-master service on the master node, remove /neteye/shared/icinga2/data/lib/icinga2/icinga2.state and then restart the icinga2-master service (which owns the icinga2 processes). But this solution simply throws away all the configuration. Please give me some hints on how to address the problem. I know the architecture only partially.
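For the file-removal step of the workaround, a slightly safer variant might be to rename the state file instead of deleting it, so that a copy remains for later inspection. The sketch below covers only that step; the helper name `move_state_aside` and the backup naming scheme are hypothetical, and it assumes the icinga2-master service has already been stopped and is restarted afterwards.

```python
import shutil
import time
from pathlib import Path


def move_state_aside(state_file: Path) -> Path:
    """Rename the state file to a timestamped backup and return the backup path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = state_file.with_name(f"{state_file.name}.bak-{stamp}")
    shutil.move(str(state_file), str(backup))
    return backup


# Usage (path taken from the post; run only while the service is stopped):
# move_state_aside(Path("/neteye/shared/icinga2/data/lib/icinga2/icinga2.state"))
```

The running instance still starts without its previous state, exactly as with the deletion, so this does not solve the underlying problem. It only keeps the old file around.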