filosof86 closed this issue 5 months ago
What distribution/version are you running? It looks like it's trying to process a lot of old performance data and the reaper is coming along and killing the process before it can finish. That just leaves more old performance data behind, and the cycle repeats. When was the last time it worked properly and what did you change that made it not work properly?
Hi @everwatch
What distribution/version are you running?
Dist: Oracle Linux 8
When was the last time it worked properly and what did you change that made it not work properly?
According to the reports I've been given, it worked properly on Nagios 3.0.6; the issues started appearing after the update to Nagios v4 (4.4.9).
It looks like it's trying to process a lot of old performance data and the reaper is coming along and killing the process before it can finish.
Yes, but it didn't happen before, and the amount of data hasn't changed since then. In addition, AFAIU, performance data processing should have been improved in Nagios 4.x, e.g. by running it in parallel processes.
Hi,
It looks like something is also incorrect with the custom perfdata processing script. I'll re-check that and get back to you. Thank you.
Hi
An update.
With the "Processing Performance Data Using Commands" method, I had been getting a high load average and inconsistent results: different commands/scripts could trigger the timeout/LA issues, or could work OK.
Unfortunately, I cannot say for sure what the reason was.
Thus, I ended up setting the perfdata to be processed via files (the "Writing Performance Data To Files" method).
After that, the LA seems to be back to normal, and the timeout issues are gone.
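For anyone landing here later, a file-based setup of this kind might look roughly like the following in nagios.cfg. This is only a sketch: the file path, the template, the processing interval and the command name are illustrative assumptions, not the exact values used on this system.

```
# Keep perfdata processing enabled
process_performance_data=1

# Write service perfdata to a file instead of running a command per check result
# (the path is an assumption; adjust it to your installation)
service_perfdata_file=/usr/local/nagios/var/service-perfdata

# One tab-separated line per check result
service_perfdata_file_template=$TIMET$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$

# a = append to the file
service_perfdata_file_mode=a

# Every 30 seconds (assumed value), run a command that processes the whole file in one go
service_perfdata_file_processing_interval=30

# Hypothetical command name; it must match a command definition in the object config
service_perfdata_file_processing_command=process-service-perfdata-file
```

Host perfdata has matching host_perfdata_file* directives. With this approach the processing command is launched once per interval on the accumulated file rather than once per check result.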
I believe this issue can be closed for now. Thank you.
Hi all,
On Nagios v4.4.9 I have noticed the issue described in the subject. Nagios consumes a lot of CPU and there is no obvious reason (or I didn't find one) for that.
There are a lot of messages like the following in the logs:
As far as I can see, there is indeed something wrong with perfdata processing. If we disable perfdata processing via the Nagios config
process_performance_data=0
Nagios starts working normally. I tried to remove the custom perfdata processor and just configure the perfdata command like the following:
But it doesn't help. I get the impression that it's something under the hood of the process-service-perfdata functionality, and that it doesn't matter what script/command/whatever it launches. I've tried enabling Nagios debug and ran some 'strace' tests; however, the logs show nothing unusual or incorrect, just normal operation.
However, it appears that performance data processing noticeably slows Nagios down and makes it devour system resources.
Could you please help me to sort that out?
P.S. I understand that the version I use is not the latest one, but before updating (which is not a simple process in my case) I need to find out whether this is something that is fixed in the latest version or whether we're hitting another issue.
Many thanks in advance!