NagiosEnterprises / ncpa

Nagios Cross-Platform Agent

Question: Create ncpa check using API tab/Processes to alert on process exceeding x memory usage #847

Open Guyver1wales opened 2 years ago

Guyver1wales commented 2 years ago

I've asked the same question in the Nagios Community forum - https://support.nagios.com/forum/viewtopic.php?f=7&t=64333

I have a tomcat8.exe process that crashes when it exceeds the hard-coded Java memory limit (currently set at 10GB). Our devops team would like me to raise a warning when the tomcat8.exe process hits 9GB and a critical alert at 9.5GB, so they can check the server, see which reports are running, and identify anything badly constructed that pushes the Java memory allocation over 10GB.

As you can see from the forum post with screenshots, I cannot seem to implement this as a Nagios check with check_ncpa.py, as mem_rss and mem_vms don't seem to work with the Nagios check.

Is there any way to create this type of alert where I can warn/critical using ncpa when a process exceeds a defined memory usage?

sawolf commented 2 years ago

Hi @Guyver1wales,

Based on the forum thread, I’m assuming that your memory endpoint issues are due to a bug. That said, I’m not the maintainer for this project, so I’ll let Stephen make that determination when he has the chance.

As a workaround, I would recommend writing a short plugin (e.g. a PowerShell script) that checks the tomcat process memory. The plugin API is very simple: have your script echo a short diagnostic message based on the result, and have it exit with a code that corresponds to the alert level (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). You can refer to the full spec here: https://nagios-plugins.org/doc/guidelines.html, but I don’t think you’ll need most of it for this one-off issue.

Regards,

Sebastian Wolf Developer Nagios Enterprises, LLC

Web: http://www.nagios.com/
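The plugin contract described above (print one diagnostic line, exit with 0/1/2/3) can be sketched as follows. This is a minimal illustration in Python rather than PowerShell, not an official NCPA plugin. The function names `classify_memory`, `find_rss_bytes`, and `main` are hypothetical, the 9 and 9.5 thresholds come from the original question and are treated here as GiB (1024**3 bytes), and reading process memory assumes the third-party `psutil` package is installed.

```python
import sys

GIB = 1024 ** 3

# Nagios plugin exit codes, as listed in the comment above.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_memory(rss_bytes, warn_gib=9.0, crit_gib=9.5):
    """Return (exit_code, message) for a memory reading given in bytes."""
    if rss_bytes is None:
        return UNKNOWN, "UNKNOWN - process not found"
    used_gib = rss_bytes / GIB
    if used_gib >= crit_gib:
        return CRITICAL, f"CRITICAL - memory usage is {used_gib:.2f} GiB"
    if used_gib >= warn_gib:
        return WARNING, f"WARNING - memory usage is {used_gib:.2f} GiB"
    return OK, f"OK - memory usage is {used_gib:.2f} GiB"


def find_rss_bytes(process_name):
    """Sum resident memory of all processes with this exact name.

    Returns None when no process matches.
    Assumption: the third-party psutil package is available.
    """
    import psutil  # imported lazily so classify_memory works without it

    total, found = 0, False
    for proc in psutil.process_iter(["name", "memory_info"]):
        mem = proc.info["memory_info"]
        if proc.info["name"] == process_name and mem is not None:
            total += mem.rss
            found = True
    return total if found else None


def main(process_name="tomcat8.exe"):
    """Print the diagnostic line and return the plugin exit code."""
    code, message = classify_memory(find_rss_bytes(process_name))
    print(message)
    return code

# To run this as a plugin, finish the script with: sys.exit(main())
```

Keeping the threshold logic in `classify_memory` separate from the process lookup makes it easy to check the warning and critical boundaries without a running tomcat process.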

From: Guyver1wales, Sent: Thursday, January 13, 2022 10:15 AM

Guyver1wales commented 2 years ago

Thanks Sebastian, I'll look into the Plugin API.

btrnka63 commented 2 years ago

Hi,

I'm not sure if this helps, but we are using something similar: '--metric' 'processes' '--queryargs' 'mem_vms=0.5' -w 1 -c 2

It finds all processes with a VMS over 0.5GB (we use GB as the unit) and returns the count. It is then easy to set thresholds on the returned process count, e.g. 1 = Warning, 2 or more = Critical.
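One detail worth checking when thresholding on a count: in the Nagios plugin guidelines, a plain numeric threshold N raises an alert only when the value falls outside 0..N, that is, when it is greater than N. The sketch below assumes NCPA applies that convention to `-w` and `-c` (an assumption to verify against your agent); `exceeds_threshold` and `count_status` are hypothetical helper names.

```python
def exceeds_threshold(value, threshold):
    """Plain Nagios threshold N: alert when the value is outside 0..N."""
    return value < 0 or value > threshold


def count_status(count, warning=1, critical=2):
    """Map a matching-process count to a status string."""
    if exceeds_threshold(count, critical):
        return "CRITICAL"
    if exceeds_threshold(count, warning):
        return "WARNING"
    return "OK"
```

Under that assumption, `-w 1 -c 2` stays OK for a single matching process and only warns at two. Shifting the thresholds down by one (`-w 0 -c 1`) would give a warning at one match and critical at two or more.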

Guyver1wales commented 2 years ago

I need to query a specific process (tomcat) explicitly; I do not need the noise of 'any' process going over a threshold. But thank you for taking the time to reply. I need to find some quiet time to read the plugin documentation and plug in a PowerShell script.

btrnka63 commented 2 years ago

Hi, then just add the process name to the query args: '--metric' 'processes' '--queryargs' 'name=tomcat,match=regex,mem_vms=0.5' -w 1 -c 2

Yes, documentation is a good friend, but who is going to read all the pages ;)

PS: I highly recommend checking https://localhost:5693/gui/api (on the server where you installed the NCPA agent).
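To see what such a check asks the agent for, the comma-separated query args can be expanded into an API request URL. This is a rough sketch only: the parameter names `token`, `check`, `warning`, and `critical` and the `/api/<metric>` path are assumptions about how check_ncpa.py talks to the agent, and `build_check_url`, the host, and the token value are hypothetical. Port 5693 is the one mentioned in this thread.

```python
from urllib.parse import urlencode


def build_check_url(host, token, metric, queryargs, warning, critical, port=5693):
    """Expand 'key=value,key=value' query args into an agent API URL."""
    # Assumed parameter names; verify them against your NCPA version.
    params = {
        "token": token,
        "check": "true",
        "warning": warning,
        "critical": critical,
    }
    for pair in queryargs.split(","):
        key, value = pair.split("=", 1)
        params[key.strip()] = value.strip()
    return f"https://{host}:{port}/api/{metric}?{urlencode(params)}"


url = build_check_url(
    host="localhost",
    token="mytoken",  # placeholder, not a real token
    metric="processes",
    queryargs="name=tomcat,match=regex,mem_vms=0.5",
    warning=1,
    critical=2,
)
```

Pasting a URL of this shape into a browser is a quick way to compare what the agent returns with what the Nagios check reports.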