Closed SNapier closed 6 months ago
After a reboot of the target machine, checks have come into XI.
The logs on the agent do not reflect the executed checks; the passive log still has only the one entry.
I have the same problem. Here is an example of the config and what the ncpa_passive.log says.
This error occurs when your plugin/endpoint isn't responding properly. It typically occurs when you are either trying to access an endpoint that doesn't exist or you're using a plugin that isn't returning anything.
For example, in @rob2791's case, there is an issue with the logs endpoint. It may be that the log doesn't exist. Did you verify that you can access that endpoint through the API and that it gives valid output?
@SNapier can you post the check you have that's giving an error?
The only checks configured on this host are the defaults in the example.cfg.
Can I get the passive log before the error so I can identify which endpoint is breaking?
@ne-bbahn Everything was working fine before the latest update. I was pulling log events from about 10 servers on various things. Unfortunately, our enterprise update software automatically updated the NCPA package, and I walked in one morning to all the errors. Nothing else changed except NCPA. Is there something different in this version that needs to be in the CFG?
"Can I get the passive log before the error so I can identify which endpoint is breaking?"
Unfortunately, there are no prior logs; this is a fresh install.
I ran into this yesterday as well. If the repos that are created when installing XI are not marked to be ignored, the latest and greatest gets installed when running the patching with Yum.
So this is the entirety of your passive log?
2023-11-17 09:20:55,678 passive ERROR Stdout or returncode was None, cannot return meaningfully.
Traceback (most recent call last):
  File "ncpa.py", line 319, in run_all_handlers
  File "passive\nrdp.py", line 113, in run
  File "passive\nrdp.py", line 95, in get_xml_of_checkresults
  File "passive\nrdp.py", line 54, in make_xml
  File "passive\ncpacheck.py", line 84, in run
ValueError: Stdout or returncode was None, cannot return meaningfully.
Can you check those API endpoints manually to verify that they're working?
Do any passive checks show up in the interface under Checks?
I've got the same error.
2024-02-26 11:57:15,809 passive ERROR Stdout or returncode was None, cannot return meaningfully.
Traceback (most recent call last):
  File "ncpa.py", line 339, in run_all_handlers
  File "passive\nrdp.py", line 113, in run
  File "passive\nrdp.py", line 95, in get_xml_of_checkresults
  File "passive\nrdp.py", line 54, in make_xml
  File "passive\ncpacheck.py", line 84, in run
ValueError: Stdout or returncode was None, cannot return meaningfully.
The NCPA version is 3.0.1, but I've had it in 3.0.0 too.
The weird thing is that we run NCPA on almost 200 Windows servers, but only our Microsoft Terminal Servers have this issue, and it only happens sporadically. Most of the time it works well, but every now and then (the time between incidents varies) all checks go red on one of the Terminal Servers (a different one each time). I have to restart the ncpa service on that server, and then everything works again ... for some days.
Could you explain what you mean by "API endpoints"? In my case I tend to believe it's a passive check on a Windows service using the built-in services feature. In nrdp.cfg I have:
%HOSTNAME%|Check-Name|420 = services?service=stunnel&status=running
I've checked the last check that ran in the NCPA web interface under services for the system. The check on the stunnel service is the one that should run next, but instead nothing happens until the ncpa service restarts.
If that service doesn't exist, it should actually give back an unknown state with the info that this service couldn't be found on the system, am I right? ... I mean ... I don't know why it wouldn't exist there, because after a restart all checks, including the one for the stunnel service, are "OK" again for some hours/days.
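For anyone comparing configs, here is a rough Python sketch of how the nrdp.cfg line above breaks down. It assumes the layout is host|service name|interval = endpoint, with the third field being the check interval; `PassiveCheck` and `parse_passive_check` are made-up names for illustration and are not part of NCPA.

```python
from typing import NamedTuple, Optional


class PassiveCheck(NamedTuple):
    host: str
    service: str
    interval: Optional[int]  # assumed to be the check interval; None if omitted
    endpoint: str


def parse_passive_check(line: str) -> PassiveCheck:
    """Split 'HOST|SERVICE|INTERVAL = ENDPOINT' into its parts."""
    # Split on the first '=' only, because the endpoint's query string
    # (service=stunnel&status=running) contains '=' as well.
    left, sep, endpoint = line.partition("=")
    if not sep:
        raise ValueError(f"no '=' in passive check line: {line!r}")
    fields = [field.strip() for field in left.split("|")]
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 2 or 3 '|'-separated fields: {line!r}")
    interval = int(fields[2]) if len(fields) == 3 else None
    return PassiveCheck(fields[0], fields[1], interval, endpoint.strip())


check = parse_passive_check(
    "%HOSTNAME%|Check-Name|420 = services?service=stunnel&status=running"
)
print(check.endpoint)
```

The endpoint part is what ends up being requested from the agent's API, so that is the piece worth testing by hand.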
You can check your checks in the interface via the API section under https://ip_address:5693 or by querying https://ip_address:5693/api/endpoint/path, which could be https://localhost:5693/api/plugins/testplugin. So if you wanted to check the above check command via the API, you could check https://localhost:5693/api/services?service=stunnel&status=running to see if NCPA is handling the check properly or not.
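If you have several passive checks to verify, a small helper can turn each endpoint from nrdp.cfg into the matching API URL. This is only a sketch: `api_url_for` is a hypothetical name, the `token` query parameter is assumed to be how your agent authenticates API requests, and endpoints that carry extra arguments such as warning thresholds are not handled.

```python
from urllib.parse import parse_qsl, urlencode, urlunsplit


def api_url_for(endpoint: str, host: str = "localhost",
                port: int = 5693, token: str = "") -> str:
    """Build https://host:port/api/<endpoint> from a passive check endpoint."""
    path, _, query = endpoint.partition("?")
    params = parse_qsl(query, keep_blank_values=True)
    if token:
        # Assumption: the agent accepts its API token as a query parameter.
        params.append(("token", token))
    return urlunsplit(
        ("https", f"{host}:{port}", f"/api/{path.strip('/')}", urlencode(params), "")
    )


print(api_url_for("services?service=stunnel&status=running"))
print(api_url_for("plugins/testplugin"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, shows what the agent returns for that check.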
"If that service doesn't exist, it should actually give back an unknown state with the info that this service couldn't be found on the system, am I right? ... I mean ... I don't know why it wouldn't exist there, because after a restart all checks, including the one for the stunnel service, are "OK" again for some hours/days."
You are definitely correct here. I've taken a look at some of the code relating to checks and it's definitely not handled correctly. I'll try to get it working properly for 3.0.2
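To illustrate the behavior being asked for, here is a minimal Python sketch of mapping a missing result to UNKNOWN instead of raising the ValueError seen in the traceback. This is not the NCPA code and not the actual 3.0.2 change; `normalize_check_result` is a hypothetical helper, and the only thing taken as given is the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from typing import Optional, Tuple

UNKNOWN = 3  # standard Nagios plugin return code for UNKNOWN
VALID_CODES = (0, 1, 2, 3)


def normalize_check_result(stdout: Optional[str],
                           returncode: Optional[int]) -> Tuple[int, str]:
    """Return (returncode, output), falling back to UNKNOWN on bad input."""
    if stdout is None or returncode is None:
        # The case from the passive log: nothing meaningful came back.
        return UNKNOWN, "UNKNOWN: check returned no output or return code"
    if returncode not in VALID_CODES:
        return UNKNOWN, f"UNKNOWN: unexpected return code {returncode}: {stdout.strip()}"
    return returncode, stdout.strip()


print(normalize_check_result(None, None))
print(normalize_check_result("OK: stunnel is running\n", 0))
```

With a fallback like this, one failing endpoint would show up as a single UNKNOWN result rather than stopping the whole passive run.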
@ne-bbahn
Thanks for the input. I actually had this with 3 of our Terminal Servers again and checked on one of them. The server has still not sent any results to the Nagios server, and if I query this via the API as you've mentioned, I get a JSON result with every running service.
{"services": {"stunnel": "running", "AdobeARMservice": "running", "AGSService": "running", [...], }}
I don't think this is the intended behavior, am I right?
Some more Info:
Response-Headers:
Access-Control-Allow-Origin: *
Content-Length: 4891
Content-Security-Policy: frame-ancestors 'self'
Content-Type: application/json
Date: Tue, 05 Mar 2024 13:51:40 GMT
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Cookie
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN

Request-Headers:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: de,en-US;q=0.7,en;q=0.3
Connection: keep-alive
Host: hostname.subdomain.tld:5693
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0
It would seem that NCPA is using an inclusive search, where having both service=stunnel and status=running returns both the service named stunnel and all the services that are running. If you want it to check whether the specified service or services are running, you need to add check=true to make it run as a Nagios check. This does seem to be the intended behavior.
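As a quick illustration, this Python sketch appends check=true to an endpoint before you query it by hand. `as_nagios_check` is a hypothetical helper name; the only thing taken from the thread is the check=true parameter itself.

```python
from urllib.parse import parse_qsl, urlencode


def as_nagios_check(endpoint: str) -> str:
    """Return the endpoint with check=true set exactly once in its query."""
    path, _, query = endpoint.partition("?")
    # Drop any existing 'check' parameter so the flag is not duplicated.
    params = [(key, value)
              for key, value in parse_qsl(query, keep_blank_values=True)
              if key != "check"]
    params.append(("check", "true"))
    return f"{path}?{urlencode(params)}"


print(as_nagios_check("services?service=stunnel&status=running"))
```

Requesting the resulting endpoint under https://localhost:5693/api/ should then give a check result rather than the plain service listing shown earlier.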
This is solved in NCPA 3.0.2. If you continue to have issues, we can reopen this and discuss.
NCPA is not sending passive checks to XI via NRDP.
From the Passive log post fresh install on Win10