Closed alexoalexo closed 7 months ago
Here is my log file (level set to trace):
_[2024-02-27 13:06:54.429][Trace][pid:1192][listen_nrpe:105] nrpe v2 request: generate_rvtools_report []string{}
[2024-02-27 13:06:54.432][Trace][pid:1192][snclient:706] command: generate_rvtools_report
[2024-02-27 13:06:54.432][Trace][pid:1192][snclient:707] args: []string{}
[2024-02-27 13:06:54.432][Debug][pid:1192][checkwrap:63] command before macros expanded: RVToolsBatchMultipleVCs.ps1
[2024-02-27 13:06:54.445][Trace][pid:1192][snclient:1327] command object:
path: C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
args: -
SysProcAttr: &{true powershell -WindowStyle hidden -NoLogo -NonInteractive -Command ". 'C:\Program Files\snclient\scripts\RVToolsBatchMultipleVCs.ps1' ; exit($LASTEXITCODE)" 0 0
It looks like my script "RVToolsBatchMultipleVCs.ps1" exits with code 3? But why snclient is sending back such message?
Thanks
When I run my script manually it always exits with 0. It has 'exit 0' at the end. So I don't know I get error code 3?
which snclient version is this? I have reworked the scripts/wrapped scripts part last week. Could you try the current build from https://github.com/ConSol-Monitoring/snclient/actions/runs/8052511606 (you need to scroll down to the bottom of that page)
I'm running SNClient+ v0.18 (Build: 706454c, go1.21.7)
Same thing with SNClient+ v0.18.0046 (Build: e3bc0fd, go1.21.7). I get error 3 I'm running it as .ps1 file and also calling it from a a .bat file and always get exit code 3
[2024-02-27 14:08:12.729][Trace][pid:3708][snclient:1066] error: &fmt.wrapError{msg:"timeout: context deadline exceeded", err:context.deadlineExceededError{}}
i see, so it seems like the timeout isn't used at all and the default timeout is used instead?
When the script finishes quickly - it returns 0 and all is fine. Can you revisit the timeout settings?
When it takes more than a minute it always fail, so probably the timeout setting in the config file is not applied properly :
root@nagiosserver:/usr/lib/nagios/plugins# time ./check_nrpe
C:\Program Files\snclient>start "RVTOOLS-Export-Script"/DC:\R
real 0m14.860s
user 0m0.009s
sys 0m0.001s
root@nagiosserver:/usr/lib/nagios/plugins# time ./check_nrpe
C:\Program Files\snclient>start "RVTOOLS-Export-Script"/DC:\R
real 0m14.874s
user 0m0.009s
sys 0m0.000s
root@nagiosserver:/usr/lib/nagios/plugins# time ./check_nrpe
C:\Program Files\snclient>start "RVTOOLS-Export-Script"/DC:\R
real 0m36.078s
user 0m0.007s
sys 0m0.003s
root@nagiosserver:/usr/lib/nagios/plugins# time ./check_nrpe
UNKNOWN - script run into timeout after 666s
real 1m8.550s
user 0m0.005s
sys 0m0.005s
root@nagiosserver:/usr/lib/nagios/plugins#
it should be better now, there was indeed a 60s default timeout and since the shortest one wins, it didn't matter how long the script timeout was. So now at least a debug log entry will be added in this case and it should work if you set both, the default timeout and the scripts timeout long enough:
[/settings/default]
timeout = 300
[/settings/external scripts]
timeout = 300
If that's a good idea is another story...
Setting both now, but still it defaults to 60, here's the trace log:
[2024-02-28 09:54:58.799][Debug][pid:6416][check_wrap:81] reduced cmd timeout to 59s because of shorter context
[2024-02-28 09:54:58.807][Trace][pid:6416][snclient:1324] command object:
path: C:\Windows\system32\cmd.exe
args: (none)
SysProcAttr: &{true C:\Windows\system32\cmd.exe /c C:\Program^ Files\snclient\scripts\generate_rvtools.bat.bat 0 0 <nil> <nil> false [] 0}
[2024-02-28 09:55:49.789][Trace][pid:6416][task_logrotate:93] check logfile rotation (threshold 10 MiB / current size: 2531 B)
[2024-02-28 09:55:57.814][Debug][pid:6416][snclient_windows:207] invalid argument
[2024-02-28 09:56:06.931][Trace][pid:6416][snclient:1063] stdout:
[2024-02-28 09:56:06.931][Trace][pid:6416][snclient:1064] stderr:
[2024-02-28 09:56:06.931][Trace][pid:6416][snclient:1065] exitCode: 3
[2024-02-28 09:56:06.931][Trace][pid:6416][snclient:1066] timeout: 59
[2024-02-28 09:56:06.931][Trace][pid:6416][snclient:1067] error: &fmt.wrapError{msg:"timeout: context deadline exceeded", err:context.deadlineExceededError{}}
@sni , can you please re-open the issue and check. I'm ready to test and provide more info if needed. Thank you
For all scripts the timeout is set to 60, and I have
[/settings/default] timeout = 300
[2024-02-28 11:06:32.852][Trace][pid:2024][snclient:1066] timeout: 60
i guess the issue is now related to the nrpe listener, i only checked the rest api yesterday.
should be fixed in the latest build. Could you try again?
Thanks, looks good now 👍
On longer checks 60+ seconds I get Ok/green in Nagios
Check Latency / Duration: | 0.000 / 94.094 seconds -- | --
I have a number of external scripts that I run and all are running fine except one of them. I always get a timeout from snclient although the script is successfuly started and finishes prior to reaching the defined timeout.
For example timeout set to 666, script finishes on 1min but I still get: script run into timeout after xxx
[/settings/external scripts] ; timeout - The maximum time in seconds that a command can execute. (if more then this execution will be aborted). ; NOTICE this only affects external commands not internal ones. timeout = 666
`root@nagiosserver:/usr/lib/nagios/plugins# time ./check_nrpe -d 2 -2 -t 360:unknown -H 1.1.1.1 -c generate_report UNKNOWN - script run into timeout after 666s
real 1m0.479s user 0m0.006s sys 0m0.004s ` What should be my next troubleshooting steps? Thank you