madrisan / hashicorp-vault-monitor

:key: HashiCorp Vault Monitoring Tool
Mozilla Public License 2.0

Null output in nagios plugin when warning or critical #6

Closed unix196 closed 4 years ago

unix196 commented 4 years ago

We run nagios3 in Docker (~3400 checks). I added Vault monitoring to Nagios:

...
define service {
    use                     http-service
    host_name               some_server
    service_description     vault: expire token accessor token-for-nagios
    servicegroups           vault
    check_command           check_vault!token-lookup -address=some_server -token=s.O...5CK -token-accessor=ZQ...TA -output=nagios -warning=768h -critical=72h
}
cat /etc/nagios-plugins/config/vault.cfg
define command {
   command_name     check_vault
   command_line     /usr/lib/nagios/plugins/hashicorp-vault-monitor $ARG1$ 
}

When I run the check from the CLI, everything works:

 /usr/lib/nagios/plugins/hashicorp-vault-monitor token-lookup -address=some_server -token=s.O...5CK -token-accessor=ZQ...TA -output=nagios -warning=768h -critical=72h
vault WARNING - This (renewable) token will expire on Sat, 09 May 2020 05:32:23 UTC (1 week 5 days 1 hour 19 minutes 36 seconds left)
echo $?
1

Some other token:

/usr/lib/nagios/plugins/hashicorp-vault-monitor token-lookup -address=some_server -token="s.O...K" -token-accessor="hd...8" -output=nagios -warning=168h -critical=72h
vault OK - This (renewable) token will expire on Mon, 21 Jan 2030 08:52:06 UTC (9 years 38 weeks 5 days 4 hours 23 minutes 13 seconds left)
echo $?
0

In the Nagios web interface I see this: https://ibb.co/1K76F7p. I tried adding single and double quotes to my check "vault: expire token accessor token-for-nagios", but the null output remains. When the check shows null output in the web interface, I can see its real output in the Docker logs:

docker logs --tail 20 -f nagios3
vault WARNING - This (renewable) token will expire on Sat, 09 May 2020 05:32:23 UTC (1 week 5 days 49 minutes 28 seconds left)

I don't think the problem is with Nagios in Docker (we currently run 3000 different checks without any trouble). Also, when hashicorp-vault-monitor reports "OK", I see that in the web interface. The trouble occurs only when the plugin reports "Warning" or "Critical" (the null output also appears in other checks; I verified that).

/usr/lib/nagios/plugins/hashicorp-vault-monitor --version
HashiCorp Vault Monitor v0.8.4 ('7b2326ea73281891139e077aa39f2d91f83c493c+CHANGES')

For example, I found this thread, https://www.linuxquestions.org/questions/linux-software-2/nagios-interprets-perl-plugin-output-as-null-948605/, which describes a similar problem (with a plugin written in Perl):

My issue is resolved. My plugin does file IO and wasn't opening a file for reading. Works from the command line possibly because I ran the script from the same directory as the script. Nagios runs the script from absolute path in another working directory.

Maybe this plugin has the same (or a similar) problem?

madrisan commented 4 years ago

Thank you for the report. I've an idea of the root cause of this issue...

madrisan commented 4 years ago

Is it possible to add 2>&1 at the end of the nagios command_line definition? Does this fix the null output issue?

unix196 commented 4 years ago

Yes, this works! That's basically enough, but it would be better to mention it in the documentation.

madrisan commented 4 years ago

I will fix this directly in the code. This is just a workaround and a test to see if I got the cause of the problem.
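
The success of the 2>&1 workaround is consistent with the WARNING and CRITICAL messages being written to stderr, while Nagios, as far as I know, only captures a plugin's stdout; the thread does not state this explicitly, so treat it as an assumption. In the command definition above, the workaround amounts to ending command_line with `$ARG1$ 2>&1`. The sketch below reproduces the effect in plain shell. `fake_plugin` and its message are hypothetical stand-ins, not the real hashicorp-vault-monitor binary.

```shell
#!/bin/sh
# Hypothetical stand-in for a plugin that reports problems on stderr
# and signals WARNING through exit status 1.
fake_plugin() {
    echo "vault WARNING - token expires soon" >&2
    return 1
}

# Capture stdout only (assumed to be roughly what Nagios sees).
# stderr is discarded here just to keep the terminal quiet.
stdout_only=$(fake_plugin 2>/dev/null) || status=$?

# Same call with stderr merged into stdout, as in the workaround.
merged=$(fake_plugin 2>&1) || status=$?

echo "stdout only: '${stdout_only}'"
echo "merged:      '${merged}'"
echo "exit status: ${status}"
```

An interactive terminal shows stdout and stderr together, which would explain why the check looked fine from the CLI while the web interface showed null output.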