naemon / naemon-core

Networks, Applications and Event Monitor
http://www.naemon.io/
GNU General Public License v2.0

Plugins execvp problem (for macros) #105

Open mboden opened 9 years ago

mboden commented 9 years ago

Hello,

According to the manual, this is the syntax for passing macros as environment variables:

define command {
    command_name    check_hpa
    command_line    NAEMON_HOSTNAME=$HOSTNAME$ $USER2$/check_snmp_hpa/check_snmp_hpa.rb -H $HOSTADDRESS$ $_HOSTSNMP$ --directory /etc/naemon/conf.d/dynamic/check_snmp_hpa/
}

This doesn't work, and the service generates this output:

(No output on stdout) stderr: execvp(NAEMON_HOSTNAME=SC_cswitch01, ...) failed. errno is 2: No such file or directory
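The error is consistent with the whole first word, NAEMON_HOSTNAME=SC_cswitch01, being handed to execvp as the program name. A leading VAR=value assignment is shell syntax, so it only works when a shell parses the line. Below is a minimal Python sketch of the difference, not naemon's actual code: subprocess without a shell stands in for execvp, shell=True stands in for system, and the helper names are made up for illustration.

```python
import subprocess

def run_direct(argv):
    """Run argv without a shell, roughly what fork + execvp does."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except FileNotFoundError as exc:
        # The first word is looked up as a program name and is not found.
        return None, f"exec failed, errno {exc.errno}"
    return proc.returncode, proc.stdout.strip()

def run_via_shell(command_line):
    """Let /bin/sh parse the line, roughly what system() does."""
    proc = subprocess.run(command_line, shell=True, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()

# Same command, once as a plain argv and once as a shell command line.
direct_result = run_direct(
    ["NAEMON_HOSTNAME=SC_cswitch01", "sh", "-c", "echo $NAEMON_HOSTNAME"]
)
shell_result = run_via_shell(
    "NAEMON_HOSTNAME=SC_cswitch01 sh -c 'echo $NAEMON_HOSTNAME'"
)
print(direct_result)
print(shell_result)
```

Only the shell variant treats the assignment as an environment variable for the command that follows it.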

Max Sikström gave me a tip: certain characters in the command line cause naemon to run it with system instead of execvp.

By adding $OVE at the end of the command_line it starts to work again, so perhaps the heuristics for choosing between execvp and system need to be updated to handle this case.
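One way such a heuristic could account for a leading environment assignment is sketched below in Python. This is not naemon's implementation: needs_shell, SHELL_META and the exact character set are assumptions chosen for illustration.

```python
import re

# Hypothetical set of characters that only a shell can interpret.
SHELL_META = set("|&;<>()$`\\\"'*?[]#~!{}\n")

# A first word such as NAEMON_HOSTNAME=value is a shell-style assignment.
ENV_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")

def needs_shell(command_line: str) -> bool:
    """Return True when the line should go to a shell instead of execvp."""
    stripped = command_line.lstrip()
    if any(ch in SHELL_META for ch in stripped):
        return True
    first_word = stripped.split(None, 1)[0] if stripped else ""
    return bool(ENV_ASSIGNMENT.match(first_word))

print(needs_shell("NAEMON_HOSTNAME=SC_cswitch01 /opt/plugins/check.rb -H 192.0.2.1"))
print(needs_shell("/opt/plugins/check.rb -H 192.0.2.1"))
```

The second check is the addition: a command line with no metacharacters at all is still sent to the shell when its first word looks like VAR=value.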

Regards Magnus

mboden commented 9 years ago

Also, I think it would be nice if custom macros were available as environment variables by default. Since there will not be many of them, and only for certain hosts, I don't think it will impact performance noticeably.

Regards Magnus

ageric commented 9 years ago

The idea is to make macros a lot smarter, where you can specify something like $HOST:_some_var?default_value$ and the default_value string is used if $HOST:_some_var$ doesn't actually resolve to anything. That should solve the real problem behind your additional request in a generic way that's useful for other things as well, assuming I understand the actual problem, of course.
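To make the proposed syntax concrete, here is a rough Python sketch of how such a macro with a fallback could be expanded. The syntax is only an idea at this point, so the regular expression, the expand function and the shape of the objects dictionary are all assumptions, not anything naemon provides.

```python
import re

# Hypothetical grammar: $OBJECT:_custom_var$ or $OBJECT:_custom_var?default$
MACRO = re.compile(
    r"\$(?P<obj>[A-Z]+):(?P<var>_[A-Za-z0-9_]+)(?:\?(?P<default>[^$]*))?\$"
)

def expand(text, objects):
    """Replace each macro with its value, or with the default if unset."""
    def substitute(match):
        value = objects.get(match["obj"], {}).get(match["var"])
        if value:
            return value
        # Nothing resolved: fall back to the default, or to an empty string.
        return match["default"] or ""
    return MACRO.sub(substitute, text)

with_value = expand("--levels $HOST:_CPFWLEVELS?80,90$",
                    {"HOST": {"_CPFWLEVELS": "70,85"}})
with_default = expand("--levels $HOST:_CPFWLEVELS?80,90$", {"HOST": {}})
print(with_value)
print(with_default)
```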

mboden commented 9 years ago

Hello,

Thanks, yes, that would be a lot better than today's approach with all the environment variables. Personally I think it can be done even better.

To help you understand how I monitor things today, with nagios in production and naemon in my test environment, here is some background.

I want to keep the object configuration as generic as possible.

I have my own plugin for monitoring checkpoint firewalls with this command declaration:

# This is from my current nagios setup since I have the small problem of getting the macros to my process
define command {
    command_name    check_cpfw
    command_line    $USER2$/check_snmp_cpfw/check_snmp_cpfw.rb -H $HOSTADDRESS$ $_HOSTSNMP$ --directory /etc/nagios3/conf.d/dynamic/check_snmp_cpfw/
}

define host {
    use             host-template-sc,snmp-sc
    host_name       XXXXXXXXXXXXXX
    address         X.X.X.X
    hostgroups      Customer_XX_Firewall,obj_checkpoint
    _CPFWPOLICY     fw_policy
}

Since I use $_HOSTSNMP$, I can use the same command declaration for both snmp v2 and v3, for example. Depending on the snmp settings, I "use" the matching host template, which also simplifies configuration.

define host {
    name        snmp-sc
    _SNMP       -2 -C XXXXXXXXXXXX
    register    0
}

define host {
    name        snmp-sc-v3
    _SNMP       -l XXXXXX -x XXXXXXXXXXXX
    register    0
}
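As a rough illustration of why one command declaration covers both snmp versions, the Python sketch below resolves _SNMP through the "use" chain and substitutes it for $_HOSTSNMP$. It is a simplified model of template inheritance, not how nagios or naemon implement it; the function names, the example address and the shortened command line are assumptions.

```python
# Templates from the thread, modelled as plain dictionaries.
TEMPLATES = {
    "snmp-sc": {"_SNMP": "-2 -C XXXXXXXXXXXX"},
    "snmp-sc-v3": {"_SNMP": "-l XXXXXX -x XXXXXXXXXXXX"},
}

def resolve_custom_var(obj, name, templates):
    """Look up a custom variable on the object, then on its templates."""
    if name in obj:
        return obj[name]
    for template_name in obj.get("use", "").split(","):
        template = templates.get(template_name.strip())
        if template is None:
            continue  # template not modelled in this sketch
        value = resolve_custom_var(template, name, templates)
        if value is not None:
            return value
    return None

def expand_command_line(command_line, host, templates):
    """Substitute the two macros used in this example."""
    snmp = resolve_custom_var(host, "_SNMP", templates) or ""
    return (command_line
            .replace("$_HOSTSNMP$", snmp)
            .replace("$HOSTADDRESS$", host["address"]))

COMMAND_LINE = "check_snmp_cpfw.rb -H $HOSTADDRESS$ $_HOSTSNMP$"
v2_host = {"use": "host-template-sc,snmp-sc", "address": "192.0.2.10"}
v3_host = {"use": "host-template-sc,snmp-sc-v3", "address": "192.0.2.11"}
print(expand_command_line(COMMAND_LINE, v2_host, TEMPLATES))
print(expand_command_line(COMMAND_LINE, v3_host, TEMPLATES))
```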

My plugin uses snmp to check dynamically which features to monitor, since they can differ depending on what is configured on the firewall, for example whether it is a cluster or not, and several other things. When certain snmp oids are available, my plugin creates new service files in the folder specified with the --directory argument and then sends a nagios reload as an external command.
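The file-creation step could look roughly like the Python sketch below. It is not the actual plugin (which is written in Ruby): the function names, the file naming scheme and the generic-service template are assumptions, and the template is assumed to supply the remaining required directives such as check_command.

```python
import tempfile
from pathlib import Path

def render_passive_service(host_name, description, template="generic-service"):
    """Return a service definition that only accepts passive results."""
    return (
        "define service {\n"
        f"    use                     {template}\n"
        f"    host_name               {host_name}\n"
        f"    service_description     {description}\n"
        "    active_checks_enabled   0\n"
        "    passive_checks_enabled  1\n"
        "}\n"
    )

def write_if_new(directory, host_name, description):
    """Write the service file once; return True when a reload is needed."""
    file_name = f"{host_name}_{description.replace(' ', '_')}.cfg"
    path = Path(directory) / file_name
    if path.exists():
        return False
    path.write_text(render_passive_service(host_name, description))
    return True

# Demo in a throwaway directory standing in for the --directory argument.
with tempfile.TemporaryDirectory() as directory:
    first = write_if_new(directory, "fw01", "Cluster State")
    second = write_if_new(directory, "fw01", "Cluster State")
print(first, second)
```

Returning a flag keeps the reload to the runs where a new file was actually written.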

These optional custom host macros are checked by my plugin to control certain thresholds. If they are not available, the plugin uses a hardcoded default. I would like to be able to assign them to hostgroups, since that would be a handy way to set defaults.

_CPFWPOLICY PolicyName
    Verifies that the firewall has firewall PolicyName installed.
_CPFWLEVELS x,y
    Checks the state table size. > x = Warning, > y = Critical.
_CPFWCPULEVELS warn=x,crit=y,duration=z
    Warning, critical and duration levels for CPU utilization.
_CPFWMEMLEVELS warn=x,crit=y,duration=z
    Warning, critical and duration levels for Memory utilization.
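A small Python sketch of how a plugin might parse the warn=x,crit=y,duration=z form and fall back to built-in defaults when the macro is not set. The default numbers and the function name are invented for illustration; the real plugin's values are not given in this thread.

```python
# Hypothetical built-in defaults, used when the host defines no macro.
CPU_DEFAULTS = {"warn": 80.0, "crit": 90.0, "duration": 300.0}

def parse_levels(raw, defaults):
    """Parse 'warn=x,crit=y,duration=z', keeping defaults for missing keys."""
    levels = dict(defaults)
    if not raw:
        return levels
    for item in raw.split(","):
        key, sep, value = item.partition("=")
        key = key.strip()
        if sep and key in levels:
            levels[key] = float(value)
    return levels

print(parse_levels("warn=70,crit=95", CPU_DEFAULTS))
print(parse_levels(None, CPU_DEFAULTS))
```

Unknown keys are ignored here, and a non-numeric value raises ValueError, which a real plugin would probably want to report as UNKNOWN.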

To me, -c and -w don't cut it: there are a lot of different thresholds, because I don't write a plugin for one particular service but to monitor all services on the host (I work mostly with networking equipment, where everything is available through snmp).

All the created services are defined as passive services and the master Checkpoint service submits the passive results for all of them.
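For reference, a passive result is submitted with the PROCESS_SERVICE_CHECK_RESULT external command. The Python sketch below only formats the line; the helper name and the example host and service are made up, and the location of the external command file depends on the installation, so it is left out.

```python
import time

def passive_result_line(host_name, service_description, return_code, output,
                        timestamp=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if timestamp is None:
        timestamp = int(time.time())
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host_name};{service_description};{return_code};{output}\n"
    )

# Return codes follow the plugin convention: 0 OK, 1 WARNING, 2 CRITICAL.
line = passive_result_line("fw01", "Memory", 1, "WARNING - counter changed",
                           timestamp=1400000000)
print(line, end="")
```

The master service would write one such line per dynamically created service to the external command file.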

Look at the attached image for an example.

What I think we could do better is perhaps to rename the host macros to properties and make them available only to the relevant plugin, apart from certain global properties, like the one I use for $_HOSTSNMP$, which would need to be available to all plugins. For example, a property called _CPFW_X would only be available to a check command called cpfw.
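The scoping rule could be sketched in Python like this. It is purely a model of the suggestion, with hypothetical names throughout, and it assumes the prefix is simply the upper-cased command name between underscores.

```python
def visible_properties(custom_vars, command_name, global_names=("_SNMP",)):
    """Return only the properties a given check command is allowed to see."""
    prefix = f"_{command_name.upper()}_"
    return {
        name: value
        for name, value in custom_vars.items()
        if name in global_names or name.startswith(prefix)
    }

host_properties = {
    "_SNMP": "-2 -C XXXXXXXXXXXX",   # global: visible to every plugin
    "_CPFW_POLICY": "fw_policy",     # scoped to the cpfw command
    "_HPA_LEVELS": "10,20",          # scoped to a different command
}
print(visible_properties(host_properties, "cpfw"))
```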

Another feature I would like is for a plugin to be able to know when something has been acknowledged. When I started this plugin I wasn't using livestatus on my nagios installation, so my plugin actually parses status.dat to check whether a service has been acknowledged.

I use this, for example, for certain memory allocation failure counters that can be triggered by DDoS attacks or port scans. An occasional failure is not very critical, but if they happen all the time they should be handled. There are a few such counters, and I can't check them against a fixed value, because what matters is how often they change, not the actual value. My plugin stores the value of these counters in a file under /tmp/, and if they change, the Memory service goes to warning or critical. The only way to reset the counters is to restart the firewall, which is not going to happen, so instead, when the service is acknowledged, the new value is stored and the service turns green. In the near future I will check this through livestatus instead of parsing status.dat.
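The acknowledge-to-reset logic can be written as a small pure function. This Python sketch is a simplified model, not the plugin itself: it always returns WARNING where the real plugin may choose warning or critical, and how the baseline is persisted and how the acknowledged flag is obtained are left out.

```python
OK, WARNING = 0, 1

def evaluate_counter(current, baseline, acknowledged):
    """Return (status, baseline to store) for one failure counter."""
    if baseline is None or current == baseline:
        # First run, or nothing changed since the stored value.
        return OK, current
    if acknowledged:
        # Acknowledging accepts the new value as the baseline.
        return OK, current
    # Counter moved and nobody has acknowledged it yet: keep the old baseline.
    return WARNING, baseline

print(evaluate_counter(12, 10, acknowledged=False))
print(evaluate_counter(12, 10, acknowledged=True))
```

Keeping the old baseline while unacknowledged is what makes the service stay in warning until someone has looked at it.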

I will investigate if I can get the host macros through livestatus instead of using environment variables. What is your position on making plugins dependent on livestatus?
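As a starting point for that investigation, here is a Python sketch of the query and the response handling. It assumes the livestatus hosts table exposes the custom_variable_names and custom_variable_values columns and JSON output, as MK Livestatus documents; the function names and the sample response are made up, and the socket connection is left out because its path depends on the installation.

```python
import json

def host_custom_vars_query(host_name):
    """Build a livestatus query for one host's custom variables."""
    return (
        "GET hosts\n"
        "Columns: custom_variable_names custom_variable_values\n"
        f"Filter: name = {host_name}\n"
        "OutputFormat: json\n"
        "\n"
    )

def parse_custom_vars(response_text):
    """Turn the JSON response into a name -> value dictionary."""
    rows = json.loads(response_text)
    if not rows:
        return {}
    names, values = rows[0]
    return dict(zip(names, values))

# Invented sample response with one row for one host.
sample = '[[["SNMP", "CPFWPOLICY"], ["-2 -C XXXXXXXXXXXX", "fw_policy"]]]'
print(host_custom_vars_query("fw01"), end="")
print(parse_custom_vars(sample))
```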

What I like about my setup is that my colleagues who don't work much with nagios/naemon can just add a host object, and if it is in the correct hostgroup everything will be monitored. I think this is the easiest way for my colleagues, but it is fairly complex for the admin. If you have any thoughts on how I can make my configuration design better, they would be appreciated.

I have the same setup for a lot of networking equipment like HP switches, Cisco ACE loadbalancers, Clavister firewalls, Ironport smtp gateways, Meru wireless controllers, Mitel voip and Tippingpoint IPS'es.

The top priority for me is to have the services on each host be dynamic. I don't want to have to specify whether feature X and feature Y should be monitored; the plugin should handle that by itself. A little like check-mk, but dynamic, and it doesn't require me to run a cli command to update all hosts.

I would very much like to help you guys with naemon. My C coding skills are mediocre at best, since I don't do much of it, but I am open to suggestions on how I can be of service. I am really interested in monitoring.

Best Regards Magnus

image