Perfservmon is a Nagios plugin for IBM WebSphere Application Server (WAS) that uses the perfservlet web application shipped with each WAS installation. It has minimal library dependencies, so it can be used easily in most environments.
The plugin can monitor the following WAS metrics of a WebSphere Cell:
<WAS_ROOT>/installableApps, i.e. this would be in /opt/IBM/WebSphere/AppServer/installableApps in a Unix system.
The plugin is tested to work with WAS Traditional version 8.5 and 9.0.
Copy the perfservmon.py file to the $USER1$ path, which is the Nagios plugins path. You will probably find the value of this variable in the Nagios resource.cfg file (usually it is a libexec directory).
Add the following lines to the Nagios command.cfg file:
#Check_perfservlet commands
#The -H -u -p parameters are optional
#depending on whether you use https and/or Basic Auth credentials to access the perfservlet
define command{
command_name check_perfserv_retriever
command_line $USER1$/perfservmon.py -C $ARG1$ retrieve -N $ARG2$ -P $ARG3$ -H $ARG4$ -u $ARG5$ -p $ARG6$
}
define command{
command_name check_perfserv_show
command_line $USER1$/perfservmon.py -C $ARG1$ show -n $ARG2$ -s $ARG3$ -M $ARG4$ -c $ARG5$ -w $ARG6$
}
define command{
command_name check_perfserv_show_dcp
command_line $USER1$/perfservmon.py -C $ARG1$ show -n $ARG2$ -s $ARG3$ -M DBConnectionPoolPercentUsed -j $ARG4$ -c $ARG5$ -w $ARG6$
}
define command{
command_name check_perfserv_show_sib
command_line $USER1$/perfservmon.py -C $ARG1$ show -n $ARG2$ -s $ARG3$ -M SIBDestinations -d $ARG4$ -c $ARG5$ -w $ARG6$
}
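The command lines above share one shape: a global -C option, then a retrieve or show action with its own flags. As a rough illustration only (this is not the plugin's actual source), a command line of that shape could be parsed with argparse as below. The flag names come from the command definitions above; the dest names, the http default, and the integer thresholds are assumptions made for the sketch.

```python
import argparse


def build_parser():
    """Sketch of a parser matching the command lines defined above."""
    parser = argparse.ArgumentParser(prog="perfservmon.py")
    parser.add_argument("-C", dest="cell", required=True, help="WAS cell name")
    actions = parser.add_subparsers(dest="action", required=True)

    # retrieve: collect data from the perfservlet application
    retrieve = actions.add_parser("retrieve")
    retrieve.add_argument("-N", dest="host", required=True)
    retrieve.add_argument("-P", dest="port", required=True)
    # Assumption: http is used when -H is omitted
    retrieve.add_argument("-H", dest="protocol", choices=["http", "https"], default="http")
    retrieve.add_argument("-u", dest="user")
    retrieve.add_argument("-p", dest="password")
    retrieve.add_argument("--ignorecert", action="store_true")

    # show: report one metric of one server from the collected data
    show = actions.add_parser("show")
    show.add_argument("-n", dest="node", required=True)
    show.add_argument("-s", dest="server", required=True)
    show.add_argument("-M", dest="metric", required=True)
    show.add_argument("-j", dest="jndi_name")
    show.add_argument("-d", dest="destination")
    show.add_argument("-c", dest="critical", type=int)
    show.add_argument("-w", dest="warning", type=int)
    return parser
```

For example, build_parser().parse_args(["-C", "Cell01", "show", "-n", "Node01", "-s", "server1", "-M", "Heap", "-c", "90", "-w", "80"]) yields a namespace with action set to "show" and critical set to 90.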
Before defining a service that uses check_perfserv_show, add the following service definition to the Nagios config file of the WAS server, or of the DMgr server in an ND architecture:
define service{
use local-service
host_name <WAS_Host>
service_description Collect PerfServlet data from Cell
check_command check_perfserv_retriever!<WAS_Cell_Name>!<PerfServ_hostname>!<PerfServ_Port>![http|https]!userid!passwd![--ignorecert]
}
Where:
PerfServ_Port = the port on which the perfservlet application runs
Optionally set the HTTP protocol (http or https) and/or the Basic Authentication credentials for accessing the perfservlet application. For an https connection you may use the --ignorecert option to ignore any TLS certificate issues, although this is not recommended.
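To make the protocol, credential and certificate options concrete, here is a minimal sketch of how such a request could be prepared with the Python standard library. It is not taken from the plugin. The servlet path in PERFSERVLET_PATH is an assumption (the usual context root of the perfservlet application) and should be adjusted to your deployment; build_request is a hypothetical helper name.

```python
import base64
import ssl
import urllib.request

# Assumption: common location of the perfservlet application; adjust if yours differs.
PERFSERVLET_PATH = "/wasPerfTool/servlet/perfservlet"


def build_request(host, port, protocol="http", user=None, password=None, ignore_cert=False):
    """Return (request, ssl_context) for fetching the perfservlet output."""
    url = f"{protocol}://{host}:{port}{PERFSERVLET_PATH}"
    request = urllib.request.Request(url)

    # Basic Authentication is added only when both credentials are given
    if user is not None and password is not None:
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")

    context = None
    if protocol == "https":
        context = ssl.create_default_context()
        if ignore_cert:
            # Equivalent in spirit to --ignorecert: skip certificate validation
            context.check_hostname = False
            context.verify_mode = ssl.CERT_NONE
    return request, context
```

The pair can then be passed to urllib.request.urlopen(request, context=context, timeout=30). Disabling verification removes protection against man-in-the-middle attacks, which is why it is best kept for test environments.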
This check collects the relevant perfservlet data for all nodes and servers of the cell and stores it locally as a Python shelve file.
If you want, for example, to change the check interval of the above service so that the WAS data is refreshed more frequently, you may add the following lines to the Nagios template.cfg:
define service{
name collector-service ; The name of this service template
use local-service ; Inherit default values from the local-service definition
max_check_attempts 2 ; Re-check the service up to 2 times in order to determine its final (hard) state
normal_check_interval 3 ; Check the service every 3 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
Then the collector service definition should be like the following:
define service{
use collector-service
host_name <WAS_Host>
service_description Collect PerfServlet data from Cell
check_command check_perfserv_retriever!<WAS_Cell_Name>!<PerfServ_hostname>!<PerfServ_Port>![http|https]!userid!passwd
}
define service{
use local-service
host_name <WAS_Host>
service_description WAS Heap usage
check_command check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!Heap!<Critical Percentage>!<Warning Percentage>
}
define service{
use local-service
host_name <WAS_Host>
service_description WAS WebContainer ThreadPool Usage
check_command check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!WebContainer!<Critical Percentage>!<Warning Percentage>
}
Shows all the available connection pools of the WAS Server and raises an alert when any of them exceeds the percentage limits.
define service{
use local-service
host_name <WAS_Host>
service_description WAS ConnectionPool Usage
check_command check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!DBConnectionPoolPercentUsed!<Critical Percentage>!<Warning Percentage>
}
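The "alert when any pool exceeds the limits" rule can be sketched as follows. This is an illustration under assumptions, not the plugin's code: the exit codes are the standard Nagios plugin return codes, while the strict "greater than" comparison, the message format and the evaluate_pools name are choices made for the example.

```python
# Standard Nagios plugin return codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_pools(pools, warning, critical):
    """Return (exit_code, message) for a {pool_name: percent_used} mapping."""
    if not pools:
        return UNKNOWN, "UNKNOWN - no connection pool data found"

    worst = OK
    for percent in pools.values():
        # Assumption: a pool alerts only when it strictly exceeds a limit
        if percent > critical:
            worst = max(worst, CRITICAL)
        elif percent > warning:
            worst = max(worst, WARNING)

    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[worst]
    details = ", ".join(f"{name}: {pct}%" for name, pct in sorted(pools.items()))
    return worst, f"{label} - {details}"
```

With a warning limit of 80 and a critical limit of 90, a single pool at 95% is enough to make the whole service CRITICAL, even if every other pool is idle.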
Shows a specific connection pool of the WAS Server (specified with its JNDI name).
define service{
use local-service
host_name <WAS_Host>
service_description WAS ConnectionPool JNDI_name Usage
check_command check_perfserv_show_dcp!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!<JNDI_name>!<Critical Percentage>!<Warning Percentage>
}
Shows the total live HTTP sessions together with the individual (per-module) HTTP sessions. Raises an alert when the total sessions exceed the limits.
define service{
use local-service
host_name <WAS_Host>
service_description WAS Http Live Sessions
check_command check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!LiveSessions!<Critical No of Sessions>!<Warning No of Sessions>
}
define service{
use local-service
host_name <WAS_Host>
service_description WAS ORB ThreadPool Usage
check_command check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!ORB!<Critical Percentage>!<Warning Percentage>
}
define service{
use local-service
host_name <WAS_Host>
service_description My Topic Space
check_command check_perfserv_show_sib!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!<MyTopicSpaceName>!<No_Messages_Critical>!<No_Messages_Warning>
}
define service{
use local-service
host_name <WAS_Host>
service_description My Exception Destination
check_command check_perfserv_show_sib!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!_SYSTEM.Exception.Destination.<WAS_Node_Name>.<WAS_server_name>-<SIBus_Name>!<No_Messages_Critical>!<No_Messages_Warning>
}