Arbitrary set of preconfigured docker images to quickly build your monitoring stack.
Grafana's JS-based dashboards expect to reach graphite directly through:
For debugging you can use: $(boot2docker ip).xip.io:8080.
A container that acts as a single monitoring endpoint for your application. While shipping data, it allows you to:
Ports:
Example options:
TODO: Replace logstash with a lightweight implementation (small golang process).
Listens for logs (pulls logs from redis queue) and ships them to elasticsearch:9200.
Further log parsing will be added here. At the moment we also generate statsd metrics from incoming logs.
There are three environment variables which can be used to point logstash at elasticsearch and use SSL (or plain text) for transmission:
OPG_LOGSTASH_ELASTICSEARCH_HOSTPORT specifies the host:port combination of the elasticsearch host. Default is elasticsearch:9200. This has been tested pointing to an AWS Elasticsearch cluster (using the AWS ES service).
OPG_LOGSTASH_ELASTICSEARCH_SSL_ENABLED is a boolean value specifying whether transport to elasticsearch should be over SSL. Default is false.
OPG_LOGSTASH_ELASTICSEARCH_SSL_CERTIFICATE_VERIFICATION is a boolean value specifying whether the server certificate should be validated. Default is false (as the SSL default is false, to preserve backwards compatibility with existing setups where transport is plain text).
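As a rough illustration of how the documented defaults combine, the sketch below prints the endpoint logstash would be pointed at. The helper name logstash_es_url is hypothetical and is not part of the image; only the variable names and defaults come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: logstash_es_url is a hypothetical helper, not part of the image.
# Variable names and defaults are taken from the description above.
# OPG_LOGSTASH_ELASTICSEARCH_SSL_CERTIFICATE_VERIFICATION is not used here
# because it does not change the endpoint itself.

logstash_es_url() {
  hostport="${OPG_LOGSTASH_ELASTICSEARCH_HOSTPORT:-elasticsearch:9200}"
  ssl="${OPG_LOGSTASH_ELASTICSEARCH_SSL_ENABLED:-false}"
  if [ "$ssl" = "true" ]; then
    echo "https://$hostport"
  else
    echo "http://$hostport"
  fi
}

# Prints the endpoint for whatever is currently set in the environment.
logstash_es_url
```

With none of the variables set, the plain-text default endpoint is used.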
A pre-configured graphite.
A pre-configured grafana with autogenerating dashboards:
https://grafana.(domainname)/dashboard/script/overview.js?env=(metrics_prefix)
Collectd configured to feed metrics to a carbon collector on port 2003. For more information on collectd see https://collectd.org/.
This implementation of collectd comes with a few basic checks, as well as a custom btrfs check from https://github.com/soellman/docker-collectd
Demo web app container (to demonstrate use of monitoring proxy).
The default log_level is 'warn'. To change this, use the following environment variable:
Base checks are designed to go into the base.sls pillar, so they are applied to everything.
Environment variables (mandatory for each check):
where checkname is the name of the check.
Environment variables (optional for each check):
where checkname is the name of the check.
In addition to the ability to fully disable the handling of certain checks, Sensu supports ‘subduing’ checks so that they are not handled during certain hours of the day. This is done by configuring the "begin" and "end" times for the check’s "subdue" attribute.
e.g.
NOTE: BOTH must be supplied (BEGIN and END) or neither.
If you want to set a global subdue window for all checks, you can use the following variables:
NOTE: BOTH must be supplied (BEGIN and END) or neither.
If none of the subdue variables above are set, the default will be that NO SUBDUE window is defined.
An individual check based subdue window will override a global subdue window which will override the default.
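The precedence rules above can be sketched as a small function. The name resolve_subdue and its positional arguments are hypothetical and stand in for the per-check and global BEGIN/END variables; the time values in the usage comments are placeholders.

```shell
# Sketch only: resolve_subdue and its arguments are hypothetical names used to
# illustrate the precedence rules; they are not the image's own variables.
# Usage: resolve_subdue CHECK_BEGIN CHECK_END GLOBAL_BEGIN GLOBAL_END
resolve_subdue() {
  check_begin="$1"; check_end="$2"; global_begin="$3"; global_end="$4"
  if [ -n "$check_begin" ] && [ -n "$check_end" ]; then
    echo "$check_begin-$check_end"        # per-check window wins
  elif [ -n "$check_begin" ] || [ -n "$check_end" ]; then
    echo "error: check BEGIN and END must both be set" >&2
    return 1
  elif [ -n "$global_begin" ] && [ -n "$global_end" ]; then
    echo "$global_begin-$global_end"      # fall back to the global window
  elif [ -n "$global_begin" ] || [ -n "$global_end" ]; then
    echo "error: global BEGIN and END must both be set" >&2
    return 1
  else
    echo "none"                           # default: no subdue window
  fi
}

# resolve_subdue 9PM 5AM 1AM 2AM   -> per-check window
# resolve_subdue "" "" 1AM 2AM     -> global window
# resolve_subdue "" "" "" ""       -> no window
```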
For more information on Sensu checks please refer to https://sensuapp.org/docs/latest/checks
Role-based checks are designed to go into a role pillar, so they are used only on the set of servers in a particular role.
The variables for role checks are the same as those described above, with BASE in the variable name replaced by ROLE, e.g.
The opg-check-elasticrecent.sh plugin checks whether there are Elasticsearch entries for a specified index within a recent time period. The ELASTIC_INDEX variable should be set to the index to check. The default is logstash-$(date '+%Y.%m.%d'). The time period can be specified with the SENSU_CLIENT_ELASTIC_RECENT variable (see below). The default for this variable is 5m, and the value should follow the standard format for Elasticsearch Query DSL filters.
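To make the two defaults concrete, here is a minimal sketch of how they could feed a query. The plugin's real query is not shown in this document, so the @timestamp field, the range query shape, and the recent_query helper are all assumptions made for illustration.

```shell
# Sketch only: the real plugin's query is not shown in this document. The
# @timestamp field, the range query and recent_query are assumptions.
ELASTIC_INDEX="${ELASTIC_INDEX:-logstash-$(date '+%Y.%m.%d')}"
SENSU_CLIENT_ELASTIC_RECENT="${SENSU_CLIENT_ELASTIC_RECENT:-5m}"

# Build a query body matching documents newer than "now minus the period".
recent_query() {
  printf '{"query":{"range":{"@timestamp":{"gte":"now-%s"}}}}\n' "$1"
}

echo "index: $ELASTIC_INDEX"
recent_query "$SENSU_CLIENT_ELASTIC_RECENT"
```

A period such as 20m or 1h is passed through unchanged into the date-math expression.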
The opg-check-functions.sh plugin is designed to contain reusable functions that can be called by other plugins, e.g. the curl_and_return function can be used to return the output from a curl command back to the caller. Any functions that can be reused should be added to this plugin and should be defined with the function statement so that they are exported correctly for child processes to call.
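A minimal sketch of a shared function, assuming the plugins run under bash, where export -f is what makes a function visible to child bash processes. The body below is not the real curl_and_return; the curl options are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: this is not the body of the real curl_and_return. It shows one
# way a shared function can be made available to child bash processes.
function curl_and_return {
  # -s: silent, -f: non-zero exit on HTTP errors, --max-time: do not hang
  curl -sf --max-time 10 "$@"
}

# In bash, export -f places the function in the environment of child shells.
export -f curl_and_return
```

A child process started with bash -c can then call curl_and_return directly.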
The opg-check-wrapper.sh plugin is designed to invoke both community and in-house developed plugins. It sets up a number of variables that can be used by in-house developed plugins, based on environment variables passed into the docker container at startup by an env file:
The default values plus an explanation of each variable can be found in the comments within the wrapper plugin. Any additional variables used by newly developed plugins should be set up with defaults in the same way. Other variables passed in via docker environment files can be passed straight to community plugins.
The wrapper also sets up a variable DEFAULT_HOST, which is the IP address of the default gateway (which inside docker is the method for gaining access to ports exposed by other containers to the host).
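One way to obtain the default gateway address is sketched below. The wrapper's actual method is not shown in this document; the default_gateway helper and the use of the ip command are assumptions.

```shell
# Sketch only: the wrapper's actual method is not shown in this document.
# default_gateway is a hypothetical helper that reads `ip route` style output
# on stdin and prints the gateway of the first default route.
default_gateway() {
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Usage inside a container (requires the iproute2 `ip` command):
#   DEFAULT_HOST="$(ip route | default_gateway)"
```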
To invoke a plugin via the wrapper, simply set OPG_CHECK to whatever should be run to initiate the check; the wrapper will execute it and return its exit code back to Sensu. In-house written checks should therefore follow the standard exit codes for Sensu as described at https://sensuapp.org/docs/latest/checks
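The pass-through behaviour can be sketched as follows. This is a simplified stand-in, not the wrapper's real source; run_opg_check is a hypothetical name. Sensu treats exit code 0 as OK, 1 as WARNING, 2 as CRITICAL, and anything else as UNKNOWN.

```shell
# Sketch only: a simplified stand-in for the wrapper, not its real source.
# Runs the command held in OPG_CHECK and returns its exit status unchanged.
run_opg_check() {
  if [ -z "${OPG_CHECK:-}" ]; then
    echo "UNKNOWN: OPG_CHECK is not set"
    return 3
  fi
  # eval in a subshell so variables such as $DEFAULT_HOST inside OPG_CHECK are
  # expanded at run time and an `exit` in the check cannot end the caller.
  ( eval "$OPG_CHECK" )
}

# Example:
#   DEFAULT_HOST=10.0.0.1
#   OPG_CHECK='echo "checking $DEFAULT_HOST"; exit 1'
#   run_opg_check    # prints the message and returns 1 (WARNING)
```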
The following are examples of pillar defining checks that use the wrapper (both in-house developed as well as community plugin checks):
SENSU_CLIENT_CHECKS_ROLE_ESRECENT_NAME: esrecent
SENSU_CLIENT_CHECKS_ROLE_ESRECENT_COMMAND: "export ELASTIC_INDEX='.marvel-$(date +%Y.%m.%d)' ; export SENSU_CLIENT_ELASTIC_RECENT=20m ; export OPG_CHECK='/etc/sensu/plugins/opg-check-elasticrecent.sh' ; /etc/sensu/plugins/opg-check-wrapper.sh"
SENSU_CLIENT_CHECKS_ROLE_REDIS_NAME: redis
SENSU_CLIENT_CHECKS_ROLE_REDIS_COMMAND: "export OPG_CHECK='/opt/sensu/embedded/bin/check-redis-ping.rb --host $DEFAULT_HOST --port $SENSU_CLIENT_REDIS_PORT' ; /etc/sensu/plugins/opg-check-wrapper.sh"
SENSU_CLIENT_CHECKS_ROLE_KIBANA_NAME: kibana
SENSU_CLIENT_CHECKS_ROLE_KIBANA_COMMAND: "export OPG_CHECK='/opt/sensu/embedded/bin/check-http.rb --url http://$DEFAULT_HOST:$SENSU_CLIENT_KIBANA_PORT' ; /etc/sensu/plugins/opg-check-wrapper.sh"
SENSU_CLIENT_CHECKS_ROLE_ELASTIC_NAME: elastic
SENSU_CLIENT_CHECKS_ROLE_ELASTIC_COMMAND: "export OPG_CHECK='/opt/sensu/embedded/bin/check-http.rb --url http://$DEFAULT_HOST:$SENSU_CLIENT_ELASTIC_PORT' ; /etc/sensu/plugins/opg-check-wrapper.sh"
SENSU_CLIENT_CHECKS_ROLE_ESHEAP_NAME: esheap
SENSU_CLIENT_CHECKS_ROLE_ESHEAP_COMMAND: "export OPG_CHECK='/opt/sensu/embedded/bin/check-es-heap.rb --host $DEFAULT_HOST --port $SENSU_CLIENT_ELASTIC_PORT -P -w 80 -c 90' ; /etc/sensu/plugins/opg-check-wrapper.sh"
SENSU_CLIENT_CHECKS_ROLE_ESNODE_NAME: esnode
SENSU_CLIENT_CHECKS_ROLE_ESNODE_COMMAND: "export OPG_CHECK='/opt/sensu/embedded/bin/check-es-node-status.rb --host $DEFAULT_HOST --port $SENSU_CLIENT_ELASTIC_PORT' ; /etc/sensu/plugins/opg-check-wrapper.sh"
Environment variables with defaults:
Environment variables for the PagerDuty handler:
Environment variables for the Slack handler:
Environment variables for the AWS SNS handler:
OPG Uchiwa was originally designed to run alongside a local Sensu API service on the same host (in the same docker-compose) called sensuapi. This was achieved by setting a number of environment variables for docker that defined the single end point in the /etc/sensu/uchiwa.json configuration file (see sample env file in this repository) and adding a link in the docker-compose file.
This method has been preserved for backwards compatibility so that existing environments can run unchanged with the version that supports multiple end points. If none are defined it reverts to the default config.
To configure multiple Sensu API end points (Data Centers) you need to define how many and then define each one using the following variables:
UCHIWA_SENSU_MULTIPLE (number of data centres minus one - indexing starts from zero)
UCHIWA_SENSU_DC_name_NAME
UCHIWA_SENSU_DC_name_HOST
UCHIWA_SENSU_DC_name_PORT
UCHIWA_SENSU_DC_name_SSL
UCHIWA_SENSU_DC_name_INSECURE
UCHIWA_SENSU_DC_name_USER
UCHIWA_SENSU_DC_name_PASS
UCHIWA_SENSU_DC_name_TIMEOUT
where name is a unique name for each data center defined. The number of data centers defined, minus one, should equal UCHIWA_SENSU_MULTIPLE (as the index of elements begins at zero).
It is VERY IMPORTANT that UCHIWA_SENSU_MULTIPLE accurately reflects the actual number of Data Centers defined and that each is fully defined. Anything not matching is likely to result in Uchiwa failing to start (either through confd errors or invalid JSON being rendered).
See the env file in this repository for examples of multiple data centers.
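As a rough sketch of a consistent two data center definition, together with a sanity check that could be run before starting the container: the names DC1 and DC2, all values, and the check_dc and check_all helpers are hypothetical and are not part of the image.

```shell
# Sketch only: DC1/DC2, every value below, and the check_* helpers are
# hypothetical. They illustrate a definition that is complete and consistent.
UCHIWA_SENSU_MULTIPLE=1                 # two data centers, counted from zero
UCHIWA_SENSU_DC_DC1_NAME=dc1
UCHIWA_SENSU_DC_DC1_HOST=sensu-dc1.example.invalid
UCHIWA_SENSU_DC_DC1_PORT=4567
UCHIWA_SENSU_DC_DC1_SSL=false
UCHIWA_SENSU_DC_DC1_INSECURE=false
UCHIWA_SENSU_DC_DC1_USER=
UCHIWA_SENSU_DC_DC1_PASS=
UCHIWA_SENSU_DC_DC1_TIMEOUT=5
UCHIWA_SENSU_DC_DC2_NAME=dc2
UCHIWA_SENSU_DC_DC2_HOST=sensu-dc2.example.invalid
UCHIWA_SENSU_DC_DC2_PORT=4567
UCHIWA_SENSU_DC_DC2_SSL=false
UCHIWA_SENSU_DC_DC2_INSECURE=false
UCHIWA_SENSU_DC_DC2_USER=
UCHIWA_SENSU_DC_DC2_PASS=
UCHIWA_SENSU_DC_DC2_TIMEOUT=5

# Succeeds only if all eight variables exist for the named data center.
# Empty values are accepted; unset variables are not.
check_dc() {
  for suffix in NAME HOST PORT SSL INSECURE USER PASS TIMEOUT; do
    eval "defined=\${UCHIWA_SENSU_DC_${1}_${suffix}+yes}"
    if [ "$defined" != "yes" ]; then
      echo "missing UCHIWA_SENSU_DC_${1}_${suffix}" >&2
      return 1
    fi
  done
}

# Succeeds only if the number of names given equals UCHIWA_SENSU_MULTIPLE + 1
# and every named data center is fully defined.
check_all() {
  if [ "$#" -ne $((UCHIWA_SENSU_MULTIPLE + 1)) ]; then
    echo "UCHIWA_SENSU_MULTIPLE=$UCHIWA_SENSU_MULTIPLE does not match $# data centers" >&2
    return 1
  fi
  for dc in "$@"; do
    check_dc "$dc" || return 1
  done
}

check_all DC1 DC2 && echo "data center definitions look consistent"
```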
This monitoring stack is designed to be run alongside the opg-core app stack (https://github.com/ministryofjustice/opg-core-docker) under a local boot2docker or native docker installation.
To bring up the stack:
$ cd /path/to/opg-docker-monitoring-repo
$ docker-compose up -d
Using your docker/boot2docker host IP:
$ boot2docker ip
192.168.59.103
If running the opg-core-docker app stack at the same time:
URLs based on xip.io: