This module is a replacement for the puppet_metrics_dashboard module. It is used to configure Telegraf, InfluxDB, and Grafana to collect, store, and display metrics collected from Puppet services. By default, those components are installed on a separate Dashboard node by applying the base class of this module to that node. That class will automatically query PuppetDB for Puppet Infrastructure nodes (Primary server, Compilers, PuppetDB hosts, PostgreSQL hosts) or you can specify them via associated class parameters. It is not recommended to apply the base class of this module to one of your Puppet Infrastructure nodes.
If you are applying either the puppet_operational_dashboards or puppet_operational_dashboards::telegraf::agent classes to a node that cannot access the internet, it is possible to install packages from either an internal repository or archive sources located within the air gap.
To use internal repositories, set the following parameters.
For InfluxDB, the following parameters are configurable:
Alternatively, you may set manage_repo to false to manage the repository configuration yourself.
For Grafana, the equivalent parameters are:
For Telegraf, the following parameters are available:
GPG options are currently not configurable.
Via Hiera it will look like this:
influxdb::manage_repo: true
influxdb::archive_source: false
puppet_operational_dashboards::profile::dashboards::manage_grafana_repo: true
puppet_operational_dashboards::telegraf::agent::manage_archive: false
puppet_operational_dashboards::telegraf::agent::manage_repo: true
That installs all components from their upstream repositories. To use packages but manage the repositories yourself, for example because you use Katello or Red Hat Satellite:
influxdb::manage_repo: false
influxdb::archive_source: false
puppet_operational_dashboards::profile::dashboards::manage_grafana_repo: false
puppet_operational_dashboards::telegraf::agent::manage_archive: false
puppet_operational_dashboards::telegraf::agent::manage_repo: false
The influxdb module currently needs to install the toml-rb gem. If you are behind an HTTP proxy, you can configure that as well:
influxdb::profile::toml::install_options_server:
  - '-p'
  - 'http://10.88.96.254:3128'
influxdb::profile::toml::install_options_agent:
  - '-p'
  - 'http://10.88.96.254:3128'
To use archive sources, set the following parameters. Note that support for Grafana archives has been deprecated, so Grafana must be installed from a repository or package source.
For InfluxDB, set:
false
For Grafana, set:
Then, set grafana::package_source to an internal URL containing a package for Grafana for your distribution.
For Telegraf, set:
false
true
To install on Puppet Enterprise:
Apply the puppet_operational_dashboards::enterprise_infrastructure class to a node group that encompasses all Puppet Infrastructure agents. The default node group PE Infrastructure Agent is appropriate.
include puppet_operational_dashboards::enterprise_infrastructure
This will install the toml-rb gem on compiling nodes and grant the dashboard node the appropriate access to the databases on all database nodes.
Apply the puppet_operational_dashboards class to the Puppet agent node to be designated as the Operational Dashboard node.
include puppet_operational_dashboards
This will install and configure Telegraf, InfluxDB, and Grafana.
Note that database access is not granted until the Puppet agent runs on the PostgreSQL nodes after puppet_operational_dashboards has been applied to the designated dashboard node.
The toml-rb gem needs to be installed in the Puppetserver gem space, which can be done with the influxdb::profile::toml class in the InfluxDB module.
To collect PostgreSQL metrics, FOSS users can apply the puppet_operational_dashboards::profile::foss_postgres_access class to any postgres nodes to configure authentication and grants for a telegraf user to connect. This class has a dependency on the puppetlabs/puppetdb and puppetlabs/postgresql modules, and you must use the puppetlabs/puppetdb module to configure SSL for postgres. See the documentation here.
You may also configure the connection options used by the Telegraf client when querying postgres. These options can be set using the puppet_operational_dashboards::telegraf::agent::postgres_options class parameter.
The easiest way to get started using this module is by including the puppet_operational_dashboards class to install and configure Telegraf, InfluxDB, and Grafana. Note that you also need to install the toml-rb gem according to the documentation.
include puppet_operational_dashboards
Installing the module will:
puppet_operational_dashboards::telegraf::agent parameters.
Note that this will save an InfluxDB administrative token to the user's home directory, typically /root/.influxdb_token. The puppetlabs/influxdb types and providers can make use of this file during catalog application. The manifests in this module are also able to use it via deferred functions, which also run on the agent as the first step of catalog application. Therefore, it is possible to use this file for all token-based operations in this module, and no further configuration is required.
It is also possible to specify this token via the influxdb::token parameter in hiera. The Telegraf token used by the telegraf service and Grafana datasource can also be set via puppet_operational_dashboards::telegraf_token. These are both Sensitive strings, so the recommended way to use them is to encrypt them with hiera-eyaml and use the encrypted value in hiera data. After setting up a hierarchy to use the eyaml backend, the values can be added to hiera data and automatically converted to Sensitive:
influxdb::token: <eyaml_encrypted_string>
lookup_options:
  influxdb::token:
    convert_to: "Sensitive"
These parameters take precedence over the file on disk if both are specified.
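The same pattern should apply to the Telegraf token. The following is a minimal sketch, assuming the same eyaml hierarchy as above; the encrypted value is a placeholder, and the key name is the puppet_operational_dashboards::telegraf_token parameter mentioned earlier.

```yaml
# Sketch only: <eyaml_encrypted_string> is a placeholder for your own
# hiera-eyaml output, not a real value.
puppet_operational_dashboards::telegraf_token: <eyaml_encrypted_string>
lookup_options:
  # Convert the decrypted string to a Sensitive value on lookup
  puppet_operational_dashboards::telegraf_token:
    convert_to: "Sensitive"
```

If both tokens are kept in the same data file, their entries need to be merged under a single lookup_options key, since YAML does not allow the key to appear twice in one mapping.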
To access Grafana, use the following in your browser of choice:
http://[AGENT IP ADDRESS]:3000
When first accessing Grafana, the default login and password are as follows:
Username: admin Password: admin
Upon first sign-in, you will be prompted to change your password. You may skip this step and proceed to the dashboard.
When using the default configuration options and the deferred function to retrieve the Telegraf token, note that it will not be available during the initial Puppet agent run that creates all of the resources. A second run is required to retrieve the token and update the resources that use it. If you are seeing authentication errors from Telegraf and Grafana, make sure the Puppet agent has been run twice and that the token has made its way to the Telegraf service config file:
/etc/systemd/system/telegraf.service.d/override.conf
Which hosts a node collects metrics from is determined by the puppet_operational_dashboards::telegraf::agent::collection_method parameter. By default, the puppet_operational_dashboards class will collect metrics from all nodes in a PE infrastructure. If you want to change this behavior, set collection_method to local or none. Telegraf can be run on other nodes by applying the puppet_operational_dashboards::telegraf::agent class to them, for example:
class { 'puppet_operational_dashboards::telegraf::agent':
  collection_method => 'local',
  token             => <my_sensitive_token>,
}
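The collection method can presumably also be set in Hiera instead of in a class declaration. This is a minimal sketch, assuming Puppet's automatic parameter lookup resolves the key for the node in question; the value shown is one of the two alternatives named above.

```yaml
# Sketch only: place this in the Hiera data that applies to the node
# running Telegraf. 'local' and 'none' are the non-default values
# named in this document.
puppet_operational_dashboards::telegraf::agent::collection_method: 'local'
```

Setting the value in Hiera keeps the class declaration itself free of per-node details, which may be convenient when the class is applied through a node group.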
Metrics archives output by the Puppet metrics collector can be imported into InfluxDB using Telegraf and the scripts in the examples/ directory. See ARCHIVES.md for more.
This dashboard is to inspect Puppet server performance and troubleshoot the pe-puppetserver service. Available panels:
Use Case
This dashboard is to inspect File-sync related performance. Available Graphs:
Use Case
This dashboard is to inspect PuppetDB performance and troubleshoot the pe-puppetdb service. Available panels:
Use Case
This dashboard is to inspect PostgreSQL database performance. Available panels:
Use Cases
Currently, only the latest Telegraf package is provided by the Ubuntu repository. Therefore, the only allowed value for puppet_operational_dashboards::telegraf::agent::version is latest. Setting this parameter to a different value on Ubuntu will produce a warning.
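To make the Ubuntu constraint explicit in your data, the version can be stated in Hiera. This is a minimal sketch, assuming the key is placed in a hierarchy level that applies only to Ubuntu nodes; how that level is defined is left to your own hierarchy.

```yaml
# Sketch only: 'latest' is the single value this document describes
# as allowed on Ubuntu.
puppet_operational_dashboards::telegraf::agent::version: 'latest'
```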
This module uses InfluxDB 2.x, while puppet_metrics_dashboard uses 1.x. This module does not currently provide an option to upgrade between these versions, so it is recommended to either install this module on a new node or manually upgrade. See the InfluxDB docs for more information about upgrading.
On Puppet Enterprise versions 2021.5 and 2021.6, there is an issue when applying either the puppet_operational_dashboards::enterprise_infrastructure or puppet_operational_dashboards::profile::postgres_access classes in a user manifest. Doing so may result in an error such as:
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Comparison of: Undef Value < Integer, is not possible. Caused by 'Only Strings, Numbers, Timespans, Timestamps, and Versions are comparable'
This is due to an ordering issue with the cert_allowlist_entry defined type. The workaround is to apply the classes via the Console, for example by applying puppet_operational_dashboards::enterprise_infrastructure to the PE Infrastructure Agent node group. See Installing on Puppet Enterprise.
This issue only affects PE versions 2021.5 and 2021.6. Earlier versions are not affected, and later releases will include a fix to the defined type.
On some versions of openSUSE 15, the insserv-compat package may be required to enable the Grafana service. If you see an error such as:
Error: /Stage[main]/Grafana::Service/Service[grafana]/ensure: change from 'stopped' to 'running' failed: Could not enable grafana-server:
This is due to the missing package:
Synchronizing state of grafana-server.service with SysV service script with /usr/lib/systemd/systemd-sysv-install.
Executing: /usr/lib/systemd/systemd-sysv-install enable grafana-server
/sbin/insserv: No such file or directory
Installing the insserv-compat package resolves the error.
If data is not displaying in Grafana or you see errors in Telegraf collections, try checking the following items.
A common reason for not seeing data in the dashboards is choosing the wrong datasource or time interval. Double check that you have selected a datasource and window of time for which metrics have been collected. Also, check that the server filter at the top of the dashboard contains valid entries.
Also, note that Telegraf performs its first collection after the first collection interval has passed. You may need to wait for this to pass, or manually test using the method below.
Datasources can be tested via the "Data Sources" configuration page in Grafana. Select the datasource, e.g. influxdb_puppet, and click the "Test" button. Note that because this is a "provisioned datasource," it cannot be edited in the UI.
A good way to test Telegraf collection is to use the --test option. After logging into the node running telegraf, first export your token:
export INFLUX_TOKEN=<token>
The token can be either the admin token written to /root/.influxdb_token by default or the puppet telegraf token used specifically for Telegraf. See REFERENCE.md for more information.
Prepending a space to the export command will prevent the token from being written to your shell's history, provided the shell is configured to ignore space-prefixed commands (for example, Bash with HISTCONTROL set to ignorespace or ignoreboth).
Then, test the collection:
telegraf --test --debug --config /etc/telegraf/telegraf.conf --config-directory /etc/telegraf/telegraf.d/
Services can also be tested individually, for example:
telegraf --test --debug --config /etc/telegraf/telegraf.conf --config /etc/telegraf/telegraf.d/puppetserver_metrics.conf
This command collects only Puppet server metrics.
The Support Knowledge base is a searchable repository for technical information and how-to guides for all Puppet products.
This Module has the following specific Article(s) available:
The Support Video Playlist is a resource of content generated by the support team.
This Module has the following specific video content available: