Open bastelfreak opened 2 years ago
Thanks, we're currently working on adding more meaningful Orchestrator metrics in PE, so we'll consider this when we build out a dashboard for those. It may be worth making a dashboard with what is already available, such as this one and JVM metrics.
This may be part of #148
This particular mbean, puppetlabs.pcp.connections
, looks to be the only one not exposed by the /status
api, so it's not currently available. We would have to add another set of config to do it, but since it would be a useful metric I'll leave this issue open.
Yes, and the metric seems to be only available on compilers. Did not check a standard installation.
I think it's available on the primary via Orchestrator's api, so if you replaced 8140
with 8143
in the example it should work. That reminds me that we missed collecting PCP metrics on compilers in #148 because they come from Puppet server, but we do have a ticket for that.
Is this feature already implemented in the orchestration dashboard?
We don't currently have puppetlabs.pcp.connections
since it's not accessible via the /status
api. We'd need to add a new puppet_operational_dashboards::telegraf::config
type and have it POST
to the /metrics/v2
api similar to how the PDB metrics do here.
For compiler PCP metrics, we need to do something like
orchestrator
and puppetserver
measurementsOne concern might be if the broker service metrics don't exist on OSP, I'm not sure if they do. If you define a path
on a json_v2.object
that doesn't exist, the whole collection will fail. We could work around that by adding the fields to included_keys
instead, which will just skip them if they aren't available.
The broker metrics don't exist on open source puppetserver. Broker and orchestrator are only bundled/provided in PE.
PCP metrics are only collected within if $collection_method == 'all' {
and not within elsif $collection_method == 'local' {
Hi,
I raised two PRs, #178 and #179 . I think at the moment the PCP metric collection doesn't work at all. The variable in the agent.pp is wrong, as mentioned in my comment above, also $puppet_operational_dashboards::include_pe_metrics
isn't known in the agent.pp. I fixed this and reworked the tests to have a better coverage. in #179 I fixed the collection of PCP metrics when telegraf is collecting locally. I also added a unit tests to validate the setup at a PE customer I know.
I also see some problems with https://github.com/puppetlabs/puppet_operational_dashboards/pull/168 and https://github.com/puppetlabs/puppet_operational_dashboards/pull/167. https://github.com/puppetlabs/puppet_operational_dashboards/pull/167/commits/3876b7905a62f89a12430ee8419774667a3349d5#diff-57ccecc0a9208b20f3e973538fdde86ccbb9a774ddeca8bca6f5737aa31ddfa6R440 added the PCP connection counter. That got deleted in https://github.com/puppetlabs/puppet_operational_dashboards/pull/168/files#diff-57ccecc0a9208b20f3e973538fdde86ccbb9a774ddeca8bca6f5737aa31ddfa6L440 and I think that was by accident?
I mentioned that already somewhere else: both PRs are quite noisy. They contain a lot of changes and no tests, they were hard to review. They were merged quite fast and I had no time to respond in time.
Yes, removing the panels was my mistake. Testing and merging multiple features that change the dashboards is a bit error-prone because managing a large JSON file is difficult. Maybe we could use something like the public dashboards feature, but it's currently in preview and not enabled by default.
@m0dular can you take a look at this please? The dashboard added in https://github.com/puppetlabs/puppet_operational_dashboards/pull/167 is still missing in the latest release :(
Sorry about the delay; I opened #199 to add it back. I checked that all the config to capture the metric was still there, and the new dashboard version displays them correctly for me. The name of the mbean
is default:name=puppetlabs.pcp.connections
, so it may not have actually been correct in the previous version. You should be able to create a panel in your existing dashboard and look for the value just to confirm.
@bastelfreak Does this update work for you?
Hey. Sorry, was vacationing and will test on Monday.
Use Case
I'm currently debugging pcp-agent<>puppetserver connection problems with Puppet Engineers for a customer. I noticed that there is a metric for the amount of open PCP connections, but no related dashboard. Having that would help a lot to identify overloaded puppetservers / uneven loadbalancing of PCP connections. I used:
Describe the Solution You Would Like
A dashboard that tracks a connection counter over time
Describe Alternatives You've Considered
Customer has other monitoring soltutions where we could implement this, but I think such a panel would be really helpful for other users as well.
Additional Context
I'm not 100% sure if
default:name=puppetlabs.pcp.connections
gives the amount of current open connections, but I think so. I didn't find any good documentation about the JMX/Jolokia metrics