DataDog / integrations-core

Core integrations of the Datadog Agent
BSD 3-Clause "New" or "Revised" License
919 stars 1.39k forks source link

Fix cockroachdb metric names. #17605

Open jmcarp opened 4 months ago

jmcarp commented 4 months ago

Related to #17600, I see that some of the metric names for the cockroachdb integration are arguably wrong. For example, cockroachdb includes a metric called admission.wait_queue_length.kv. This name makes sense, because admission is a cockroachdb subsystem, and wait_queue_length is a property of that system, and kv is a subcategory of wait_queue_length. However, the Datadog integration renames this metric to admission.wait.queue.length.kv. This doesn't make sense to me: wait, queue, and length aren't nested concepts, they're a single concept called wait_queue_length. More to the point, the cockroachdb authors gave that metric its name for a reason, and I would think that Datadog should generally respect upstream naming conventions.

I would be happy to send a PR to fix the cockroachdb metric names, but I think the real solution is to check in the code that generates the metric names, so that we can consistently update metrics with new cockroachdb releases. Then if metric names need to be changed, we can fix them at the level of the automation code and not review individual metric names by hand.

@FlorentClarret @sudomateo

FlorentClarret commented 4 months ago

Hello @jmcarp, thanks for reaching out.

I don't know what was the reasoning behind this naming and I agree with you. I pinged the team in charge of this integration so they can have a look!

sudomateo commented 4 months ago

Thank you! We'll also send a support ticket so we have something that follows the official support process. In my opinion Datadog should respect the metric names that the application emits since that's what the application documents and what operators and developers expect.

It's fine for Datadog to add a prefix to the metrics that links the metric to the integration and prevent naming collisions with other integrations but that's the only mutation that should occur.

I'm proposing that Datadog update this integration to emit metrics in the format PREFIX.METRIC_NAME where PREFIX is the integration name (i.e, cockroachdb) and METRIC_NAME is the metric name as emitted from the application (i.e., admission.wait_queue_length.kv).

CockroachDB generates its metrics for the public documentation website using the file https://github.com/cockroachdb/cockroach/blob/master/docs/generated/metrics/metrics.html. We can use that to build the correct metrics naming.