open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0

[cicd] embed system attributes under cicd runners #1184

Open adrielp opened 3 months ago

adrielp commented 3 months ago

Overview

Update the semantic conventions for CI/CD pipeline runners to embed system attributes once the embed feature is added.

Do I understand correctly that here we basically need to embed other attributes from the os namespace? There is an open PR to do this; maybe you can create a dedicated issue out of it and postpone this until embedding can be used in semconv?

_Originally posted by @trisch-me in https://github.com/open-telemetry/semantic-conventions/pull/1075#discussion_r1654395856_

christophe-kamphaus-jemmic commented 3 weeks ago

The goal of this issue is to be able to link OTel signals emitted by a runner (e.g. host metrics, logs, events) to any jobs using that runner.

Would this mean embedding system metrics under the cicd namespace, or adding some cicd attributes (e.g. cicd.pipeline.run.id) to the system metrics?

christophe-kamphaus-jemmic commented 2 weeks ago

Related to #1111


SemConv meeting notes 2024-09-16

How could we make the link between a cicd run and metrics emitted by the runner of that run?

This could be a question related to the Entity Group. How would this link be expressed?

Using resource attributes works today: "Okay, the idea behind resource. One of the ideas is that all the telemetry generated from a particular, you know thing component would have the same set of attributes in it. So you can use that to tie, tie the knot and understand. This is the same source."
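
As a rough illustration of the resource-attribute idea, the sketch below stamps a run id onto whatever telemetry process a runner starts, using the standard `OTEL_RESOURCE_ATTRIBUTES` environment variable read by OpenTelemetry SDKs. The helper names `format_resource_attributes` and `runner_env` are hypothetical, and whether a given runner or exporter actually honours this variable is an assumption, not something stated in the notes.

```python
import os
from urllib.parse import quote


def format_resource_attributes(attrs):
    """Render a dict as a comma-separated key=value list for OTEL_RESOURCE_ATTRIBUTES."""
    # Percent-encode keys and values so ',' and '=' inside them cannot break the list.
    return ",".join(
        f"{quote(key, safe='')}={quote(value, safe='')}" for key, value in attrs.items()
    )


def runner_env(run_id, base_env=None):
    """Return a copy of the environment with cicd.pipeline.run.id appended (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("OTEL_RESOURCE_ATTRIBUTES", "")
    extra = format_resource_attributes({"cicd.pipeline.run.id": run_id})
    # Keep attributes that were already configured, such as host.name.
    env["OTEL_RESOURCE_ATTRIBUTES"] = f"{existing},{extra}" if existing else extra
    return env
```

A process started with `runner_env("run-123")` as its environment would then report the run id on every signal it emits, provided it builds its resource from that variable.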

Linking metrics to a cicd run in any other way (e.g. in an event) is not currently possible. The Entity SIG will be working on answering that question.

Would this mean embedding system metrics under the cicd namespace, or adding some cicd attributes (e.g. cicd.pipeline.run.id) to the system metrics?

In general, it's better to just use the existing metrics without copying / embedding them into the cicd namespace.

The exception would be a cicd-related measurement that cannot be expressed using the existing metrics; in that case we could consider embedding a metric in the cicd namespace.


Using resource attributes is how I have implemented the link between cicd run and runner metrics in https://github.com/jenkinsci/opentelemetry-agent-metrics-plugin.

Downsides to this approach are that

christophe-kamphaus-jemmic commented 2 weeks ago

Conceptually the link between cicd run and cicd runner is many-to-many:

The runners can be static or ephemeral / auto-scaling.

christophe-kamphaus-jemmic commented 1 day ago

We should define cicd resource semconv. To cover the *-to-1 runner case, where several jobs run on the same runner, we could make the type of cicd.pipeline.run.id string[].

Question: Can we dynamically update resource attributes (e.g. using resource detection)? Or would that require a restart of node_exporter, for example?
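
A minimal sketch of what a string[] typed cicd.pipeline.run.id could look like on the consuming side, assuming each metric point carries its runner's resource attributes as a plain dict. `MetricPoint` and `points_for_run` are hypothetical names for illustration only; nothing here is an agreed convention.

```python
from dataclasses import dataclass


@dataclass
class MetricPoint:
    name: str
    value: float
    resource: dict  # resource attributes of the runner that emitted the point


def points_for_run(points, run_id):
    """Return the points whose resource lists run_id under cicd.pipeline.run.id."""
    key = "cicd.pipeline.run.id"
    # The attribute is assumed to be a sequence of run ids (the string[] proposal).
    return [p for p in points if run_id in p.resource.get(key, ())]


points = [
    MetricPoint(
        "system.cpu.utilization",
        0.72,
        {"host.name": "runner-a", "cicd.pipeline.run.id": ("run-1", "run-2")},
    ),
    MetricPoint(
        "system.cpu.utilization",
        0.10,
        {"host.name": "runner-b", "cicd.pipeline.run.id": ("run-3",)},
    ),
]

shared = points_for_run(points, "run-2")
```

Here `shared` contains only the runner-a point. That point is also returned for run-1, so with several runs on one runner the metric can be linked to each run but not attributed to a single one.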

christophe-kamphaus-jemmic commented 1 day ago

Can we dynamically update resource attributes (e.g. using resource detection)?

https://github.com/open-telemetry/opentelemetry-specification/blob/v1.35.0/specification/resource/sdk.md mentions that Resource is immutable.

Or would that require the restart of node_exporter for example?

Most likely. Might this change with the Entity changes?
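
To make the immutability point concrete, here is a toy model, assuming a resource behaves like a read-only attribute map with a merge operation that returns a new object. The `Resource` class below is a stand-in written for this comment, not the OpenTelemetry SDK class.

```python
from types import MappingProxyType


class Resource:
    """Toy stand-in for an immutable resource (not the OpenTelemetry SDK class)."""

    def __init__(self, attributes):
        # Copy the input and expose it only through a read-only view.
        self._attributes = MappingProxyType(dict(attributes))

    @property
    def attributes(self):
        return self._attributes

    def merge(self, other):
        """Return a new Resource; on key conflicts the updating resource wins."""
        return Resource({**self._attributes, **other.attributes})


runner = Resource({"host.name": "runner-a"})
with_run = runner.merge(Resource({"cicd.pipeline.run.id": ("run-1",)}))
```

Assigning to `runner.attributes[...]` raises a `TypeError`, and `runner` itself is unchanged after the merge. Anything already emitting telemetry with the old object keeps the old attributes, which is why picking up a new run id likely means recreating the provider or restarting the exporter.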