open-telemetry / oteps

OpenTelemetry Enhancement Proposals
https://opentelemetry.io
Apache License 2.0

OpenTelemetry Proposal: Introduce semantic conventions for CI/CD observability #223

Closed by horovits 4 months ago

horovits commented 1 year ago

This PR is a new OTEP for CI/CD Observability.

horovits commented 1 year ago

I brought this up as a topic at the project discussion at KubeCon NA and got good feedback there. I have now formalized the proposal as an OTEP and am curious to hear your thoughts. @alolita @dyladan and all

bdarfler commented 1 year ago

What I'm missing from this proposal is the current state of other CI/CD integrations that already support OTel.

This is one player in the space: https://github.com/inception-health/otel-export-trace-action

reyang commented 1 year ago

@horovits I have a clarification question regarding the goal here: is this about "OpenTelemetry should define some semantic convention for CI/CD observability" or "CI/CD should use OpenTelemetry"? The latter sounds like a good issue/proposal for CI/CD systems rather than OpenTelemetry.

horovits commented 1 year ago

> @horovits I have a clarification question regarding the goal here: is this about "OpenTelemetry should define some semantic convention for CI/CD observability" or "CI/CD should use OpenTelemetry"? The latter sounds like a good issue/proposal for CI/CD systems rather than OpenTelemetry.

In the context of an OpenTelemetry Enhancement Proposal (OTEP), the point is of course to extend OTel to support CI/CD use cases. I believe that once there's an open specification in place, tool vendors and projects will follow and adopt it. Hope that clarifies.

reyang commented 1 year ago

> In the context of an OpenTelemetry Enhancement Proposal (OTEP), the point is of course to extend OTel to support CI/CD use cases. I believe that once there's an open specification in place, tool vendors and projects will follow and adopt it. Hope that clarifies.

Sorry, this is still not clear to me: what exactly does "extend OTel to support CI/CD use cases" mean? What is needed or missing from OTel (e.g. do we need an extra API? do we need specific semantic conventions)?

horovits commented 1 year ago

> Sorry, this is still not clear to me: what exactly does "extend OTel to support CI/CD use cases" mean? What is needed or missing from OTel (e.g. do we need an extra API? do we need specific semantic conventions)?

Sorry, I might be missing your question, but I tried to elaborate on that in the 'Internal details' section of the proposal: "OpenTelemetry specification should be enhanced to cover semantics relevant to pipelines, such as the branch, build, step (ID, duration, status), commit SHA (or other UUID), run (type, status, duration). In addition, distribution execution mechanism also introduces various entities, such as nodes, queues, jobs and executors (using the Jenkins terms, other tools having respective equivalents, which the specification should abstract with the semantic convention)."

Can you elaborate on what you find missing in the above, so I can try to answer?
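To make the attribute list quoted above more concrete, here is a minimal sketch of how the pipeline semantics (branch, commit SHA, run, step) might be flattened into span attributes. Everything in it is an assumption for illustration: the `cicd.pipeline.*` attribute names, the `PipelineRun` and `StepRun` classes, and the `step_attributes` helper are hypothetical and are not names defined by this proposal or by OpenTelemetry.

```python
from dataclasses import dataclass
from typing import Dict, Union

AttrValue = Union[str, int, float]


@dataclass
class StepRun:
    """One step of a pipeline run (hypothetical model)."""
    step_id: str
    status: str        # e.g. "success" or "failure"
    duration_s: float  # wall-clock duration in seconds


@dataclass
class PipelineRun:
    """One execution of a pipeline (hypothetical model)."""
    branch: str
    commit_sha: str
    run_type: str      # e.g. "push" or "manual"
    run_status: str
    run_duration_s: float

    def attributes(self) -> Dict[str, AttrValue]:
        # Attribute keys are illustrative placeholders, not agreed conventions.
        return {
            "cicd.pipeline.branch": self.branch,
            "cicd.pipeline.commit.sha": self.commit_sha,
            "cicd.pipeline.run.type": self.run_type,
            "cicd.pipeline.run.status": self.run_status,
            "cicd.pipeline.run.duration_s": self.run_duration_s,
        }


def step_attributes(run: PipelineRun, step: StepRun) -> Dict[str, AttrValue]:
    """Combine run-level and step-level attributes for a single step span."""
    attrs = run.attributes()
    attrs.update({
        "cicd.pipeline.step.id": step.step_id,
        "cicd.pipeline.step.status": step.status,
        "cicd.pipeline.step.duration_s": step.duration_s,
    })
    return attrs
```

A tool-specific integration (Jenkins, Tekton, GitHub Actions, and so on) would map its own terms onto one shared set of keys like these, which is the abstraction the quoted section asks the semantic convention to provide.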

reyang commented 1 year ago

> Sorry, I might be missing your question, but I tried to elaborate on that in the 'Internal details' section of the proposal: "OpenTelemetry specification should be enhanced to cover semantics relevant to pipelines, such as the branch, build, step (ID, duration, status), commit SHA (or other UUID), run (type, status, duration). In addition, distribution execution mechanism also introduces various entities, such as nodes, queues, jobs and executors (using the Jenkins terms, other tools having respective equivalents, which the specification should abstract with the semantic convention)."
>
> Can you elaborate on what you find missing in the above, so I can try to answer?

Now I understand, thanks! I was trying to see how the TC could help, since this PR seems to be stuck and is not receiving much attention.

I personally would suggest:

  1. Change the PR title to "Introduce semantic conventions for CI/CD observability".
  2. Socialize with the semantic conventions working group https://github.com/open-telemetry/community#specification-sigs.
  3. Raise awareness during the weekly spec SIG meeting.

dsotirakis commented 1 year ago

👋 Hello all! I am also sharing this article, "How we reduced flaky tests using Grafana, Prometheus, Grafana Loki, and Drone CI", to show how, by using OTel, we can avoid storing logs and metrics only to query them with PromQL or LogQL after the pipelines have finished.

joshgav commented 1 year ago

A similar effort focused on measuring deliveries in particular:

cc @thisthat @AloisReitbauer

oleg-nenashev commented 1 year ago

I definitely support aligning the CDEvents format and the OpenTelemetry specification. For what it's worth, there is also a prototype Backstage plugin (private source ATM) that uses the same format for visualization (JSON). So it would be a net win if we could get it landed.

adrielp commented 7 months ago

I'm looking to revive this OTEP. It looks like it has been a while since there's been any traction (though I just found a new comment from a couple of weeks ago). What needs to be done to get this moved forward and merged? I brought this up at the specification SIG meeting today, and the overall thought was to bring the discussion back here and potentially into the WG.

thisthat commented 7 months ago

Hey @adrielp, I am also interested in this OTEP and would like to help move this forward :)

adrielp commented 7 months ago

Awesome, @thisthat! Per the last SIG WG meeting, I've started working on a project proposal to create a CI/CD Observability Semantic Conventions working group, focused on driving this OTEP as well as the "Environment Variables as trace propagators" OTEP (which is necessary for distributed tracing in batch systems like CI/CD).

Part of the project proposal requires figuring out staffing needs and getting folks together for the working group, so I'm definitely looking for folks there.

I also caught up with @horovits yesterday, and we'll be syncing up again on Monday about this OTEP in particular.
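For readers unfamiliar with why environment variables matter for tracing batch systems, here is a minimal sketch of the idea: a parent job hands trace context to a child process through its environment, since there is no HTTP request to carry headers. This is an illustration under stated assumptions, not the mechanism that OTEP specifies. It assumes the W3C Trace Context `traceparent` value format and a `TRACEPARENT` variable name; the helper names `new_traceparent`, `inject`, and `extract` are hypothetical.

```python
import re
import secrets
from typing import Dict, Mapping, Optional, Tuple

# Assumed environment variable name; the actual name is up to the OTEP.
TRACEPARENT_ENV = "TRACEPARENT"

# W3C traceparent layout: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def new_traceparent(sampled: bool = True) -> str:
    """Create a fresh traceparent value with random trace and span IDs."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def inject(env: Mapping[str, str], traceparent: str) -> Dict[str, str]:
    """Return a copy of env carrying the trace context for a child process."""
    child_env = dict(env)
    child_env[TRACEPARENT_ENV] = traceparent
    return child_env


def extract(env: Mapping[str, str]) -> Optional[Tuple[str, str, bool]]:
    """Read (trace_id, parent_span_id, sampled) from env, or None if absent/invalid."""
    match = _TRACEPARENT_RE.fullmatch(env.get(TRACEPARENT_ENV, ""))
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid in W3C Trace Context.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return trace_id, span_id, bool(int(flags, 16) & 0x01)
```

In a pipeline, the orchestrating job would pass `inject(os.environ, new_traceparent())` as the `env` argument when launching a step (for example via `subprocess.run`), and the step would call `extract(os.environ)` to parent its own spans under the same trace.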

thisthat commented 7 months ago

Please keep me posted, @adrielp. I am more than happy to join and help the WG :)

Elfo404 commented 7 months ago

@adrielp same here!

dsotirakis commented 7 months ago

@adrielp please include me as well, happy to be a part of it!

mhausenblas commented 7 months ago

Count me in!

afrittoli commented 7 months ago

Thanks Adriel,

Count me in for the working group. I definitely hope we can collaborate with the CDEvents (https://cdevents.dev) project as well.

Andrea

afrittoli commented 7 months ago

> +## Open questions
> +
> +Open questions include:
> +- Which entity model should be supported to best represent CI/CD domain and pipelines?
> +- What are the common CI/CD workflows we aim to support?
>
> +Tekton

+1 (I'm a bit biased as a Tekton maintainer :D)

On the Tekton side we have a few relevant features:

So, we definitely care about observability in Tekton.

Andrea

krzko commented 7 months ago

I would love to help with this, as I am now finalising the work on "New component: Github Actions Receiver" and https://github.com/krzko/run-with-telemetry to provide telemetry for GitHub Actions.

horovits commented 7 months ago

> Thanks Adriel, Count me in for the working group. I definitely hope we can collaborate with the CDEvents (https://cdevents.dev) project as well. Andrea

@afrittoli yes, we should carry on the discussion we started with CDEvents about potential collaboration between the projects. Some concerns were raised on your team's side last time, so we should carefully map whether there's an overlap between OTel and CDEvents, and how the two fit together.

horovits commented 7 months ago

> I would love to help with this, as I am now finalising the work on "New component: Github Actions Receiver" and https://github.com/krzko/run-with-telemetry to provide telemetry for GitHub Actions.

@krzko Congrats on completing the work on GitHub Actions! The insights from your experimentation will be valuable here.

horovits commented 7 months ago

> Awesome, @thisthat! Per the last SIG WG meeting, I've started working on a project proposal to create a CI/CD Observability Semantic Conventions working group, focused on driving this OTEP as well as the "Environment Variables as trace propagators" OTEP (which is necessary for distributed tracing in batch systems like CI/CD).
>
> Part of the project proposal requires figuring out staffing needs and getting folks together for the working group, so I'm definitely looking for folks there.
>
> I also caught up with @horovits yesterday, and we'll be syncing up again on Monday about this OTEP in particular.

We'd like to hold a call to share with everyone the work to formalize a working group, and to see who's interested in getting involved as we figure out staffing requirements. I put up a vote for the date on our CNCF Slack channel cicd-o11y (if you're not there yet, do join): https://cloud-native.slack.com/archives/C0598R66XAP/p1701100511547279. The current date is 7 Dec, 1pm CET / 7am ET. Join us on Slack and I'll update with details and a link.

adrielp commented 7 months ago

The PR has been opened to create the CI/CD Observability Working Group

Without a doubt, it's rough, but it's ready to be read, commented on, and discussed in the upcoming meeting. Staffing of course is one of those hot topics. 😄 🚀

adrielp commented 5 months ago

@mhausenblas @afrittoli - I just realized coming back here that I missed your names in the working group PR. Sorry about that 😞

I've added you now, please feel free to check it out and make sure I got it right! Thanks! https://github.com/open-telemetry/community/pull/1822/files#diff-41c277076e06d5ea84d2e8bc9eded2bc97e7f0888502f4f8d691b6c5c3639e57

horovits commented 5 months ago

Important update: we got approval from the TC to establish the CI/CD Observability SIG. The mandate of the new SIG will be to execute on the above OTEP. See here for the SIG scope and approval: https://github.com/open-telemetry/community/pull/1822#issuecomment-1898876452

carlosalberto commented 4 months ago

@horovits Any chance you could address, discuss, or answer the comments on the PR? I will do another full review once that is done.

adrielp commented 4 months ago

@carlosalberto @horovits - just to provide an update on this. We discussed this OTEP in the SemConv meeting last week. We actually might not need to proceed directly with this OTEP. Based on current direction, OTEPs are for spec changes, and right now our focus is on Semantic Conventions changes. We're reviewing the CDEvents work right now and coming up with a data model. Once we do that, we'll contribute directly to the Semantic Conventions through pull requests. If we need to make any future specification changes, we may leverage this OTEP or make new (smaller) ones to account for those changes.

But as of now, this OTEP doesn't need to move forward directly; it just provides wider visibility into the efforts and context until there are spec changes. cc. @jsuereth

carlosalberto commented 4 months ago

Thanks for the follow up! Should we then close this OTEP? We can always find it later on if/as needed.

adrielp commented 4 months ago

@carlosalberto I'm fine with closing it if that's how y'all want to handle it. Based on the conversations, I think it makes logical sense given the direction we're headed. @horovits - any objections?

carlosalberto commented 4 months ago

Hey @horovits Any concern?

horovits commented 4 months ago

> Hey @horovits Any concern?

@carlosalberto @adrielp sure, let's follow the process of the Semantic Conventions team. If this OTEP is no longer required, then let's close it.

horovits commented 4 months ago

Closing the OTEP PR per the feedback from @carlosalberto @adrielp @jsuereth and the OpenTelemetry Semantic Conventions WG. This doesn't mean we're backing off the proposal for CI/CD Observability conventions in OTel; it is only a different procedural path forward, full steam ahead. See below for more context, and join the cicd-o11y channel on the CNCF Slack workspace for more discussion.

> @carlosalberto @horovits - just to provide an update on this. We discussed this OTEP in the SemConv meeting last week. We actually might not need to proceed directly with this OTEP. Based on current direction, OTEPs are for spec changes, and right now our focus is on Semantic Conventions changes. We're reviewing the CDEvents work right now and coming up with a data model. Once we do that, we'll contribute directly to the Semantic Conventions through pull requests. If we need to make any future specification changes, we may leverage this OTEP or make new (smaller) ones to account for those changes.
>
> But as of now, this OTEP doesn't need to move forward directly; it just provides wider visibility into the efforts and context until there are spec changes. cc. @jsuereth