cdevents / spec

A common specification for Continuous Delivery events
Apache License 2.0
129 stars 22 forks source link

Introduce links to CDEvents #139

Closed xibz closed 6 months ago

xibz commented 1 year ago

Changes

This commits introduces the concept of linking events to CDEvents. We take inspiration from Eiffel for certain components, and also outline new APIs and use cases.

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

afrittoli commented 1 year ago

@xibz please let us know when the PR is ready for review!

e-backmark-ericsson commented 1 year ago

The use of "parent", "span" and "global_id" here makes me think that we should probably take a deeper look into how we could include OpenTelemetry trace contexts into CDEvents. That way we would make use of an existing well-proven concept to describe spans and parent/child relationships rather than inventing our own slightly different. I can't say if that would be enough to solve the need for spans and global ids described in this PR, but I believe it would be sufficient to solve the use case of declaring within what pipeline or task a certain event occurred, buy stating its parent to an identifier available in the parent's event (pipelineRun or taskRun). OTel also has the concept of "Span Links", that can be used to describe causal relationships between spans, which could be pipelineRun's or taskRun's in the CDEvents context for example.

For the explicit event-to-event linking, I believe we could reuse something similar to what is in the Eiffel "links" object today. Something like what is shown here: https://hackmd.io/-Or6hobHSLWVj4duAWX7nA#artifactpublished

xibz commented 1 year ago

It looks unnecessarily complex to me, from a producer point of view. Why would the event producer also need to call the Link Service explicitly? Couldn't the Link Service just listen to the Event Bus and build up the links from the events seen there? I assume that all information needed in #5 is already part of the event sent in #4, or?

@e-backmark-ericsson Unless you have a pre-defined notion of what is a span, we'd need some way in the CDEvent spec to let the links service know it's a new span. And this is how tracing works, and while it may seem more complicated, it comes with a lot of benefits. Services know how to link think. So if they require some customization on how they want to connect things, they can absolutely do it here rather than rely on the links service. Further link tagging would need to be introduced to the CDEvents spec instead of the links spec, and honestly I don't like the marriage between the two like that. I feel like the links spec, which is apart of the links service, is starting to leak into the CDEvents spec at that point. The services just have a lot more context and will have a better idea than the links service on how things are connected and what should be connected.

And mind you this is only one architecture. You could absolutely do it your way where it is service instead of client based.

xibz commented 1 year ago

please don't overwrite the old commits when doing new ones. It becomes very time consuming to review the changes when there is nothing to compare the updates towards. We will anyway squash the commits when merging the PR eventually, so there should be no problem having many commits on the PR.

That was my mistake that I didn't catch until I pushed it up

e-backmark-ericsson commented 1 year ago

It looks unnecessarily complex to me, from a producer point of view. Why would the event producer also need to call the Link Service explicitly? Couldn't the Link Service just listen to the Event Bus and build up the links from the events seen there? I assume that all information needed in #5 is already part of the event sent in #4, or?

@e-backmark-ericsson Unless you have a pre-defined notion of what is a span, we'd need some way in the CDEvent spec to let the links service know it's a new span. And this is how tracing works, and while it may seem more complicated, it comes with a lot of benefits. Services know how to link think. So if they require some customization on how they want to connect things, they can absolutely do it here rather than rely on the links service. Further link tagging would need to be introduced to the CDEvents spec instead of the links spec, and honestly I don't like the marriage between the two like that. I feel like the links spec, which is apart of the links service, is starting to leak into the CDEvents spec at that point. The services just have a lot more context and will have a better idea than the links service on how things are connected and what should be connected.

And mind you this is only one architecture. You could absolutely do it your way where it is service instead of client based.

Yeah, now I get it somehow. Yes. We need to describe the concepts of spans a bit more I believe. When a new span could be initiated and when it could end. Some different scenarios would be good to describe for that, even though we cannot provide a comprehensive list of course. I believe there are many event producers what would never create nor finish a span, and they should not need to call an external service when sending their events I believe. And for the producers that do create or finish spans there could be multiple options, as I guess you're referring to in your comment there.

Some such options would be:

  1. Require an external Links Service to be in the picture if spans are to be used in the events (like your current proposal)
  2. Add additional fields to the CDEvents spec describing that a span is created or finished, and let any interested Links Service listen to the event bus instead of expecting explicit calls
  3. Add dedicated events to declare that a span is created or finished
  4. and more?

What option we should go for should be decided in a coming CDEvents meeting. My initial gut feeling is that we should go for a solution that doesn't require the event producers to call an external service, but sending the event(s) should be enough.

xibz commented 1 year ago

What option we should go for should be decided in a coming CDEvents meeting

That sounds perfect.

e-backmark-ericsson commented 11 months ago

The use of "parent", "span" and "global_id" here makes me think that we should probably take a deeper look into how we could include OpenTelemetry trace contexts into CDEvents. That way we would make use of an existing well-proven concept to describe spans and parent/child relationships rather than inventing our own slightly different. I can't say if that would be enough to solve the need for spans and global ids described in this PR, but I believe it would be sufficient to solve the use case of declaring within what pipeline or task a certain event occurred, buy stating its parent to an identifier available in the parent's event (pipelineRun or taskRun). OTel also has the concept of "Span Links", that can be used to describe causal relationships between spans, which could be pipelineRun's or taskRun's in the CDEvents context for example.

For the explicit event-to-event linking, I believe we could reuse something similar to what is in the Eiffel "links" object today. Something like what is shown here: https://hackmd.io/-Or6hobHSLWVj4duAWX7nA#artifactpublished

The OTel part of this comment might not be relevant anymore. since the PR has changed quite dramatically lately. I'm not fully convinced that the current chain_id could not similar to a span id in OTel, but I'll rest my case on that since I don't have enough insight in the OTel concepts.

The interlinking of events mentioned in this comment is discussed in other comments already, so this comment can be closed as outdated now.

e-backmark-ericsson commented 11 months ago

@xibz , what about relations between different chains, do you consider that to be in scope of this PR?

For example when a deployment activity is triggered by the creation of a certain artifact being built and published, but when that deployment activity also includes other artifacts that where built in previous chains. The deployment event would then have relations/link to multiple artifacts where not all where built as part of the same chain.

Or when a build is triggered by a certain source change with a new chain_id, but that build also brings in other source changes from other repos. The build event would then be related/linked to multiple source changes that where notified with different chain_ids.

As I see it such a relation would be easily described by creating a link between two events that have different chain_id's. If you agree I think it would be good to describe that scenario in this PR as well.

xibz commented 10 months ago

@e-backmark-ericsson ugh sorry for all the mistakes with the schema. Too much copy pasta.

One thing that isn't clear is the best way to have examples around embedded links. I feel like having some with embedded links while others do not may make it confusing. I wonder if we can instead have something like embedded_link_* for some of the events to make it clear on how to do that. The one issue with that is there are now multiple spots that would need to be updated if the event that has multiple examples for a given event type.

afrittoli commented 10 months ago

@xibz would you like to update the PR description, since this is not anymore very WIP? 🙏 Thank you!

e-backmark-ericsson commented 10 months ago

One thing that isn't clear is the best way to have examples around embedded links. I feel like having some with embedded links while others do not may make it confusing. I wonder if we can instead have something like embedded_link_* for some of the events to make it clear on how to do that. The one issue with that is there are now multiple spots that would need to be updated if the event that has multiple examples for a given event type.

Yeah, I see your point. I believe it would be good if the examples folder contained both the simples/shortest possible versions of each event, and also one or two variants with some optional parameters/objects added. I would propose that we add a folder for each event type in the examples folder, and move the json examples in there. With that we could easily have multiple variants of each event type with different optional parameters added for different needs. Each event folder could then include a "minimal.json" representing the mandatory parameters, and then for example "embedded_links.json" for some of the events, where links are added to them, and other json files for other optional parameters, so we cover all possible parameters with explicit examples.

What do you say @afrittoli , does it sound like a good idea? I know that the example validation script would need to be updated, but I think it would be very good to be able to show how all optional parameters are used in explicit examples.

afrittoli commented 10 months ago

I looked at the "validate examples" failures:

Additionally, if we extended validation to the new subfolders, the N_<subject>.<predicate>.json would also break the current logic that expects names like <subject>.<predicate>.json to extract subject and predicate so that it may discover the local schema.

This shouldn't be too hard to fix, but I wonder if we should instead have a link in every single one of the events since that's part of their schema and keep the links subfolder as it is today, but only for documentation purposes, not for testing.