cdevents / spec

A common specification for Continuous Delivery events
Apache License 2.0

Bundling/bulk operations/event/something for more intensive events #152

Open xibz opened 1 year ago

xibz commented 1 year ago

Currently there are events that may produce a large number of subsequent events when they don't need to. For example, a recent deployment I tested against contained over 500 artifacts. That would be an event for each artifact.

This poses 2 problems:

  1. The load that sending 500 events for a single deployment would put on the clients and servers.
  2. Events are currently very siloed, but certain events could be grouped together, e.g. test suites and test cases, deployments and artifacts, etc.

I'm still not sure whether this is a spec issue or an engineering issue for the event bus/collector. However, a collector only really solves problem 1. It could solve problem 2 as well, but that would get more complicated. So maybe a separate bundling event, or something of that nature, would be appropriate.
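To make the idea concrete, here is a rough sketch of what wrapping many per-artifact events in a single envelope could look like. Everything in it is an assumption for illustration: the event dictionaries are simplified stand-ins rather than full CDEvents payloads, and the `example.bundle` type, the `count` and `events` fields, and both helper functions are hypothetical and not part of the spec.

```python
import uuid
from datetime import datetime, timezone


def make_event(event_type, subject_id):
    """Build a simplified, illustrative event (not a full CDEvents payload)."""
    return {
        "context": {
            "id": str(uuid.uuid4()),
            "type": event_type,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
        "subject": {"id": subject_id},
    }


def bundle_events(events, parent_id):
    """Wrap many events in one hypothetical bundle envelope."""
    return {
        "context": {
            "id": str(uuid.uuid4()),
            "type": "example.bundle",  # hypothetical type, not defined by the spec
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
        "subject": {
            "id": parent_id,  # the grouping entity, e.g. the deployment
            "content": {"count": len(events), "events": list(events)},
        },
    }


# 500 per-artifact events collapse into a single message on the wire.
artifact_events = [make_event("artifact.packaged", f"artifact-{i}") for i in range(500)]
bundle = bundle_events(artifact_events, "deployment-1")
```

A shape like this would address both problems: one message is sent instead of 500, and the envelope's subject ties the inner events to the deployment they belong to.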

e-backmark-ericsson commented 1 year ago

I would like to know a bit more about the scenario where you had 500 "artifacts" in one deployment. It could come down to how we define the term "artifact". One artifact isn't necessarily equal to one binary, at least not in my mind. An artifact is an entity that has a specific identity and version, and it could potentially be a collection of many binaries. Ultimately it's up to the producer of the artifact event to decide which types (or levels) of artifacts it wants to announce through the event. If there is a logical entity that has a unique identity and version but includes 500 binaries, I think it is OK that just one artifact.packaged event is sent for the whole thing, unless the producer wants the event consumers to know about each of the 500 individual binaries.

xibz commented 1 year ago

> 500 "artifacts" in one deployment

That was just one example, but your reply opens up far more questions rather than settling the matter. Before getting into those questions, let me first ask this.

I also mentioned test suites and test cases. Where there can be many test cases, would you want to send those one by one rather than in bulk?

e-backmark-ericsson commented 1 year ago

I believe that a pipeline step that includes some test activities should always send testSuiteRun events, to signal that certain types of tests are being or have been run, e.g. unit tests, integration tests, system tests, performance tests, etc. Normally there shouldn't be too many test suites run in a single pipeline step, I believe.

For test cases it could be slightly different. With unit tests, several hundred test cases could be run in a single pipeline step / test activity, so it might not be relevant to send an individual testCaseRun event for each such unit test. But for longer-running tests it could be relevant to send a testCaseRun event for each executed test case.

So, it's up to the producer to decide whether individual testCaseRun events should be sent or not, mostly depending on the estimated duration of each test case execution, I'd say, and on the number of test cases run.
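As a rough illustration of that producer-side decision, the sketch below emits individual events only when test cases run long enough and there are few enough of them. The function name and both thresholds are assumptions made up for this example; nothing in the spec prescribes them.

```python
def should_send_individual_events(case_count, est_duration_s,
                                  max_cases=50, min_duration_s=5.0):
    """Decide whether to emit one testCaseRun event per test case.

    The thresholds are hypothetical defaults: individual events are sent
    only for long-running test cases and only when there are few of them.
    """
    return est_duration_s >= min_duration_s and case_count <= max_cases


# Several hundred fast unit tests: rely on the testSuiteRun events only.
unit_tests = should_send_individual_events(case_count=300, est_duration_s=0.01)

# A handful of long-running system tests: one event per test case.
system_tests = should_send_individual_events(case_count=10, est_duration_s=120.0)
```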

xibz commented 1 year ago

> it's up to the producer to decide

Exactly. So why not allow for bulking/bundling if they want to send all test cases?

e-backmark-ericsson commented 1 year ago

Maybe it could be allowed.

We want the protocol to define one way, and one way only, to notify about an occurrence and its details. It makes it a lot harder for the consumers if the same information can be propagated in multiple ways and using multiple different event types.

Do you have a syntax proposal for how such bundling could be done?