tektoncd / results

Long term storage of execution results.
Apache License 2.0

Record Kubernetes Events related to PipelineRuns/TaskRuns #439

Open adambkaplan opened 1 year ago

adambkaplan commented 1 year ago

Feature request

Extend Results to store Kubernetes events that are related to a TaskRun or PipelineRun.

Use case

See discussion in #362

manuelwallrapp commented 1 year ago

It would be interesting to discuss here how to store the events. Should they go alongside the logs in JSON format in an S3 bucket, or in the database, either as a set of Records or as a single Record?

manuelwallrapp commented 1 year ago

We would need this feature sooner rather than later. Either we dive into Go and Tekton Results ourselves (we are all Java developers), or we find somebody who can implement it here for us.

adambkaplan commented 1 year ago

@dibyom @vdemeester would this feature warrant further discussion in a TEP? We did so for the logs feature, and this would be another potentially significant change to Results. Things to consider:

  1. Do we store the event as a Record associated with the TaskRun/PipelineRun result?
  2. How much of the dynamic reconciler code can we reuse?
  3. Will a reconciler that watches events break at scale? Is there an easy way to filter for only the events that Tekton cares about?
  4. Do we provide an extra API endpoint via gRPC, as we did with logs?

Simply by writing this comment, I think the answer to my rhetorical question is "yes."
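To make question 3 concrete, here is a minimal sketch of one way to narrow events down to those attached to Tekton runs, by checking the `involvedObject` reference that every Kubernetes Event carries. The `Event` and `ObjectRef` types, `watchedKinds`, `isTektonRunEvent`, and `filterEvents` are hypothetical stand-ins written for this illustration; they are not part of Tekton Results or client-go, and a real reconciler might instead push this filtering to the API server.

```go
package main

import (
	"fmt"
	"strings"
)

// ObjectRef mirrors a small subset of the involvedObject field on a
// Kubernetes core/v1 Event. It is a local stand-in, not the client-go type.
type ObjectRef struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
}

// Event is a hypothetical, trimmed-down view of a Kubernetes Event.
type Event struct {
	InvolvedObject ObjectRef
	Type           string
	Reason         string
	Message        string
}

// watchedKinds is an assumption for this sketch: only events attached
// directly to PipelineRuns and TaskRuns are kept.
var watchedKinds = map[string]bool{"PipelineRun": true, "TaskRun": true}

// isTektonRunEvent reports whether the event refers to a tekton.dev
// PipelineRun or TaskRun.
func isTektonRunEvent(e Event) bool {
	ref := e.InvolvedObject
	return strings.HasPrefix(ref.APIVersion, "tekton.dev/") && watchedKinds[ref.Kind]
}

// filterEvents returns only the events that isTektonRunEvent accepts,
// preserving their original order.
func filterEvents(events []Event) []Event {
	var kept []Event
	for _, e := range events {
		if isTektonRunEvent(e) {
			kept = append(kept, e)
		}
	}
	return kept
}

func main() {
	events := []Event{
		{InvolvedObject: ObjectRef{APIVersion: "tekton.dev/v1", Kind: "TaskRun", Namespace: "ci", Name: "build-1"}, Type: "Normal", Reason: "Started"},
		{InvolvedObject: ObjectRef{APIVersion: "v1", Kind: "Pod", Namespace: "ci", Name: "build-1-pod"}, Type: "Warning", Reason: "FailedScheduling"},
	}
	for _, e := range filterEvents(events) {
		fmt.Printf("%s %s/%s: %s\n", e.InvolvedObject.Kind, e.InvolvedObject.Namespace, e.InvolvedObject.Name, e.Reason)
	}
}
```

Filtering on the client like this still means receiving every event in the cluster, so it does not answer the scalability concern by itself. Extending the filter to Pod or PVC events would also need a way to tie them back to a run, which this sketch does not attempt.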

khrm commented 1 year ago

We would probably filter events based on what the PipelineRun/TaskRun reconciler is doing.

By the way, there are other events during the lifecycle of PipelineRuns/TaskRuns, such as those for PVCs and pods. I am not sure whether they are useful.

adambkaplan commented 1 year ago

I can see PVC events being useful, especially for auditing. The scope of events here can be quite large - I definitely have concerns when it comes to scalability.

@manuelwallrapp since you have shown the most interest in this feature, do you think you can start drafting a TEP by following the guidelines here? You don't need to complete the entire TEP template - just the first few sections so we can understand the idea and merge it as proposed.

manuelwallrapp commented 1 year ago

@adambkaplan I will discuss this with our company, but I think it makes sense for us to do it. We already stream events from PipelineRuns, TaskRuns and Pods and store them in a local PVC as stripped-down JSON. Only essential information is stored. The amount of data is relatively small compared to the log data, or even to the PipelineRun and TaskRun data, but that is also because we strip the event data down to the most important information.

@khrm we currently add namespace event watchers for PipelineRun, TaskRun and Pod events in our Java Tekton controller software, and they are of great use, since sometimes you want to know why a TaskRun couldn't be started. Once Tekton Results has cleaned up the Tekton resources, the events are gone. We usually keep the Tekton resources for only a few hours, so for us it is very important to also have the events available for a longer period of time.

manuelwallrapp commented 11 months ago

OK, it took a while, but I wrote a TEP and opened the PR: https://github.com/tektoncd/community/pull/1118