Open bobcatfish opened 5 years ago
Hello @bobcatfish, I need your thoughts on this. I see two options:
1- A service outside of Tekton that watches Tekton objects and exposes them to Prometheus.
2- Introduce an endpoint in Tekton Pipelines itself to expose all the metrics to Prometheus.
My gut feeling is that I'd lean more toward exposing the metrics from Pipelines itself:
2- Introduce an endpoint in Tekton Pipelines itself to expose all the metrics to Prometheus.
Question: I'm not super familiar with Prometheus. How vital would it be to making metrics usable? Could we simply emit the metrics and allow users to provide their own metrics gathering mechanism (which could be Prometheus but could be something else), or would it make more sense for us to include Prometheus out of the box? (I'm very sensitive to adding new dependencies, esp. since I'm under the impression that managing Prometheus is a job in itself, but maybe I'm wrong!)
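To make the "emit metrics and let the user pick the collector" idea a bit more concrete, here is a rough sketch, not a proposal for the real API. Everything in it is hypothetical (`MetricsSink`, `InMemorySink`, `render_prometheus_text`, and the metric name `tekton_pipelinerun_total`); the only thing taken from Prometheus is its plain-text exposition format. It assumes counters are enough for a first pass and uses no Prometheus library.

```python
from typing import Dict, List, Protocol, Tuple

LabelSet = Tuple[Tuple[str, str], ...]


class MetricsSink(Protocol):
    """Hypothetical interface: the controller only ever calls inc()."""

    def inc(self, name: str, labels: Dict[str, str], value: float = 1.0) -> None: ...


class InMemorySink:
    """One possible backend: keep counters in memory until something reads them."""

    def __init__(self) -> None:
        self.counters: Dict[Tuple[str, LabelSet], float] = {}

    def inc(self, name: str, labels: Dict[str, str], value: float = 1.0) -> None:
        # Sort labels so the same label set always maps to the same key.
        key = (name, tuple(sorted(labels.items())))
        self.counters[key] = self.counters.get(key, 0.0) + value


def _escape(value: str) -> str:
    # Label values escape backslash, double quote, and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_prometheus_text(sink: InMemorySink) -> str:
    """Render the counters as Prometheus-style exposition text."""
    lines: List[str] = []
    seen = set()
    for (name, labels), value in sorted(sink.counters.items()):
        if name not in seen:
            lines.append(f"# TYPE {name} counter")
            seen.add(name)
        label_str = ",".join(f'{k}="{_escape(v)}"' for k, v in labels)
        suffix = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sink = InMemorySink()
    sink.inc("tekton_pipelinerun_total", {"status": "success"})
    sink.inc("tekton_pipelinerun_total", {"status": "failed"})
    print(render_prometheus_text(sink), end="")
```

The point of the split is that the code doing the counting depends only on the small `MetricsSink` interface, so a Prometheus endpoint would be one backend among several rather than a hard dependency.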
Another option, which I think is a variation on your first suggestion, @pradeepitm12:
3- (For now) only measure the performance in tests we write specifically for this purpose (i.e. we don't expose anything new for users of Tekton Pipelines, but we start doing our own measurements).
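As a rough illustration of what option 3 could look like, here is a minimal timing helper for a dedicated performance test. It is only a sketch: `measure` and `fake_pipeline_run` are hypothetical names, and the sleep stands in for whatever a real test would do (for example, creating a run and waiting for it to finish).

```python
import statistics
import time
from typing import Callable, Dict, List


def measure(fn: Callable[[], None], runs: int = 5) -> Dict[str, float]:
    """Call fn `runs` times and summarize wall-clock durations in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    durations: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_pipeline_run() -> None:
    # Placeholder workload; a real test would exercise Tekton here.
    time.sleep(0.01)


if __name__ == "__main__":
    summary = measure(fake_pipeline_run, runs=3)
    print({k: round(v, 4) for k, v in summary.items()})
```

Recording min, median, and max rather than a single number makes it easier to spot noisy runs before deciding what to track over time.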
+1, we're looking at the same thing and just started looking at Prometheus too. Hopefully we can help each other out here.
:chart_with_upwards_trend:
Maybe the first thing to do would be to identify the metrics we're interested in? I'm not super familiar with Prometheus, but I would think that before we monitor the metrics, we'd want to figure out what needs monitoring (maybe there's a Jenkins/Jenkins X precedent we can draw on :D?)
We had our first meeting regarding observability, specifically metrics, today and work is now underway. There are a couple of other issues that overlap in theme with this one. I am linking them together here for us to review later and figure out which to keep and which to close.
Related issues:
https://github.com/tektoncd/pipeline/issues/164
https://github.com/tektoncd/pipeline/issues/540
https://github.com/tektoncd/pipeline/issues/855
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
/lifecycle stale
Send feedback to tektoncd/plumbing.
We haven't worked on this lately but it is an item in our roadmap and I think we should keep it open.
/lifecycle frozen
I want to start gathering some requirements around this and get it moving :D
https://github.com/tektoncd/pipeline/issues/3521 has some use cases that we might be able to use.
@bobcatfish, is there a Tekton performance white paper? So far, how many pipeline runs (or runs) can we support on a mid-sized cluster, for example 1 master + 1 compute node, with this node spec: 8 cores + 64 G memory + 250 G disk?
Expected Behavior
We should be measuring performance for Pipelines. This task includes both adding the actual measurement mechanism and the design re. what exactly we want to measure.
Some ideas for measurement:
Requirements
Actual Behavior
We do not measure or track this.
Additional Info