sourcegraph / sourcegraph-public-snapshot

Code AI platform with Code Search & Cody
https://sourcegraph.com

dev/ci: Grafana dashboard for pipeline and job metrics #26118

Closed bobheadxi closed 2 years ago

bobheadxi commented 3 years ago

Tracking issue for:

Build an overview that reports metrics such as the number of builds per day/week/month, the ratio of successful to failed builds, and build run time per day/week/month (possibly broken down by step or command, e.g. with tracing).
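
As a rough illustration of the metrics listed above, the sketch below computes builds per day, the ratio of successful to failed builds, and mean run time from a list of build records. The `Build` record and its field names are hypothetical, chosen only for this example; they are not a Buildkite or Sourcegraph schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Build:
    # Hypothetical record; field names are assumptions for illustration only.
    day: date
    passed: bool
    duration_s: float


def summarize(builds):
    """Return builds per day, the passed/failed ratio, and mean duration."""
    per_day = Counter(b.day for b in builds)
    passed = sum(1 for b in builds if b.passed)
    failed = len(builds) - passed
    # The ratio is undefined when nothing failed; report None in that case.
    ratio = passed / failed if failed else None
    mean_duration = (
        sum(b.duration_s for b in builds) / len(builds) if builds else None
    )
    return {
        "per_day": dict(per_day),
        "pass_fail_ratio": ratio,
        "mean_duration_s": mean_duration,
    }
```

Grouping by week or month would only change the key passed to `Counter`, for example `b.day.isocalendar()[:2]` for (year, week).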

See #25768: https://docs.google.com/document/d/1fknr3NQGmwbKCfnF3Bcr-tYzV-TeuAGq-_Tm2-z_09M/edit#heading=h.4lpcsjmzh2db

bobheadxi commented 3 years ago

https://sourcegraph.grafana.net/d/iBBWbxFnk/ci?orgId=1 is a dashboard currently based on the log entries exported by https://github.com/sourcegraph/sourcegraph/issues/25562

[dashboard screenshot]

It's fairly rudimentary for now, and I'm not sure whether we can generate all the metrics originally scoped out in this task purely from logs.

bobheadxi commented 3 years ago

Idea to enable runtime metrics + success ratios: https://github.com/sourcegraph/sourcegraph/issues/27982

bobheadxi commented 2 years ago

Marking this as closed now that we have https://sourcegraph.grafana.net/d/iBBWbxFnk/ci?orgId=1

Elyx0 commented 2 years ago

@bobheadxi How did you go about it? I'm trying to solve https://twitter.com/0xngmi/status/1525544425866289153

bobheadxi commented 2 years ago

@Elyx0 we developed some tooling to ingest logs uploaded from CI into Loki.
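
To give a sense of what ingesting a CI log line into Loki can look like, here is a minimal sketch that builds the JSON body for Loki's `POST /loki/api/v1/push` endpoint. This is not the tooling mentioned above; the helper name `loki_push_payload` and the example label values are assumptions, with label names borrowed from the queries in this thread.

```python
import json
import time


def loki_push_payload(lines, labels, now_ns=None):
    """Build the JSON body for Loki's POST /loki/api/v1/push endpoint."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Loki expects each entry as [<unix epoch in nanoseconds, as a string>, <log line>].
    # Offsetting by the index keeps timestamps distinct and ordered within the stream.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return json.dumps({"streams": [{"stream": labels, "values": values}]})


# Hypothetical usage: label names follow the queries in this thread, values are made up.
body = loki_push_payload(
    ["step exited with status 1"],
    {"app": "buildkite", "branch": "main", "state": "failed", "name": "example-job"},
)
```

The resulting string would be sent with `Content-Type: application/json`; the HTTP call is omitted here to keep the sketch free of network dependencies.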

Then, we wrote some dashboards based on queries over the logs, for example:

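# Failed log entries on main per (build, name), counted over 5-minute windows
count by (build, name) (count_over_time({app="buildkite",branch="main",state="failed"}[5m]))
# Series with failed log entries on main per name, over 3-hour windows
count by (name) (count_over_time({app="buildkite",state="failed",branch="main"}[3h]))
# Number of distinct builds on main with at least one failed entry in the last day
count(
topk(1, count_over_time({app="buildkite",state="failed",branch="main"}[1d])) by (build)
)
# Same as the 3-hour query, over 7-day windows
count by (name) (count_over_time({app="buildkite",branch="main",state="failed"}[7d]))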
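
For readers who want to run queries of this shape outside Grafana, the following sketch builds a query string and a URL for Loki's `GET /loki/api/v1/query_range` endpoint, and reads the latest sample per series from a matrix response. The function names and the base URL are hypothetical; only the endpoint path, the query text, and the response layout come from Loki itself.

```python
from urllib.parse import urlencode


def failed_count_query(group_by, window, branch="main"):
    """Build a LogQL query in the shape of the examples above."""
    selector = f'{{app="buildkite",branch="{branch}",state="failed"}}'
    return f"count by ({', '.join(group_by)}) (count_over_time({selector}[{window}]))"


def query_range_url(base_url, query, start_ns, end_ns):
    """Build a URL for Loki's GET /loki/api/v1/query_range endpoint."""
    params = urlencode({"query": query, "start": start_ns, "end": end_ns})
    return f"{base_url}/loki/api/v1/query_range?{params}"


def latest_values(response):
    """Map each series' label set to its most recent sample in a matrix result."""
    out = {}
    for series in response["data"]["result"]:
        key = tuple(sorted(series["metric"].items()))
        # Each sample is [timestamp, "<value as string>"]; take the last one.
        out[key] = float(series["values"][-1][1])
    return out
```

For example, `failed_count_query(["build", "name"], "5m")` reproduces the first query above, and the URL can be fetched with any HTTP client that supplies whatever authentication the Loki instance requires.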

We also mention this briefly in our newsletter: https://handbook.sourcegraph.com/departments/product-engineering/engineering/enablement/dev-experience/newsletter/#nov-23-2021

What it looks like today:

[two dashboard screenshots]