etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

Document how to debug performance of prow job #18707

Open serathius opened 3 days ago

serathius commented 3 days ago

What would you like to be added?

In robustness tests we noticed a series of flakes, and we wanted to rule out performance as a factor. @ah8ad3 reached out to Kubernetes prow and found interesting dashboards: https://monitoring-gke.prow.k8s.io/d/96Q8oOOZk/builds?orgId=1&refresh=30s&var-org=etcd-io&var-repo=etcd&var-job=ci-etcd-robustness-main-amd64&var-build=All&from=now-7d&to=now
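The link above encodes the org, repo, job, and time range as query parameters, so the same dashboard can presumably be pointed at other etcd jobs by changing `var-job`. A minimal sketch, assuming the dashboard accepts the same parameters for any job; the helper name `builds_dashboard_url` is hypothetical, and the parameter names are copied from the URL above:

```python
from urllib.parse import urlencode

# Base path of the builds dashboard, taken from the link in this issue.
BASE = "https://monitoring-gke.prow.k8s.io/d/96Q8oOOZk/builds"


def builds_dashboard_url(job, org="etcd-io", repo="etcd", build="All", since="now-7d"):
    """Build a dashboard link for one prow job (hypothetical helper)."""
    # Parameter names mirror the query string of the original link;
    # whether other values are accepted is an assumption.
    params = {
        "orgId": 1,
        "refresh": "30s",
        "var-org": org,
        "var-repo": repo,
        "var-job": job,
        "var-build": build,
        "from": since,
        "to": "now",
    }
    return f"{BASE}?{urlencode(params)}"


print(builds_dashboard_url("ci-etcd-robustness-main-amd64"))
```

Called with `ci-etcd-robustness-main-amd64`, this reproduces the link above; other job names would have to be checked against the dashboard itself.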

We should preserve this knowledge and document how etcd contributors can investigate the performance of prow jobs. One option is a new section about prow in the etcd contributor documentation that links either to the dashboard or to the relevant prow documentation.

Why is this needed?

Make it easier for others to debug etcd job performance issues.

serathius commented 3 days ago

cc @ah8ad3

serathius commented 3 days ago

cc @jmhbnz @ivanvc

ivanvc commented 3 days ago

I agree. There's also an EKS dashboard, but these dashboards are not documented anywhere in the infra testing repo. I also found out about it by asking on Slack.