Closed spiffxp closed 8 months ago
FYI @kubernetes/ci-signal if anyone is interested in this
/help I am willing to give someone appropriate credentials to develop this workflow, answer questions, and review PRs
/area infra/monitoring
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
/remove-lifecycle stale /lifecycle frozen Keeping this around as a good help-wanted issue, unless/until we decide on some other easily "git-ops-able" dashboard solution
The UI makes import/export much easier these days, and we now export dashboards as part of audit
So we still can't immediately recreate dashboards at the run of a script for DR purposes, but we're better off than we used to be
Could also try using terraform, as is being done here: https://github.com/GoogleCloudPlatform/oss-test-infra/tree/master/prow/oss/terraform
There is an intent to eventually move away from Grafana for monitoring.prow.k8s.io if it continues to run on google.com infra for too much longer, due to a license change. This would be a good opportunity to prototype
/milestone v1.23
There is an intent to eventually move away from Grafana for monitoring.prow.k8s.io if it continues to run on google.com infra for too much longer, due to a license change. This would be a good opportunity to prototype
Is the license change about Grafana's switch to AGPLv3?
I am interested in working on this @spiffxp /assign
/remove-sig testing
/remove-help /assign At this point the only dashboards I'm aware of are in k8s-infra-prow-build, and I've got a PR out to update those via terraform now: https://github.com/kubernetes/k8s.io/pull/2938
/milestone clear
We are not doing this anymore. We have public monitoring dashboards for our build environments:
/close
@ameukam: Closing this issue.
The use case here is ensuring we don't lose dashboards. By backing them up into git, we can keep track of when/why they have changed, and restore them if they're deleted from google cloud monitoring.
I really wish the UI provided import/export options to make this easier. That said, it is possible via the API or gcloud: https://cloud.google.com/blog/products/management-tools/cloud-monitoring-dashboards-using-an-api
One workflow could be:
gcloud monitoring dashboards describe
gcloud monitoring dashboards update
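For illustration, the describe/update pair above could be wrapped in a small script. This is a rough sketch, not a tested tool. The `dashboards/` directory layout and the function names are hypothetical. It assumes `gcloud monitoring dashboards list` is used to enumerate dashboards and that `update` accepts the exported YAML through `--config-from-file`.

```shell
# Sketch only: export Cloud Monitoring dashboards into files that can be
# committed to git. The dashboards/ layout and function names are hypothetical.

# Map a dashboard resource name to a local file path, e.g.
# projects/<project>/dashboards/<id> -> dashboards/<id>.yaml
backup_path() {
  printf 'dashboards/%s.yaml\n' "${1##*/}"
}

# Export every dashboard in a project, one YAML file per dashboard.
backup_dashboards() {
  local project="$1" names name
  mkdir -p dashboards || return 1
  names=$(gcloud monitoring dashboards list --project="$project" --format='value(name)') || return 1
  for name in $names; do
    gcloud monitoring dashboards describe "$name" --project="$project" --format=yaml \
      > "$(backup_path "$name")" || return 1
  done
}

# Push one exported file back. The file must still carry the dashboard's
# current etag, otherwise the API rejects the update.
restore_dashboard() {
  local project="$1" file="$2" name
  name=$(sed -n 's/^name: //p' "$file")
  gcloud monitoring dashboards update "$name" --project="$project" --config-from-file="$file"
}
```

A run might look like `backup_dashboards k8s-infra-prow-build && git add dashboards && git commit -m "Back up dashboards"`. Restoring a dashboard that was deleted outright would need `create` rather than `update`, which this sketch does not cover.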
This workflow could also allow:
An alternative workflow is:
And finally, it may be possible to automate away the etag toil iff we are smart/consistent about when to avoid stomping over unexpected changes.
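One way the etag handling might be automated, sketched under assumptions: treat the copy last applied from git as the expected state, and only overwrite the live dashboard when it still matches that copy apart from its etag. The function names are hypothetical. The sketch also assumes the YAML from `describe` is stable enough between runs for a plain text comparison, which is not verified here.

```shell
# Sketch only: update a dashboard from git without hand-editing etags.
# Function names are hypothetical; assumes top-level "name:" and "etag:" keys
# in the YAML produced by `gcloud monitoring dashboards describe --format=yaml`.

# Print a dashboard file without its top-level etag line.
strip_etag() {
  grep -v '^etag:' "$1"
}

# safe_update PROJECT DESIRED_FILE LAST_APPLIED_FILE
# Refuse to update if the live dashboard no longer matches what was last applied.
safe_update() {
  local project="$1" desired="$2" last_applied="$3" name live merged status
  name=$(sed -n 's/^name: //p' "$desired")
  live=$(mktemp) || return 1
  merged=$(mktemp) || return 1
  gcloud monitoring dashboards describe "$name" --project="$project" --format=yaml \
    > "$live" || { rm -f "$live" "$merged"; return 1; }

  if [ "$(strip_etag "$live")" != "$(strip_etag "$last_applied")" ]; then
    echo "refusing to update $name: live dashboard has unexpected changes" >&2
    rm -f "$live" "$merged"
    return 1
  fi

  # Carry the live etag over to the desired config so the update is accepted.
  { strip_etag "$desired"; grep '^etag:' "$live"; } > "$merged"
  gcloud monitoring dashboards update "$name" --project="$project" --config-from-file="$merged"
  status=$?
  rm -f "$live" "$merged"
  return "$status"
}
```

The last-applied copy could come from something like `git show HEAD~1:dashboards/<id>.yaml`, with the working-tree file as the desired config. That convention is an assumption of this sketch, not something the tooling enforces.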
/priority important-longterm /wg k8s-infra /sig testing since the dashboards I have in mind are for k8s-infra-prow-build
Specifically, I have these two dashboards in mind (must be member of k8s-infra-prow-viewers@kubernetes.io to view, feel free to PR yourself in if you would like)