google / exposure-notifications-verification-server

Verification component for COVID-19 Exposure Notifications.
Apache License 2.0

Metrics used in dashboards should be declared in the Terraform config #760

Closed: yegle closed this issue 4 years ago

yegle commented 4 years ago

Until now, all of our metrics have been created automatically by the Stackdriver exporter the first time we export a data point for that metric.

This has led to multiple cases where:

  1. we use a new metric in the dashboards
  2. the metric is only exported by a new version of a Cloud Run service, so it does not exist yet
  3. `terraform apply` fails to create the dashboard, which prevents the Cloud Run service from being deployed.

Adding a dependency between the dashboard and Cloud Run service doesn't solve the problem:

  1. The alerting/monitoring related Terraform config is in a separate Terraform module, as it's considered an optional part of the overall Terraform config.
  2. Even if we could add the dependency, the metric would only be created when the Cloud Run service produces the first data point for that metric, which can take a while if the code path that produces the data point is rarely or never hit.

We should fix this issue by also managing MetricDescriptor resources in Terraform, at least those that are referenced by the dashboard resources.
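A rough sketch of what that could look like, using the `google_monitoring_metric_descriptor` resource. The metric type, labels, variable name, and dashboard JSON path below are hypothetical placeholders, not names from this repo. The kind, value type, and labels would need to match what the exporter actually writes, since I'd expect writes that disagree with a pre-declared descriptor to be rejected.

```hcl
# Hypothetical example: declare the metric up front so that the dashboard
# can be created before any data point has been exported.
resource "google_monitoring_metric_descriptor" "example_request_count" {
  project      = var.project # assumed variable name
  type         = "custom.googleapis.com/opencensus/example/request_count" # placeholder metric type
  display_name = "Example request count"
  description  = "Number of example requests, declared ahead of the first export."
  metric_kind  = "CUMULATIVE"
  value_type   = "INT64"
  unit         = "1"

  # Placeholder label; must mirror the tag keys the exporter attaches.
  labels {
    key         = "result"
    value_type  = "STRING"
    description = "Outcome of the request."
  }
}

# Hypothetical dashboard that references the metric above. Both resources
# live in the monitoring module, so no cross-module dependency on the
# Cloud Run service is needed.
resource "google_monitoring_dashboard" "example" {
  project        = var.project
  dashboard_json = file("${path.module}/dashboards/example.json") # placeholder path

  depends_on = [
    google_monitoring_metric_descriptor.example_request_count,
  ]
}
```

With this, the explicit `depends_on` is what orders dashboard creation after the descriptor, because the dashboard JSON only refers to the metric by its type string and Terraform cannot infer the dependency from it.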

yegle commented 4 years ago

https://www.terraform.io/docs/providers/google/r/monitoring_metric_descriptor.html for reference.

icco commented 4 years ago

/kind bug