Pyrra causes instability of Prometheus

We see the following errors in the Pyrra pod logs (newest entries first):


| Time | Node / Pod | Log line |
| -- | -- | -- |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.603386975Z stderr F 2024-04-17T21:44:32Z    ERROR   Reconciler error    {"controller": "servicelevelobjective", "controllerGroup": "pyrra.dev", "controllerKind": "ServiceLevelObjective", "ServiceLevelObjective": {"name":"inmusicprofile-authorised-devices","namespace":"monitoring"}, "namespace": "monitoring", "name": "inmusicprofile-authorised-devices", "reconcileID": "71068fe1-4d2e-4c25-a4c0-68569c4f60c3", "error": "failed to update prometheus rule: prometheusrules.monitoring.coreos.com \"inmusicprofile-authorised-devices\" is invalid: metadata.resourceVersion: Invalid value: 0x0: must be specified for an update"}` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.586588632Z stderr F level=info ts=2024-04-17T21:44:32.584682542Z caller=servicelevelobjective.go:89 controllers=ServiceLevelObjective reconciler=servicelevelobjective namespace=monitoring/inmusicprofile-authorised-devices msg="updating prometheus rule" namespace= name=` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.486439391Z stderr F level=info ts=2024-04-17T21:44:32.486105827Z caller=servicelevelobjective.go:78 controllers=ServiceLevelObjective reconciler=servicelevelobjective namespace=monitoring/inmusicprofile-authorised-devices msg="creating prometheus rule" namespace= name=` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.3001328Z stderr F level=info ts=2024-04-17T21:44:32.298281629Z caller=servicelevelobjective.go:89 controllers=ServiceLevelObjective reconciler=servicelevelobjective namespace=monitoring/inmusicprofile-device-auth-rest-api msg="updating prometheus rule" namespace=monitoring name=inmusicprofile-device-auth-rest-api` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.251899805Z stderr F level=info ts=2024-04-17T21:44:32.251727212Z caller=servicelevelobjective.go:89 controllers=ServiceLevelObjective reconciler=servicelevelobjective namespace=monitoring/inmusicprofile-device-auth-rest-api msg="updating prometheus rule" namespace=monitoring name=inmusicprofile-device-auth-rest-api` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.246897311Z stderr F     sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:227` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.246892669Z stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.246887735Z stderr F     sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:266` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.24688296Z stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.246877871Z stderr F     sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:329` |
| 2024-04-18 09:44:32 | ip-10-0-35-81.ec2.internal pyrra-56f8db5b5-tpdcc | `2024-04-17T21:44:32.246870407Z stderr F sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler` |
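
For readers unfamiliar with the `metadata.resourceVersion ... must be specified for an update` message: the API server rejects an update whose object carries no resource version, because the version is how it detects concurrent modification. The sketch below is a toy in-memory model of that rule, not Kubernetes or Pyrra code. `ToyStore`, `InvalidUpdate`, `Conflict`, and `create_or_update` are hypothetical names introduced only for illustration, and nothing here claims to identify the root cause in Pyrra.

```python
"""Toy model of versioned updates; this is not Kubernetes or Pyrra code."""


class InvalidUpdate(Exception):
    """Raised when an update carries no resourceVersion."""


class Conflict(Exception):
    """Raised when an update carries a stale resourceVersion."""


class ToyStore:
    """In-memory stand-in for an API server that versions every object."""

    def __init__(self):
        self._objects = {}  # (namespace, name) -> stored object
        self._counter = 0   # source of increasing version numbers

    def _next_version(self):
        self._counter += 1
        return str(self._counter)

    def create(self, obj):
        key = (obj["namespace"], obj["name"])
        if key in self._objects:
            raise ValueError(f"{key} already exists")
        stored = dict(obj, resourceVersion=self._next_version())
        self._objects[key] = stored
        return dict(stored)

    def get(self, namespace, name):
        # Raises KeyError when the object does not exist.
        return dict(self._objects[(namespace, name)])

    def update(self, obj):
        key = (obj["namespace"], obj["name"])
        current = self._objects[key]
        version = obj.get("resourceVersion", "")
        if not version:
            raise InvalidUpdate("metadata.resourceVersion: must be specified for an update")
        if version != current["resourceVersion"]:
            raise Conflict("object was modified; fetch the latest version and retry")
        stored = dict(obj, resourceVersion=self._next_version())
        self._objects[key] = stored
        return dict(stored)


def create_or_update(store, desired):
    """Create the object, or update it using the live object's version."""
    try:
        existing = store.get(desired["namespace"], desired["name"])
    except KeyError:
        return store.create(desired)
    # Carry the live version over so the update passes validation.
    return store.update(dict(desired, resourceVersion=existing["resourceVersion"]))
```

In this model, calling `store.update()` with a freshly built object (one that was never read back from the store) raises `InvalidUpdate`, which mirrors the shape of the error in the log above. Reading the live object first and copying its version avoids it.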

The SLO is defined as follows:

apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  name: inmusicprofile-authorised-devices
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
    pyrra.dev/team: webservices
    pyrra.dev/ns: inmusicprofile
    pyrra.dev/service: AuthorisedDevicesService
    pyrra.dev/tier: "4"
spec:
  target: "99"
  window: 4w
  description: AuthorisedDevicesService public endpoints.
  indicator:
    ratio:
      errors:
        metric: traces_spanmetrics_latency_count{span_name=~"inmusicapi\\.v1\\.AuthorisedDevicesService\\/.*", status_code="STATUS_CODE_ERROR"}
      total:
        metric: traces_spanmetrics_latency_count{span_name=~"inmusicapi\\.v1\\.AuthorisedDevicesService\\/.*"}
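
As a quick sanity check of what `target: "99"` over `window: 4w` means, the sketch below works out the error budget using the usual SLO arithmetic (budget = 1 - target). The helper names `parse_window` and `error_budget` are hypothetical, the parser only handles single-unit durations such as `4w`, and Pyrra's own calculation may differ in detail.

```python
from datetime import timedelta

# Units this sketch accepts in a window string such as "4w".
_UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_window(window):
    """Parse a single-unit duration such as '4w' or '30d' into a timedelta."""
    value, unit = int(window[:-1]), window[-1]
    return timedelta(**{_UNITS[unit]: value})


def error_budget(target_percent, window):
    """Return (allowed error ratio, equivalent time at a 100% error rate)."""
    ratio = (100 - float(target_percent)) / 100
    return ratio, parse_window(window) * ratio


ratio, budget = error_budget("99", "4w")
print(f"allowed error ratio: {ratio:.2%}")
print(f"budget if every request failed: {budget}")
```

For this SLO, up to 1% of `AuthorisedDevicesService` spans may have `status_code="STATUS_CODE_ERROR"` within any 4-week window, which corresponds to 6 hours 43 minutes 12 seconds of total failure.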

pyrra-dev/pyrra#1149