cloudflare / pint

Prometheus rule linter/validator
https://cloudflare.github.io/pint/
Apache License 2.0

Support for PrometheusRule contained in larger YAML manifest #746

Closed felipesere closed 7 months ago

felipesere commented 1 year ago

We use helm to produce a single, large manifest.yaml for each application, and that file contains many YAML documents. When running pint against it, I expect it to find the documents with kind: PrometheusRule and apply the lints there. At the moment, it just silently fails. If I extract the PrometheusRules from that manifest.yaml and run pint on them individually, it correctly finds issues.

Here is our pint config:

parser {
  relaxed = ["(.*)", "yaml/parse"]
}
checks {
  disabled = ["promql/fragile"] # some upstream kubernetes-mixins use without and it's their problem
}
# truelayer invariants
rule {
  match {
    kind = "alerting"
  }
  # all alerts must have a standard severity
  label "severity" {
    required = true
    value    = "error|critical|warning|info"
  }
  # all alerts should have owners as a label
  label "owners" {
    required = true
  }
}

prymitive commented 1 year ago

Unless you can provide a sample file there's not much we can do here.

felipesere commented 8 months ago

👋 Circling back to this. I have a small file that easily reproduces the issue I face:

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deployment-notifier
  namespace: tools
spec:
  podSelector:
    matchLabels:
      app: deployment-notifier
  policyTypes:
  - Ingress
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: opentelemetry-operator-system
    ports:
    - port: 1
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: deployment-notifier
  namespace: tools
spec:
  groups:
  - name: ./deployment-notifier.rules
    rules:
    - alert: DeploymentNotifierSwarmiaProviderErrors
      annotations:
        summary: Something something...
      expr: 1 > 0
      for: 10m
      labels:
        severity: not-that-critical

The YAML above is a snippet of what my helm chart produces. If I keep the order as is, I get zero lint failures with the config from my previous comment. If I change the order of the documents so that the PrometheusRule comes first, I get the expected lint error.

prymitive commented 8 months ago

That's because the current YAML parser doesn't correctly read all documents from a multi-document file. If you have a file with:

---
some:
  yaml
---
some more:
  yaml

then only the first document (some: yaml) is read. That's a bug and I hope to fix it one day, but I don't have much time for it at the moment.
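
For readers hitting the same problem, the workaround mentioned at the top of the thread (extracting the PrometheusRule documents and linting them separately) could be sketched roughly as follows. This is not how pint parses files. It is a plain-text split that assumes documents are separated by lines containing only `---` and that `kind:` appears at the top level of each document. The names `split_documents` and `prometheus_rules` are hypothetical helpers made up for this illustration.

```python
import re

# Assumption: a separator is a line holding only '---' (optionally followed
# by spaces or tabs). A '---' line inside a block scalar would be mis-split.
DOC_SEPARATOR = re.compile(r"^---[ \t]*$", re.MULTILINE)

# Assumption: 'kind:' sits at column 0, i.e. it is a top-level key.
KIND_RULE = re.compile(r"^kind:[ \t]*PrometheusRule[ \t]*$", re.MULTILINE)


def split_documents(manifest: str) -> list[str]:
    """Split a multi-document YAML string on '---' separator lines."""
    parts = DOC_SEPARATOR.split(manifest)
    # Drop empty chunks, such as the one before a leading '---'.
    return [part.strip("\n") for part in parts if part.strip()]


def prometheus_rules(manifest: str) -> list[str]:
    """Return only the documents whose top-level kind is PrometheusRule."""
    return [doc for doc in split_documents(manifest) if KIND_RULE.search(doc)]


# Trimmed-down version of the manifest from the report above.
MANIFEST = """\
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deployment-notifier
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: deployment-notifier
spec:
  groups: []
"""

docs = split_documents(MANIFEST)
rules = prometheus_rules(MANIFEST)
print(len(docs), len(rules))
```

Each string in `rules` could then be written to its own file and linted individually, which the original report says works. A real YAML library that iterates over every document in the stream would be more robust than this text-based split.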

prymitive commented 7 months ago

Fixed in #894