grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Alloy is not de-duplicating targets #1418

Closed. nis-thac closed this issue 1 month ago.

nis-thac commented 2 months ago

What's wrong?

As reported by me in https://github.com/grafana/k8s-monitoring-helm/issues/645. It was suggested there that Alloy should be de-duplicating the targets.

Steps to reproduce

1) Deploy the k8s-monitoring chart.
2) Add scrape annotations to a pod with multiple ports exposed (see the example manifest after this list).
3) The metrics for that pod will be scraped as many times per minute as there are ports: 5 ports = 5 scrapes per minute. Expected would be 1 scrape per minute, no matter how many ports.
4) See the Grafana Alloy UI: the exact same targets will be present multiple times.
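For illustration, a minimal pod manifest that reproduces this. The name, image, and port numbers are made up for the example; only the annotation keys match the chart configuration below:

apiVersion: v1
kind: Pod
metadata:
  name: example-app                  # hypothetical name
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8080"       # only this port should be scraped
spec:
  containers:
    - name: app
      image: example/app:latest      # hypothetical image
      ports:
        - containerPort: 8080        # metrics port
        - containerPort: 8081
        - containerPort: 8082        # each extra port shows up as another identical target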

System information

No response

Software version

Grafana Alloy 1.2.0

Configuration

cluster:
  name: redacted
externalServices:
  prometheus:
    host: https://redacted.grafana.net
    basicAuth:
      username: "redacted"
      password: "redacted"
  loki:
    host: https://redacted.grafana.net
    basicAuth:
      username: "redacted"
      password: "redacted"
  tempo:
    host: https://redacted.grafana.net:443
    basicAuth:
      username: "redacted"
      password: "redacted"
metrics:
  enabled: true
  # this has cost implications
  # https://grafana.com/docs/grafana-cloud/cost-management-and-billing/reduce-costs/metrics-costs/adjust-data-points-per-minute/
  scrapeInterval: "60s"
  cost:
    enabled: true
  node-exporter:
    enabled: true
  api-server:
    enabled: true
  autoDiscover:
    # https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/k8s-monitoring#metrics-job-auto-discovery
    annotations:
      #instance: ""
      #job: ""
      metricsPath: "prometheus.io/path"
      #metricsPortName: ""
      metricsPortNumber: "prometheus.io/port"
      #metricsScheme: ""
      metricsScrapeInterval: "prometheus.io/scrape-interval"
      scrape: "prometheus.io/scrape"
    extraRelabelingRules: |
      rule {
        source_labels = ["__meta_kubernetes_namespace"]
        action = "replace"
        target_label = "namespace"
      }

logs:
  enabled: true
  pod_logs:
    enabled: true
  cluster_events:
    enabled: true
traces:
  enabled: false
receivers:
  deployGrafanaAgentService: false
  grpc:
    enabled: true
  http:
    enabled: true
  zipkin:
    enabled: false
  grafanaCloudMetrics:
    enabled: true
opencost:
  enabled: true
  opencost:
    exporter:
      defaultClusterId: redacted
    prometheus:
      external:
        url: https://redacted.grafana.net/api/prom
kube-state-metrics:
  enabled: true
prometheus-node-exporter:
  enabled: true
prometheus-operator-crds:
  enabled: true
configValidator:
  enabled: false
alloy:
  alloy:
    clustering:
      enabled: true
alloy-events: {}
alloy-logs: {}
extraConfig: ""

Logs

No response

venkatb-zelar commented 2 months ago

Hi @nis-thac, any update on this, please?

nis-thac commented 2 months ago

@venkatb-zelar See this helpful comment on my issue in grafana/k8s-monitoring-helm.
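For anyone hitting this in the meantime, a rough workaround sketch (my own guess, not necessarily what that comment suggests; it assumes the keepequal relabel action is available in this Alloy version and that the auto-discovery job only targets pods carrying the port annotation) is to keep only the target whose container port matches the annotated port, via the chart's extraRelabelingRules:

metrics:
  autoDiscover:
    extraRelabelingRules: |
      // Sketch: drop duplicate targets by keeping only the one whose container
      // port equals the prometheus.io/port annotation on the pod.
      rule {
        source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_port"]
        target_label  = "__meta_kubernetes_pod_container_port_number"
        action        = "keepequal"
      }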

venkatb-zelar commented 2 months ago

@nis-thac I guess something is fundamentally wrong with my configuration. When we run Alloy in daemonset mode, should we enable clustering? Otherwise, will Alloy jobs be duplicated across nodes?

nis-thac commented 2 months ago

@venkatb-zelar Sorry, I have no experience with daemonset mode. I don't think clustering is part of the problem, though.

davordbetter commented 1 month ago

I came across the same issue. Pods with multiple ports are scraped 4 times per minute (4 ports exposed) despite having the same URL for Prometheus metrics (address replaced).

petewall commented 1 month ago

This was an issue in the k8s-monitoring Helm chart. Please upgrade to release 1.5.2 and try again.