grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Running Alloy for one scrape target results in no metrics. #1871

Open st-akorotkov opened 1 month ago

st-akorotkov commented 1 month ago

What's wrong?

I'm trying to run Grafana Alloy for a third-party service, replacing an existing Grafana Agent with it. Grafana Agent scrapes the target fine, but Alloy, although it discovers the ServiceMonitor, is not sending metrics to Mimir.

Steps to reproduce

Replace Agent with Alloy.

System information

No response

Software version

Grafana Alloy 1.4.1
Grafana Agent 0.33.1

Configuration

MetricsInstance for Agent

kind: MetricsInstance
metadata:
  name: exteranl
  namespace: monitoring
  labels:
    agent: grafana-agent-external
spec:
  remoteWrite:
    - url: http://<mimir-url>/api/v1/push
      headers:
        X-Scope-OrgID: <tenant>
      writeRelabelConfigs:
        - sourceLabels:
            - __address__
          targetLabel: k8s_cluster
          replacement: <tenant>
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      grafana-agent: external
  podMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      grafana-agent: external
  probeNamespaceSelector: {}
  probeSelector:
    matchLabels:
      grafana-agent: external

Alloy values for helm chart

controller:
  type: statefulset
  replicas: 1
alloy:
  extraEnv:
    - name: KUBE_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
  configMap:
    content: |
      prometheus.remote_write "primary" {
        external_labels = {
          "k8s_cluster" = "<tenant>",
          "cluster"     = "<tenant>",
          "__replica__" = env("KUBE_POD_NAME"),
        }
        endpoint {
          url = "http://<mimir-url>/api/v1/push"
          headers = {
            "X-Scope-OrgId" = "<tenant>",
          }
          queue_config {
            capacity = 200
            max_shards = 2
          }
        }
      }

      prometheus.operator.podmonitors "primary" {
        forward_to = [prometheus.remote_write.primary.receiver]
        selector {
          match_expression {
            key = "grafana-agent"
            operator = "In"
            values = ["external"]
          }
        }
      }

      prometheus.operator.servicemonitors "primary" {
        forward_to = [prometheus.remote_write.primary.receiver]
        selector {
          match_expression {
            key = "grafana-agent"
            operator = "In"
            values = ["external"]
          }
        }
      }
crds:
  create: false
serviceMonitor:
  enabled: true

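As an aside, the Agent's `matchLabels` selector can be expressed more directly in Alloy with the `match_labels` attribute instead of a `match_expression` block. A minimal sketch, which should select the same ServiceMonitors as the configuration above:

```
prometheus.operator.servicemonitors "primary" {
  forward_to = [prometheus.remote_write.primary.receiver]
  selector {
    // Equivalent to the Agent's matchLabels selector.
    match_labels = {"grafana-agent" = "external"}
  }
}
```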
Logs

No response

ptodev commented 1 month ago

Hi, thank you for submitting a bug report! Are there any errors or warnings in the logs?

st-akorotkov commented 1 month ago

None at all. All components are green in the UI.

github-actions[bot] commented 1 week ago

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it. If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue. The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity. Thank you for your contributions!

st-akorotkov commented 1 week ago

I had the same issue with Grafana Agent recently. Switching to a new Mimir tenant showed that metrics are collected and received, and switching the tenant back somehow fixed the issue. It looks to me like this is either an issue in the underlying Prometheus code or in Mimir HA deduplication.

ptodev commented 1 week ago

Would you be interested in using clustering instead of the Prometheus-style HA setup?
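For reference, clustering in the grafana/alloy Helm chart is enabled with values along these lines. A sketch under the assumption that the chart's `alloy.clustering.enabled` option is used; clustered instances shard scrape targets among themselves instead of duplicating them, so the `__replica__` external label and Mimir HA deduplication are no longer needed:

```
controller:
  type: statefulset
  replicas: 2
alloy:
  clustering:
    enabled: true
```

The `prometheus.operator.servicemonitors` component would then also need `clustering { enabled = true }` in its block so target ownership is distributed across the cluster.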