grafana / agent

Vendor-neutral programmable observability pipelines.
https://grafana.com/docs/agent/
Apache License 2.0

kubernetes_sd_configs is not working #6175

Closed. mvkrishna86 closed this issue 10 months ago.

mvkrishna86 commented 10 months ago

What's wrong?

kubernetes_sd_configs is not working: I am not able to get pod logs when using kubernetes_sd_configs.

Steps to reproduce

Below is the config:

    logs:
      positions_directory: /tmp/loki-positions 
      configs:
        - name: agent
          clients:
            - url: https://<LOKI_URL>/api/prom/push
              tenant_id: XXXXX
              external_labels:
                cluster: XXXXX
                region: ap-south-1
          scrape_configs:
            - job_name: kubernetes-pods-app
              kubernetes_sd_configs:
                - role: pod
              pipeline_stages:
                - docker: {}
              relabel_configs:
                - action: replace
                  source_labels:
                    - __meta_kubernetes_pod_label_app
                  target_label: app
                - action: replace
                  source_labels:
                    - __meta_kubernetes_pod_label_component
                  target_label: component
                - action: replace
                  source_labels:
                    - __meta_kubernetes_pod_node_name
                  target_label: node
                - action: replace
                  source_labels:
                    - __meta_kubernetes_namespace
                  target_label: namespace
                - action: replace
                  replacement: $1
                  separator: /
                  source_labels:
                    - namespace
                    - app
                  target_label: job
                - action: replace
                  source_labels:
                    - __meta_kubernetes_pod_name
                  target_label: pod
                - action: replace
                  source_labels:
                    - __meta_kubernetes_pod_container_name
                  target_label: container
                - action: replace
                  replacement: /var/log/pods/$1/*.log
                  separator: /
                  source_labels:
                    - __meta_kubernetes_pod_uid
                    - __meta_kubernetes_pod_container_name
                  target_label: __path__

    root@brane-uat-c22742-worker-4:/# curl localhost:12345/agent/api/v1/logs/instances
    {"status":"success","data":["agent"]}

    root@brane-uat-c22742-worker-4:/# curl localhost:12345/agent/api/v1/logs/targets
    {"status":"success","data":[]}

System information

No response

Software version

Grafana Agent v0.36.1

Configuration

No response

Logs

No response

hainenber commented 10 months ago

Can you share the YAMLs for the following resources: the DaemonSet, ServiceAccount, ClusterRole, and ClusterRoleBinding?

My hunch is that there's a permission misconfiguration 😁
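
A quick way to check that hunch, assuming the ServiceAccount name and namespace shown later in this thread (grafana-agent in observability):

    # Both should print "yes" if the ClusterRoleBinding grants the agent's
    # ServiceAccount the pod discovery permissions it needs.
    kubectl auth can-i list pods --as=system:serviceaccount:observability:grafana-agent
    kubectl auth can-i watch pods --as=system:serviceaccount:observability:grafana-agent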

mvkrishna86 commented 10 months ago

AFAIK, those configs are fine, @hainenber. Below are the files.

Daemonset:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "9"
    reloader.stakater.com/auto: "true"
  creationTimestamp: "2024-01-16T13:53:38Z"
  generation: 9
  labels:
    app: grafana-agent
    app.kubernetes.io/instance: grafana-agent
  name: grafana-agent
  namespace: observability
  resourceVersion: "32677495"
  uid: a8f5f958-18a0-4878-a60f-c31f8b3d4a88
spec:
  minReadySeconds: 10
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: grafana-agent
      name: grafana-agent
  template:
    metadata:
      annotations:
        reloader.stakater.com/last-reloaded-from: '{"type":"CONFIGMAP","name":"grafana-agent-daemonset-configmap","namespace":"observability","hash":"83c124ffb41ec32d16e80499d4672270fb9dffd0","containerRefs":["grafana-agent"],"observedAt":1705495986}'
      creationTimestamp: null
      labels:
        app: grafana-agent
        name: grafana-agent
    spec:
      containers:
      - args:
        - -config.file=/etc/agent/daemonset-config.yaml
        command:
        - /bin/grafana-agent
        env:
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: HTTP_PROXY
          value: http://XXXX:9999/
        - name: HTTPS_PROXY
          value: http://XXXX:9999/
        image: public.ecr.aws/XXXX/agent:v0.36.1
        imagePullPolicy: IfNotPresent
        name: grafana-agent
        ports:
        - containerPort: 8080
          hostPort: 8080
          name: http-metrics
          protocol: TCP
        - containerPort: 6831
          hostPort: 6831
          name: thrift-compact
          protocol: UDP
        - containerPort: 6832
          hostPort: 6832
          name: thrift-binary
          protocol: UDP
        - containerPort: 14268
          hostPort: 14268
          name: thrift-http
          protocol: TCP
        - containerPort: 14250
          hostPort: 14250
          name: thrift-grpc
          protocol: TCP
        - containerPort: 9411
          hostPort: 9411
          name: zipkin
          protocol: TCP
        - containerPort: 55680
          hostPort: 55680
          name: otlp
          protocol: TCP
        resources: {}
        securityContext:
          privileged: true
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/agent
          name: grafana-agent-daemonset-configmap
        - mountPath: /var/log
          name: varlog
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /etc/machine-id
          name: etcmachineid
          readOnly: true
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: grafana-agent
      serviceAccountName: grafana-agent
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - configMap:
          defaultMode: 420
          name: grafana-agent-daemonset-configmap
        name: grafana-agent-daemonset-configmap
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - hostPath:
          path: /etc/machine-id
          type: ""
        name: etcmachineid
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate

ServiceAccount:

apiVersion: v1
kind: ServiceAccount
metadata:
  creationTimestamp: "2023-12-07T14:36:36Z"
  labels:
    app.kubernetes.io/instance: grafana-agent
  name: grafana-agent
  namespace: observability

ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/instance: grafana-agent
  name: grafana-agent
  resourceVersion: "1947912"
  uid: 8c4ac0f6-f15a-4725-b41b-1b7c1e818760
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get

ClusterRoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/instance: grafana-agent
  name: grafana-agent
  resourceVersion: "32105710"
  uid: 4960e7ab-cbb5-455c-813e-76d824595b61
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: grafana-agent
subjects:
- kind: ServiceAccount
  name: grafana-agent
  namespace: observability

mvkrishna86 commented 10 months ago

Found the issue: I had set HTTP_PROXY for a different reason, and it was blocking the calls to kubernetes.default.svc. I added NO_PROXY and the problem is solved.
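
For anyone hitting the same thing, a minimal sketch of what the fix looks like in the DaemonSet's container env, assuming the NO_PROXY entries below (the exact value is not from this thread and depends on your cluster's service domain and CIDRs):

    env:
    - name: HTTP_PROXY
      value: http://XXXX:9999/
    - name: HTTPS_PROXY
      value: http://XXXX:9999/
    # Assumed NO_PROXY value: keep in-cluster traffic (Kubernetes API,
    # *.svc / *.cluster.local names, service CIDR) off the proxy.
    - name: NO_PROXY
      value: kubernetes.default.svc,.svc,.cluster.local,10.0.0.0/8

Without NO_PROXY, the agent's service-discovery requests to the Kubernetes API (kubernetes.default.svc) were routed through the proxy, which is why /agent/api/v1/logs/targets stayed empty even though the config and RBAC were correct.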