elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

Filebeat 8.12.0 does not collect kubernetes.deployment.name field of Kubernetes Pods logs #37684

Closed RomanIzvozchikov closed 4 months ago

RomanIzvozchikov commented 7 months ago

Hi all!

Problem: Filebeat 8.12.0 does not collect the kubernetes.deployment.name field for Kubernetes Pod logs, although it does collect, for example, kubernetes.statefulset.name and kubernetes.daemonset.name. Pod logs in general are collected; the only problem is the missing kubernetes.deployment.name field for Pods controlled by a Deployment (more precisely, for Pods controlled by ReplicaSets that are in turn controlled by Deployments). I noticed that kubernetes.deployment.name is collected correctly by Filebeat 7.17.16 with the same configuration. There are no errors in the Filebeat logs of either 8.12.0 or 7.17.16.
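For context, resolving the Deployment name takes two hops, since a Deployment owns a ReplicaSet, which owns the Pods. A minimal, illustrative Python sketch (the object data and function name are made up, standing in for Kubernetes API responses) of following that chain via ownerReferences:

```python
# Illustrative only: follow ownerReferences Pod -> ReplicaSet -> Deployment.
# The dicts below mimic the shape of Kubernetes API object metadata.

def owner_of(obj, kind):
    """Return the name of obj's owner of the given kind, or None."""
    for ref in obj.get("metadata", {}).get("ownerReferences", []):
        if ref.get("kind") == kind:
            return ref.get("name")
    return None

pod = {"metadata": {"ownerReferences": [
    {"kind": "ReplicaSet", "name": "web-55795d7b56"}]}}
replicasets = {"web-55795d7b56": {"metadata": {"ownerReferences": [
    {"kind": "Deployment", "name": "web"}]}}}

rs_name = owner_of(pod, "ReplicaSet")                       # "web-55795d7b56"
deployment = owner_of(replicasets[rs_name], "Deployment")   # "web"
```

The second hop is the one that requires watching ReplicaSets (hence the RBAC rule for the apps API group in the ClusterRole below).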

Usage scenario: We use Elastic Cloud on Kubernetes + Filebeat to collect logs from Kubernetes Pods. Our Kubernetes cluster is an AWS EKS cluster.

Versions: Elasticsearch 8.11.3, Filebeat 8.12.0

Operating System: We use the docker.elastic.co/beats/filebeat:8.12.0 image.

Filebeat config:

filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/*.log
    processors:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
      - drop_event:
          when.or:
            - equals.kubernetes.namespace: "ingress-nginx"
            - equals.kubernetes.namespace: "monitoring"
            - equals.kubernetes.deployment.name: "cluster-autoscaler"
            - equals.kubernetes.deployment.name: "vault-secrets"
      - drop_fields:
          fields: [
            "container.id",
            "container.runtime",
            "input.type",
            "kubernetes.container.image",
            "kubernetes.labels.pod-template-hash",
            "kubernetes.namespace_labels.kubernetes_io/metadata_name",
            "kubernetes.namespace_uid",
            "kubernetes.node.hostname",
            "kubernetes.node.labels.arch",
            "kubernetes.node.labels.beta_kubernetes_io/arch",
            "kubernetes.node.labels.beta_kubernetes_io/instance-type",
            "kubernetes.node.labels.beta_kubernetes_io/os",
            "kubernetes.node.labels.eks_amazonaws_com/capacityType",
            "kubernetes.node.labels.eks_amazonaws_com/nodegroup",
            "kubernetes.node.labels.eks_amazonaws_com/nodegroup-image",
            "kubernetes.node.labels.eks_amazonaws_com/sourceLaunchTemplateId",
            "kubernetes.node.labels.eks_amazonaws_com/sourceLaunchTemplateVersion",
            "kubernetes.node.labels.failure-domain_beta_kubernetes_io/region",
            "kubernetes.node.labels.failure-domain_beta_kubernetes_io/zone",
            "kubernetes.node.labels.instanceBillingType",
            "kubernetes.node.labels.k8s_io/cloud-provider-aws",
            "kubernetes.node.labels.kubernetes_io/arch",
            "kubernetes.node.labels.kubernetes_io/hostname",
            "kubernetes.node.labels.kubernetes_io/os",
            "kubernetes.node.labels.node_kubernetes_io/instance-type",
            "kubernetes.node.labels.topology_ebs_csi_aws_com/zone",
            "kubernetes.node.labels.topology_kubernetes_io/zone",
            "kubernetes.node.labels.usageType",
            "kubernetes.node.name",
            "kubernetes.node.uid",
            "kubernetes.pod.ip",
            "kubernetes.pod.uid",
            "kubernetes.replicaset.name",
            "log.file.path",
            "log.offset",
          ]
          ignore_missing: true

processors:
  - add_fields:
      target: ''
      fields:
        env: 'dev'

  - decode_json_fields:
      fields: [ "message"]
      process_array: false
      max_depth: 1
      target: "logs"
      overwrite_keys: false
      add_error_key: false

output.elasticsearch:
  hosts: [ '${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}' ]
  username: ${ELASTICSEARCH_USERNAME}
  password: ${ELASTICSEARCH_PASSWORD}
  protocol: https

setup:
  kibana:
    host: '${KIBANA_HOST:elasticsearch}:${KIBANA_PORT:9200}'
    username: ${ELASTICSEARCH_USERNAME}
    password: ${ELASTICSEARCH_PASSWORD}

  template.settings:
    index.number_of_replicas: 0

Filebeat cluster role:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
  - verbs:
      - get
      - watch
      - list
    apiGroups:
      - ''
    resources:
      - namespaces
      - pods
      - nodes
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - apps
    resources:
      - replicasets

Example stdout output with Filebeat 8.12.0 (I replaced any sensitive information with a placeholder):

{
  "@timestamp": "2024-01-22T10:26:44.429Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.12.0"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "env": "dev",
  "logs": {
    "!BADKEY": 0,
    "<SENSITIVE_INFORMATION>": 601883,
    "time": "2024-01-22T10:26:44.429523284Z",
    "level": "DEBUG",
    "msg": "<SENSITIVE_INFORMATION>",
    "version": "<SENSITIVE_INFORMATION>",
    "revision": "<SENSITIVE_INFORMATION>"
  },
  "stream": "stdout",
  "message": "<SENSITIVE_INFORMATION>",
  "input": {},
  "container": {
    "image": {
      "name": "<SENSITIVE_INFORMATION>"
    }
  },
  "log": {
    "file": {}
  },
  "kubernetes": {
    "container": {
      "name": "<SENSITIVE_INFORMATION>"
    },
    "node": {
      "labels": {
        "topology_kubernetes_io/region": "eu-west-1"
      }
    },
    "pod": {
      "name": "<SENSITIVE_INFORMATION>"
    },
    "namespace": "<SENSITIVE_INFORMATION>",
    "namespace_labels": {},
    "replicaset": {},
    "labels": {
      "io_kompose_service": "<SENSITIVE_INFORMATION>"
    }
  },
  "host": {
    "name": "<SENSITIVE_INFORMATION>"
  },
  "agent": {
    "ephemeral_id": "<SENSITIVE_INFORMATION>",
    "id": "<SENSITIVE_INFORMATION>",
    "name": "<SENSITIVE_INFORMATION>",
    "type": "filebeat",
    "version": "8.12.0"
  }
}

Example stdout output with Filebeat 7.17.16 (I replaced any sensitive information with a placeholder):

{
  "@timestamp": "2024-01-22T10:24:02.676Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.16"
  },
  "stream": "stdout",
  "message": "<SENSITIVE_INFORMATION>",
  "host": {
    "name": "<SENSITIVE_INFORMATION>"
  },
  "logs": {
    "time": "2024-01-22T10:24:02.676572399Z",
    "level": "DEBUG",
    "msg": "<SENSITIVE_INFORMATION>",
    "version": "<SENSITIVE_INFORMATION>",
    "revision": "<SENSITIVE_INFORMATION>",
    "!BADKEY": 0,
    "<SENSITIVE_INFORMATION>": 601880
  },
  "env": "dev",
  "log": {
    "file": {}
  },
  "input": {},
  "container": {
    "image": {
      "name": "<SENSITIVE_INFORMATION>"
    }
  },
  "kubernetes": {
    "replicaset": {},
    "labels": {
      "io_kompose_service": "<SENSITIVE_INFORMATION>"
    },
    "node": {
      "labels": {
        "topology_kubernetes_io/region": "eu-west-1"
      }
    },
    "namespace": "<SENSITIVE_INFORMATION>",
    "namespace_labels": {},
    "container": {
      "name": "<SENSITIVE_INFORMATION>"
    },
    "deployment": {
      "name": "<SENSITIVE_INFORMATION>"
    },
    "pod": {
      "name": "<SENSITIVE_INFORMATION>"
    }
  },
  "agent": {
    "version": "7.17.16",
    "hostname": "<SENSITIVE_INFORMATION>",
    "ephemeral_id": "<SENSITIVE_INFORMATION>",
    "id": "<SENSITIVE_INFORMATION>",
    "name": "<SENSITIVE_INFORMATION>",
    "type": "filebeat"
  },
  "ecs": {
    "version": "1.12.0"
  }
}

We can observe that with Filebeat 7.17.16 the kubernetes.deployment.name field is present. I added kubernetes.replicaset.name to my drop_fields configuration, but kubernetes.replicaset.name is empty even without that drop_fields entry.

I hope that I added all the information required to check this issue. Any feedback is appreciated.

Thank you for your attention!

botelastic[bot] commented 7 months ago

This issue doesn't have a Team:<team> label.

tetianakravchenko commented 7 months ago

Hey @RomanIzvozchikov

Can you please try adding the deployment: true configuration to the add_kubernetes_metadata processor and check if it fixes the issue?

RomanIzvozchikov commented 7 months ago

Hi @tetianakravchenko,

Thank you for your reply! I have already tried setting deployment: true. It did not help, because, as I read in the documentation, deployment: true is the default value.

VannTen commented 6 months ago

This also happens on Filebeat 8.11.4, with the following config:

logging.level: debug
logging.selectors:
- processors
- publish
filebeat.autodiscover:
  providers:
  - type: kubernetes
    hints:
      enabled: true
      default_config:
        type: filestream
        id: kubernetes-container-logs-${data.kubernetes.pod.name}-${data.kubernetes.container.id}
        take_over: true
        enabled: false # This is for opt-in log collection with co.elastic.logs/enabled=true
        paths:
        - /var/log/containers/*-${data.kubernetes.container.id}.log  # CRI path
        parsers:
        - container: 
            stream: all
            format: auto
        prospector:
          scanner:
            symlinks: true
    add_ressource_metadata: # Considers namespace annotations for hints
      deployment: true
      namespace:
        include_annotations:
          - "nsannotations1"
processors:
- rename:
    ignore_missing: true
    fail_on_error: false
    fields:
    - from: kubernetes.labels.<REDACTED>
      to: <REDACTED>
    - from: kubernetes.labels.<REDACTED>
      to: <REDACTED>
    - from: kubernetes.labels.<REDACTED>
      to: <REDACTED>
    - from: kubernetes.namespace_labels.<REDACTED>
      to: <REDACTED>
    - from: kubernetes.namespace_labels.<REDACTED>
      to: <REDACTED>
    - from: kubernetes.namespace_labels.<REDACTED>
      to: <REDACTED>
- copy_fields:
    fields:
    - from: <REDACTED>
      to: <REDACTED>
    when:
      not:
        has_fields:
        - <REDACTED>
output:
  kafka:
    version: 2
    hosts:
  <REDACTED>

8.10.4 produces the following events (taken from the Filebeat log with debug + processors):

{
  "@timestamp": "2024-02-29T15:29:38.379Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.10.4"
  },
  "input": {
    "type": "filestream"
  },
  "agent": {
    "ephemeral_id": "6be1298f-361d-43a0-9931-aa8e3e5647cb",
    "id": "9306e44f-d4ab-45cf-8022-81a821ab841a",
    "name": "<CLUSTERNAME>-filebeat-filebeat-4qm8f",
    "type": "filebeat",
    "version": "8.10.4"
  },
  "host": {
    "name": "tpridak8s-filebeat-filebeat-4qm8f"
  },
  ... (redacted fields, added by processors from k8s labels)
  "log": {
    "file": {
      "path": "/var/log/containers/test-filebeat-55795d7b56-xdj5x_mgautier_alpine-7f5b519751e4a9bf1dc626bdbbb9c432cc52fb5e6279dd071b9ce1cde6d43745.log",
      "device_id": 64771,
      "inode": 2120385
    },
    "offset": 1505474
  },
  "stream": "stdout",
  "container": {
    "id": "7f5b519751e4a9bf1dc626bdbbb9c432cc52fb5e6279dd071b9ce1cde6d43745",
    "runtime": "containerd",
    "image": {
      "name": "alpine"
    }
  },
  "ecs": {
    "version": "8.0.0"
  },
  "message": "Thu Feb 29 15:29:38 UTC 2024",
  "kubernetes": {
    "pod": {
      "name": "test-filebeat-55795d7b56-xdj5x",
      "uid": "a6c6871a-fb63-48df-abe6-568588f6b050",
      "ip": "10.249.145.220"
    },
    "container": {
      "name": "alpine"
    },
    "namespace_labels": {
      "kubernetes_io/metadata_name": "mgautier"
    },
    "deployment": {
      "name": "test-filebeat"
    },
    "labels": {
      "app": "test-filebeat",
      "pod-template-hash": "55795d7b56"
    },
    "replicaset": {
      "name": "test-filebeat-55795d7b56"
    },
    "node": {
      "hostname": "<REDACTED>",
      "name": "<REDACTED>",
      "uid": "43e8b4fa-466d-459b-a170-7ab7e81ac768",
      "labels": {
        "kubernetes_io/hostname": "<REDACTED>",
        "kubernetes_io/os": "linux",
        "topology_kubernetes_io/zone": "pair",
        "beta_kubernetes_io/arch": "amd64",
        "beta_kubernetes_io/os": "linux",
        "kubernetes_io/arch": "amd64"
      }
    },
    "namespace_uid": "a6cb1cce-47b8-4bef-a18b-5461f815f626",
    "namespace": "mgautier"
  }
}

And 8.11.4:

{
  "@timestamp": "2024-02-29T15:19:08.354Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.11.4"
  },
  "stream": "stdout",
  "message": "Thu Feb 29 15:19:08 UTC 2024",
  "input": {
    "type": "filestream"
  },
  "agent": {
    "ephemeral_id": "1e7c2f0e-8398-4521-a78d-5d5bc9204379",
    "id": "9306e44f-d4ab-45cf-8022-81a821ab841a",
    "name": "<CLUSTERNAME>-filebeat-filebeat-5r6td",
    "type": "filebeat",
    "version": "8.11.4"
  },
  "log": {
    "offset": 1500439,
    "file": {
      "device_id": "64771",
      "inode": "2120385",
      "path": "/var/log/containers/test-filebeat-55795d7b56-xdj5x_mgautier_alpine-7f5b519751e4a9bf1dc626bdbbb9c432cc52fb5e6279dd071b9ce1cde6d43745.log"
    }
  },
  "kubernetes": {
    "node": {
      "labels": {
        "beta_kubernetes_io/os": "linux",
        "kubernetes_io/arch": "amd64",
        "kubernetes_io/hostname": "<REDACTED>",
        "kubernetes_io/os": "linux",
        "topology_kubernetes_io/zone": "pair",
        "beta_kubernetes_io/arch": "amd64"
      },
      "hostname": "<REDACTED>",
      "name": "<REDACTED>",
      "uid": "43e8b4fa-466d-459b-a170-7ab7e81ac768"
    },
    "pod": {
      "uid": "a6c6871a-fb63-48df-abe6-568588f6b050",
      "ip": "10.249.145.220",
      "name": "test-filebeat-55795d7b56-xdj5x"
    },
    "namespace": "mgautier",
    "namespace_uid": "a6cb1cce-47b8-4bef-a18b-5461f815f626",
    "namespace_labels": {
      "kubernetes_io/metadata_name": "mgautier"
    },
    "replicaset": {
      "name": "test-filebeat-55795d7b56"
    },
    "labels": {
      "pod-template-hash": "55795d7b56",
      "app": "test-filebeat"
    },
    "container": {
      "name": "alpine"
    }
  },
  "container": {
    "id": "7f5b519751e4a9bf1dc626bdbbb9c432cc52fb5e6279dd071b9ce1cde6d43745",
    "runtime": "containerd",
    "image": {
      "name": "alpine"
    }
  },
  "host": {
    "name": "<CLUSTERNAME>-filebeat-filebeat-5r6td"
  },
  "ecs": {
    "version": "8.0.0"
  },
  // processed fields from labels (REDACTED)

}
RomanIzvozchikov commented 6 months ago

I had the same problem with Filebeat 8.11.4 before switching to 8.12.0, so this problem probably exists in both versions.

thechristschn commented 4 months ago

Deployment metadata enrichment is disabled by default since 8.11.0, see https://www.elastic.co/guide/en/beats/libbeat/current/release-notes-8.11.0.html

It was disabled because it might cause a memory leak.

But you can achieve the same functionality by using an ingest pipeline, as described here: https://github.com/elastic/integrations/issues/8176

Using a processor should also work (make sure to drop the replicaset name afterwards):

      - copy_fields:
          fields:
          - from: "kubernetes.replicaset.name"
            to: "kubernetes.deployment.name"
        when.has_fields: ['kubernetes.replicaset.name']
      - replace:
          fields:
          - field: "kubernetes.deployment.name"
            pattern: "(-[a-z0-9]+)$"
            replacement: ""
        when.has_fields: ['kubernetes.deployment.name']
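The replace processor above relies on the pod-template-hash naming convention: a ReplicaSet created by a Deployment is named `<deployment>-<hash>`. The same transformation can be sketched in plain Python to see what the pattern does (the function name is hypothetical, chosen for illustration):

```python
import re

# Same pattern as the `replace` processor: strip the trailing
# "-<pod-template-hash>" segment from a ReplicaSet name.
PATTERN = re.compile(r"(-[a-z0-9]+)$")

def deployment_from_replicaset(replicaset_name):
    """Derive a Deployment name by removing the last '-<hash>' segment."""
    return PATTERN.sub("", replicaset_name)

print(deployment_from_replicaset("test-filebeat-55795d7b56"))  # test-filebeat
```

Note this is purely name-based: any name ending in a `-<lowercase alphanumeric>` segment gets its last segment stripped, whether or not the ReplicaSet is actually owned by a Deployment, so it is a heuristic rather than a lookup of the real owner reference.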
RomanIzvozchikov commented 4 months ago

Thanks a lot @thechristschn for this useful information! I didn't find this information in the documentation while investigating the issue.

kipusoep commented 1 month ago

@thechristschn I used your processor idea (but had to slightly modify it to be valid - indentation of when.has_fields):

      - copy_fields:
          fields:
          - from: "kubernetes.replicaset.name"
            to: "kubernetes.deployment.name"
          when.has_fields: ['kubernetes.replicaset.name']
      - replace:
          fields:
          - field: "kubernetes.deployment.name"
            pattern: "(-[a-z0-9]+)$"
            replacement: ""
          when.has_fields: ['kubernetes.deployment.name']

But what do you mean with "make sure to drop the replicaset name afterwards"?