elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats
Other
109 stars 4.93k forks source link

[8.x](backport #41585) filebeat: input v2 compat uses random ID for CheckConfig #41641

Closed mergify[bot] closed 1 week ago

mergify[bot] commented 1 week ago

Proposed commit message

filebeat: input v2 compat uses random ID for CheckConfig

The CheckConfig function validates a configuration by creating and immediately discarding an input. However, a potential conflict arises when CheckConfig is used with autodiscover in Kubernetes.

Autodiscover accumulates configuration changes and applies them in batches. This can be problematic if a stop event for a pod is closely followed by a start event for the same pod (e.g., during a pod restart) before the inputs are reloaded. In this scenario, autodiscover might attempt to validate the configuration for the start event while the input for the pod is already running. This would lead to filestream input manager to see two inputs with the same ID, triggering a log warning.

Although this situation generates a warning, it doesn't result in data duplication. As the second input is only created to validate the configuration and later discarded. Also the reload process will ensure only new inputs are created, any input already running won't be duplicated.

Checklist

Disruptive User Impact

How to test this PR locally

Run a kind cluster:

k8s.yaml ```yaml apiVersion: v1 kind: ServiceAccount metadata: name: filebeat namespace: kube-system labels: k8s-app: filebeat --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: filebeat labels: k8s-app: filebeat rules: - apiGroups: [""] # "" indicates the core API group resources: - namespaces - pods - nodes verbs: - get - watch - list - apiGroups: ["apps"] resources: - replicasets verbs: ["get", "list", "watch"] - apiGroups: ["batch"] resources: - jobs verbs: ["get", "list", "watch"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: filebeat # should be the namespace where filebeat is running namespace: kube-system labels: k8s-app: filebeat rules: - apiGroups: - coordination.k8s.io resources: - leases verbs: ["get", "create", "update"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: filebeat-kubeadm-config namespace: kube-system labels: k8s-app: filebeat rules: - apiGroups: [""] resources: - configmaps resourceNames: - kubeadm-config verbs: ["get"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: filebeat subjects: - kind: ServiceAccount name: filebeat namespace: kube-system roleRef: kind: ClusterRole name: filebeat apiGroup: rbac.authorization.k8s.io --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: filebeat namespace: kube-system subjects: - kind: ServiceAccount name: filebeat namespace: kube-system roleRef: kind: Role name: filebeat apiGroup: rbac.authorization.k8s.io --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: filebeat-kubeadm-config namespace: kube-system subjects: - kind: ServiceAccount name: filebeat namespace: kube-system roleRef: kind: Role name: filebeat-kubeadm-config apiGroup: rbac.authorization.k8s.io --- apiVersion: v1 kind: ConfigMap metadata: name: filebeat-config namespace: kube-system labels: k8s-app: filebeat data: filebeat.yml: |- filebeat.autodiscover: providers: - type: kubernetes node: ${NODE_NAME} hints.enabled: true hints.default_config: type: filestream prospector.scanner.symlinks: true id: filestream-kubernetes-pod-${data.kubernetes.container.id} take_over: true paths: - /var/log/containers/*${data.kubernetes.container.id}.log parsers: - container: ~ processors: - add_cloud_metadata: - add_host_metadata: logging: level: debug output.elasticsearch: hosts: ["https://localhost:9200"] # you can let it failing, no need for a real ES protocol: "https" allow_older_versions: true # username: "elastic" # password: "changeme" --- apiVersion: apps/v1 kind: DaemonSet metadata: name: filebeat namespace: kube-system labels: k8s-app: filebeat spec: selector: matchLabels: k8s-app: filebeat template: metadata: labels: k8s-app: filebeat spec: serviceAccountName: filebeat terminationGracePeriodSeconds: 30 hostNetwork: true dnsPolicy: ClusterFirstWithHostNet containers: - name: filebeat image: docker.elastic.co/beats/filebeat:9.0.0-SNAPSHOT args: [ "-c", "/etc/filebeat.yml", "-e", ] env: - name: ELASTICSEARCH_HOST value: elasticsearch - name: ELASTICSEARCH_PORT value: "9200" - name: ELASTICSEARCH_USERNAME value: elastic - name: ELASTICSEARCH_PASSWORD value: changeme - name: ELASTIC_CLOUD_ID value: "cliud-id" - name: ELASTIC_CLOUD_AUTH value: "aa:bb" - name: NODE_NAME valueFrom: fieldRef: fieldPath: spec.nodeName securityContext: runAsUser: 0 # If using Red Hat OpenShift uncomment this: #privileged: true resources: limits: memory: 200Mi requests: cpu: 100m memory: 100Mi volumeMounts: - name: config mountPath: /etc/filebeat.yml readOnly: true subPath: filebeat.yml - name: data mountPath: /usr/share/filebeat/data - name: varlibdockercontainers mountPath: /var/lib/docker/containers readOnly: true - name: varlog mountPath: /var/log readOnly: true volumes: - name: config configMap: defaultMode: 0640 name: filebeat-config - name: varlibdockercontainers hostPath: path: /var/lib/docker/containers - name: varlog hostPath: path: /var/log # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart - name: data hostPath: # When filebeat runs as non-root user, this directory needs to be writable by group (g+w). path: /var/lib/filebeat-data type: DirectoryOrCreate ```

{"log.level":"error","@timestamp":"2024-11-08T14:14:53.021Z","log.logger":"input","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create","file.name":"input-logfile/manager.go","file.line":174},"message":"filestream input with ID 'filestream-kubernetes-pod-3f834ee8c5d20f465e3a75433c9bd58a9968e154c32c92d1ef1948797820a64a' already exists, this will lead to data duplication, please use a different ID. Metrics collection has been disabled on this input.","service.name":"filebeat","ecs.version":"1.6.0"}


- repeat the process with the current filebeat release, you'll see the warning

## Related issues

- Closes #31767

## Use cases

filebeat on kubernetes using sutodiscover
<hr>This is an automatic backport of pull request #41585 done by [Mergify](https://mergify.com).
elasticmachine commented 1 week ago

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

mergify[bot] commented 1 week ago

This pull request has not been merged yet. Could you please review and merge it @AndersonQ? 🙏

AndersonQ commented 1 week ago

buildkite test this

AndersonQ commented 1 week ago

/test