Open belimawr opened 5 months ago
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
So the loop we have is something like:

1. Filebeat ingests a log line (in this case, one of its own log lines collected via Kubernetes autodiscover).
2. Because debug logging and the `processors` selector are enabled, Filebeat writes the full ingested event to its own log.
3. Filebeat collects that new, larger log line again, and the cycle repeats.

At step 2, is there some obvious marker we can add to this log message so that in step 3 Filebeat can drop this log line, or at least not recursively log it? Whatever marker we choose has to keep working if the Filebeat pod is restarted, that is, it cannot be ephemeral.
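For reference, a minimal sketch of the logging settings that put Filebeat into this state; `logging.level` and `logging.selectors` are standard Filebeat options, and the selector name comes from the issue description below:

```yaml
# filebeat.yml (excerpt): the combination that triggers the loop
logging.level: debug
logging.selectors: ["processors"]  # with debug enabled, every ingested event is written to Filebeat's own log
```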
We could match the `Publish event:` string in the logs as part of a drop_event processor configuration. Ideally that should be restricted to Filebeat's pods, so it would be good to match on the pod or DaemonSet name, or something like that. I believe the ideal scenario is to use the autodiscover dynamic templates to get Filebeat's pod ID and filter the logs out based on it. I'm just not sure how feasible that is (I haven't looked into it either).
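A minimal sketch of such a processor, assuming the DaemonSet pods are named `filebeat-*` (the pod-name pattern is an assumption, not a tested configuration):

```yaml
processors:
  - drop_event:
      when:
        and:
          - contains:
              message: "Publish event:"
          - regexp:
              # Assumed pod naming from a typical Filebeat DaemonSet; adjust to the real deployment.
              kubernetes.pod.name: "^filebeat-.*"
```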
However, we cannot guarantee users won't change it.
When Filebeat reads its own logs while collecting Kubernetes logs, it reads them as plain text, so no fields from the JSON log entries are available for filtering.
Isn't this also true for Elastic Agent collecting its own logs via Collect Logs & Metrics?
> Isn't this also true for Elastic Agent collecting its own logs via Collect Logs & Metrics?
No, for Elastic Agent we deploy a separate Filebeat (usually called `filestream-monitoring`) that collects the Elastic Agent logs and the logs of all other components (they all go to a single log file), but drops its own logs, so we don't have this problem.
When using Kubernetes autodiscover, the example configuration from our docs will make Filebeat collect its own logs. This is fine most of the time.
However, if Filebeat is logging at debug level and the `processors` selector is enabled, Filebeat will log every event it ingests, causing a loop of ever-growing log entries.

We should at least add a warning in our documentation about it and provide an example of how to prevent Filebeat from collecting its own logs.
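As a starting point for such a docs example, here is a hedged sketch of an autodiscover configuration that skips Filebeat's own pods. It assumes the pods carry the `k8s-app: filebeat` label from the reference DaemonSet manifest; the label, input settings, and paths would need to match the actual example in the docs:

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      templates:
        # Only create an input for pods that are NOT Filebeat itself.
        - condition:
            not:
              equals:
                kubernetes.labels.k8s-app: "filebeat"
          config:
            - type: filestream
              id: kubernetes-logs-${data.kubernetes.pod.name}-${data.kubernetes.container.id}
              paths:
                - /var/log/containers/*-${data.kubernetes.container.id}.log
              parsers:
                - container: ~
```

Alternatively, the drop_event approach discussed above could be documented as a lighter-weight option when only the debug event dumps need to be suppressed.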