elastic / integrations

Elastic Integrations
https://www.elastic.co/integrations

[elasticsearch] use `add_host_metadata` processor for logs #9363

Open klacabane opened 8 months ago

klacabane commented 8 months ago

Summary

Elasticsearch log streams use the Elasticsearch node identifier to populate the host.id field (audit logs example). This makes host.id inconsistent across the elasticsearch data streams (the metrics streams use the underlying machine identifier), and it can also cause false positives in the security Agent Spoofing detection rules, because that detection relies on the host.id field being stable.

We should update this logic and align host.id across all elasticsearch data streams. The straightforward solution is to add the add_host_metadata processor to the corresponding log.yml.hbs files.
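
As a rough illustration of that proposal, the fragment below shows how the processor could be declared in a stream template. It is only a sketch: the rest of the log.yml.hbs template (paths, parsers, and so on) is omitted, and the exact placement within the real files is an assumption.

```yaml
# Hypothetical fragment of a log.yml.hbs stream template.
# Only the processors section is shown; everything else is omitted.
processors:
  # Filebeat processor that enriches each event with host.* fields
  # (including host.id) taken from the machine running the agent.
  - add_host_metadata: ~
```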

Note that a similar change should be made to the Filebeat module's config (audit example).

### Tasks
- [ ] Update elasticsearch package logs configuration
- [ ] Update elasticsearch filebeat module logs configuration

rdrgporto commented 1 month ago

Hi,

We are eager to see this issue addressed and would like to know if there are any plans to implement this feature.

Regards

andrewkroh commented 1 month ago

> The straightforward solution is to add the add_host_metadata processor in corresponding log.yml.hbs files.

Under Elastic Agent, Filebeat always includes add_host_metadata in its configuration, so the processor is already running. The only ways to disable it are to set tags: [forwarded] or to populate any host (source ref) field on the Agent side, and I don't see either of these happening in the config. I do see the ingest pipeline overriding host.id and host.name, so I think the ingest pipeline needs to be changed to retain the original values from the add_host_metadata processor.
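
One way this could look, as a sketch only: the snippet assumes the pipeline currently sets host.id from a field such as elasticsearch.node.id (the actual source field and processor in the package may differ). Setting `override: false` on the ingest `set` processor leaves an existing non-null host.id untouched, so the value written by add_host_metadata would survive.

```yaml
# Hypothetical ingest pipeline fragment; the source field name is an assumption.
processors:
  - set:
      field: host.id
      value: '{{{elasticsearch.node.id}}}'
      # Do not overwrite a host.id that already arrived with the event,
      # e.g. the one populated by add_host_metadata on the Agent side.
      override: false
      # Skip the processor quietly if the template renders to an empty value.
      ignore_empty_value: true
```

With this shape, the node identifier would only act as a fallback for events that reach the pipeline without a host.id. Removing the override from the pipeline entirely would be the other option, and the same consideration applies to host.name.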