logstash-plugins / logstash-output-elasticsearch

https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
Apache License 2.0

Use integration metadata to create ES actions #1155

Closed andsel closed 11 months ago

andsel commented 11 months ago

Release notes

Use integration metadata to interact with Elasticsearch.

What does this PR do?

Changes how the actions passed to Elasticsearch are created, so that they also use the metadata fields set by an integration. The relevant fields are id (document_id), index and pipeline; their values are taken verbatim, without placeholder resolution. The index, document_id and pipeline configured in the plugin settings take precedence over the integration ones, because they reflect an explicit choice made by the user.
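The precedence rule described above can be sketched roughly as follows. This is a minimal illustration in plain Ruby, not the plugin's actual code: the hashes `settings`, `event` and `integration_metadata`, and the helpers `resolve_placeholders`, `resolve_param` and `build_action_params`, are all hypothetical names chosen for the example, and the `%{field}` handling is a simplified stand-in for Logstash's placeholder resolution.

```ruby
# Hypothetical sketch: plain Hashes stand in for the plugin settings, the event
# and the metadata set by an integration. None of these names come from the plugin.

# Simplified stand-in for placeholder resolution: replaces %{field} with the
# event's value for that field (empty string when the field is missing).
def resolve_placeholders(template, event)
  template.gsub(/%\{([^}]+)\}/) { event.fetch(Regexp.last_match(1), "").to_s }
end

# Returns the value to use for one action parameter (:id, :index or :pipeline).
def resolve_param(name, settings, event, integration_metadata)
  configured = settings[name]
  # An explicit plugin setting wins; only here are placeholders resolved.
  return resolve_placeholders(configured, event) unless configured.nil?

  # Otherwise fall back to the integration's value, used verbatim (may be nil).
  integration_metadata[name]
end

# Collects the parameters that have a value from either source.
def build_action_params(settings, event, integration_metadata)
  [:id, :index, :pipeline].each_with_object({}) do |name, params|
    value = resolve_param(name, settings, event, integration_metadata)
    params[name] = value unless value.nil?
  end
end

settings = { index: "logs-%{dataset}" } # only index is configured by the user
event = { "dataset" => "defender" }
integration_metadata = { id: "abc-%{not_resolved}", index: "from-integration", pipeline: "p1" }

params = build_action_params(settings, event, integration_metadata)
puts params.inspect
```

In this sketch the configured `index` overrides the integration's value and has its placeholder resolved, while `id` and `pipeline` come from the integration unchanged, including any `%{...}` text they happen to contain.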

Why is it important/What is the impact to the user?

This PR fixes an interoperability issue with Agent's integrations, where some metadata set by an integration has to be passed down to Elasticsearch.

How to test this PR locally

filter {
  elastic_integration {
    cloud_id => ""
    cloud_auth => "elastic:"
    geoip_database_directory => "//vendor/bundle/jruby/3.1.0/gems/logstash-filter-geoip-7.2.13-java/vendor/GeoLite2-City.mmdb"
  }
}

output {
  stdout {
    codec => rubydebug { metadata => true }
  }

  elasticsearch {
    cloud_id => ""
    api_key => ""
    data_stream => true
    ssl => true
  }
}

- Now use a sample data event (like [this](https://docs.elastic.co/integrations/m365_defender#incident)) and create a one-line JSON file (named `/tmp/defender_singleline.json`). To squash all lines into one, use:
```sh
# Note: this removes all whitespace, including spaces inside string values.
cat <file_in>.json | awk '{for(i=1;i<=NF;i++) printf "%s",$i}' > <file_out>.json
```

or use the file defender_singleline.json

This means that, despite the 2 distinct runs, the Defender integration, which generates a unique id from the Incident fields, was correctly executed and its id was used. This can be proven by executing the same flow as above with the shipped ES output plugin and verifying that the document ends up duplicated, because in that case no unique document_id generated by the integration is applied.

andsel commented 11 months ago

Hi @yaauie, thanks a lot for your review. I've integrated your suggestion and the PR is ready for a second round of review 🙏