matanolabs / matano

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
https://matano.dev
Apache License 2.0

🔥 Realtime Data Enrichment - add get_enrichment_table_record fn to VRL log transform pipeline #111

Closed shaeqahmed closed 1 year ago

shaeqahmed commented 1 year ago

Summary

This pull request adds support for real-time data enrichment in Matano at ingest time, addressing #99 and #21. A new get_enrichment_table_record function is now available in the VRL log transform pipeline. It looks up a record in an enrichment table and lets the transform attach that data to each incoming event in real time, before the detection and lake-writing steps.

For many use cases, this means users no longer need to write manual JOINs in their queries or perform manual lookups in their detection rules. It also improves downstream analytics performance, because the data lake and the detection engine receive pre-joined, enriched records.
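To make the idea concrete, here is a minimal Python sketch of ingest-time enrichment. It is not Matano's implementation and does not use VRL: the table contents, the field names (`source_ip`, `enrichment`), and the helpers `lookup_enrichment` and `enrich_event` are all hypothetical, chosen only to illustrate the lookup-then-attach pattern described above.

```python
from typing import Any, Dict, Optional

# Hypothetical in-memory enrichment table, keyed by an indicator value.
# A real enrichment table would be loaded from external data, not hardcoded.
THREAT_INTEL: Dict[str, Dict[str, Any]] = {
    "203.0.113.7": {"threat_type": "c2", "confidence": 90},
}


def lookup_enrichment(
    table: Dict[str, Dict[str, Any]], key: Optional[str]
) -> Optional[Dict[str, Any]]:
    """Return the enrichment row for `key`, or None when there is no match."""
    if key is None:
        return None
    return table.get(key)


def enrich_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `event` with any matching enrichment row attached.

    This runs once per event at ingest, so downstream queries and detections
    see the joined data without doing their own lookup.
    """
    enriched = dict(event)
    match = lookup_enrichment(THREAT_INTEL, event.get("source_ip"))
    if match is not None:
        enriched["enrichment"] = dict(match)
    return enriched
```

With this shape, a detection rule could read `event["enrichment"]["threat_type"]` directly instead of joining against the threat intelligence table at query time. Events with no match pass through unchanged.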

[Image: Screenshot 2023-03-07 at 11:42:46 PM]

Up next

The next step will be to extend support to GeoIP enrichment tables (MaxMind), which will require special handling logic.