lucian-ioan / public-notes

Apache License 2.0

[Azure] Presentation skeleton for sanitization #3

Open lucian-ioan opened 1 year ago

lucian-ioan commented 1 year ago

Overview of issue

Some logs from Azure are malformed: some have issues with newlines, others with single quotes.

Beats needed an internal method to fix this. Otherwise, the issue could cause data processing to fail (e.g., the pipeline receiving one document that contains two records).
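To make the "one document with two records" failure concrete, here is a minimal sketch. It assumes a `{"records": [...]}` envelope and a fallback that forwards the raw payload when parsing fails. Both the `envelope` type and the `splitRecords` helper are hypothetical names for illustration, not the actual beats code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope is a hypothetical wrapper: a message carrying a list of records.
type envelope struct {
	Records []json.RawMessage `json:"records"`
}

// splitRecords returns one document per record. If the payload cannot be
// parsed (or holds no records), it falls back to the whole payload as a
// single document, which is how two records can end up in one document.
func splitRecords(payload []byte) []string {
	var env envelope
	if err := json.Unmarshal(payload, &env); err != nil || len(env.Records) == 0 {
		return []string{string(payload)}
	}
	docs := make([]string, 0, len(env.Records))
	for _, r := range env.Records {
		docs = append(docs, string(r))
	}
	return docs
}

func main() {
	valid := []byte(`{"records":[{"id":1},{"id":2}]}`)
	malformed := []byte(`{"records":[{'id':1},{'id':2}]}`) // single quotes

	fmt.Println("valid payload documents:    ", len(splitRecords(valid)))
	fmt.Println("malformed payload documents:", len(splitRecords(malformed)))
}
```

With the valid payload each record becomes its own document; with the single-quoted payload the parse fails and both records travel together as one document.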

Malformed logs journey in the Elastic stack

1) Examples of malformed JSONs

2) Live demo of what a malformed log looks like in Kibana

3) Deep dive into the point in the beats code where malformed logs from azureeventhub cause a failure.

Challenge: Spin up the stack using a custom version of beats

  1. Why was a custom beats build needed to properly test sanitization E2E?

  2. Set up the custom agent live with filebeat modifications: https://github.com/zmoog/public-notes/issues/35

  3. Challenges and limitations encountered while setting it up.

Code dive: Sanitization implementation in beats using Go

  1. Code walkthrough and explanation of the logic used

  2. Tradeoff between complexity and covering as many malformation cases as possible, illustrated with different approaches: https://go.dev/play/p/wCNCM7-QM9A

  3. Elastic agent sample configuration for integrations (explain how everything works together).

E2E testing

  1. Brief overview of the eh library by Maurizio: https://github.com/zmoog/eventhubs

  2. Full E2E testing of sanitization using different types of malformed logs sent via eh and an Azure integration

Sanitization UI in Kibana

  1. Discuss implementation and code walkthrough

  2. Live test using an Azure integration and eh