Open zmoog opened 2 weeks ago
With one input + routing, we can reduce the user errors metric to zero and make the fewest storage account API calls possible.
Here's the diagram to leverage routing:

The input sends all events to the `logs-azure.eventhub-default` data stream. The `logs-azure.eventhub-default` data stream contains a `logs-azure.eventhub@custom` custom pipeline with rules to route log events based on the log category. If the routing rules cover all incoming log categories, the `logs-azure.eventhub-default` data stream will be empty. However, we can set up an alert rule to trigger a notification whenever a log event doesn't match any routing rule, so that we can iterate and update the `logs-azure.eventhub@custom` custom pipeline.
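For example, a simple way to check for unrouted events (assuming the default namespace) is to count the documents remaining in the default data stream; an alert rule built on this query could send the notification:

```
GET logs-azure.eventhub-default/_count
```

If the count is greater than zero, some incoming log category is missing a routing rule.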
The routing option is probably the most efficient method.
Here's the source code of the logs-azure.eventhub@custom
pipeline I am testing:
```
PUT _ingest/pipeline/logs-azure.eventhub@custom
{
  "processors": [
    {
      "json": {
        "field": "message",
        "target_field": "tmp_json"
      }
    },
    {
      "set": {
        "field": "routing_category",
        "copy_from": "tmp_json.category",
        "ignore_empty_value": true
      }
    },
    {
      "remove": {
        "field": "tmp_json",
        "ignore_missing": true
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.signinlogs"
        ],
        "if": "ctx.routing_category == \"SignInLogs\" || ctx.routing_category == \"NonInteractiveUserSignInLogs\" || ctx.routing_category == \"ServicePrincipalSignInLogs\" || ctx.routing_category == \"ManagedIdentitySignInLogs\""
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.identity_protection"
        ],
        "if": "ctx.routing_category == \"RiskyUsers\" || ctx.routing_category == \"UserRiskEvents\""
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.provisioning"
        ],
        "if": "ctx.routing_category == \"ProvisioningLogs\""
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.auditlogs"
        ],
        "if": "ctx.routing_category == \"AuditLogs\""
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.activitylogs"
        ],
        "if": "ctx.routing_category == \"Administrative\" || ctx.routing_category == \"Security\" || ctx.routing_category == \"ServiceHealth\" || ctx.routing_category == \"Alert\" || ctx.routing_category == \"Recommendation\" || ctx.routing_category == \"Policy\" || ctx.routing_category == \"Autoscale\" || ctx.routing_category == \"ResourceHealth\""
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.graphactivitylogs"
        ],
        "if": "ctx.routing_category == \"MicrosoftGraphActivityLogs\""
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.firewall_logs"
        ],
        "if": "ctx.routing_category == \"AzureFirewallApplicationRule\" || ctx.routing_category == \"AzureFirewallNetworkRule\" || ctx.routing_category == \"AzureFirewallDnsProxy\" || ctx.routing_category == \"AZFWApplicationRule\" || ctx.routing_category == \"AZFWNetworkRule\" || ctx.routing_category == \"AZFWNatRule\" || ctx.routing_category == \"AZFWDnsQuery\""
      }
    },
    {
      "reroute": {
        "dataset": [
          "azure.application_gateway"
        ],
        "if": "ctx.routing_category == \"ApplicationGatewayFirewallLog\" || ctx.routing_category == \"ApplicationGatewayAccessLog\""
      }
    }
  ]
}
```
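To sanity-check the category extraction, the `_simulate` API can be used (a sketch; the sample document below is hypothetical, not a real Azure log event):

```
POST _ingest/pipeline/logs-azure.eventhub@custom/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "{\"category\": \"SignInLogs\", \"operationName\": \"Sign-in activity\"}"
      }
    }
  ]
}
```

The response should show `routing_category` set to `SignInLogs`, which matches the condition on the first reroute processor.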
That seems clean to me.
FYI : Your second diagram is showing as missing.
> FYI: Your second diagram is showing as missing.
Ouch, I probably copied and pasted an expiring URL from GitHub. Checking!
It should be fixed now.
How does this model work if you wanted more than 1 agent for redundancy and improved performance?
> How does this model work if you wanted more than 1 agent for redundancy and improved performance?
Good question! I should update the note to add this detail.
Here is a diagram showing how the two inputs work together to achieve improved redundancy and performance.
Users set up diagnostic settings, sending data to an event hub (1). The two (or more) inputs start and each claims an equal share of the partitions. With a four-partition event hub, two inputs usually get two partitions each. Each input processes its messages and sends them to the data stream in Elasticsearch.
The routing (2) happens on Elasticsearch at the data stream level, so it works with one or multiple event hubs.
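For instance, after routing, sign-in events from every input land in the same target data stream (assuming the default namespace). A terms aggregation on `agent.id` can confirm that all agents contribute to it (a sketch):

```
GET logs-azure.signinlogs-default/_search
{
  "size": 0,
  "aggs": {
    "agents": {
      "terms": { "field": "agent.id" }
    }
  }
}
```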
This sounds great. Unfortunately the graphic won't load for me.
I can zoom in here, looks awesome!
> I can zoom in here, looks awesome!
Yeah, GitHub image URLs expire quickly. I usually reload the page and click on the image to see the whole picture. Let me know if you have any difficulty opening it.
Works great now.
Situation
The Azure Logs integration allows multiple log categories to be collected from a single event hub.
At a high level, (1) users define the event hub name and settings, and (2) all the integrations share that same event hub.
Problem
This setup is inefficient, and we plan to change it in future releases.
Solutions
Use only the generic integration, and route logs to the right data stream using the reroute processor.