Azure / azure-monitor-baseline-alerts

Azure Monitor Baseline Alerts
MIT License
148 stars 212 forks source link

[Question/Feedback]: Inconsistency in `Microsoft.Network/azureFirewalls` #46

Closed heoelri closed 6 months ago

heoelri commented 10 months ago

Check for previous/existing GitHub issues

Description

SNAT Port Utilization in https://github.com/Azure/azure-monitor-baseline-alerts/blob/a0757e378f33f72020bd60ecbe8fbd32d9da7806/services/Network/azureFirewalls/alerts.yaml#L59-L60

has on one hand a threshold of

https://github.com/Azure/azure-monitor-baseline-alerts/blob/a0757e378f33f72020bd60ecbe8fbd32d9da7806/services/Network/azureFirewalls/alerts.yaml#L73-L74

and a few lines further down below in https://github.com/Azure/azure-monitor-baseline-alerts/blob/a0757e378f33f72020bd60ecbe8fbd32d9da7806/services/Network/azureFirewalls/alerts.yaml#L108-L109

is has a different threshold and operator

https://github.com/Azure/azure-monitor-baseline-alerts/blob/a0757e378f33f72020bd60ecbe8fbd32d9da7806/services/Network/azureFirewalls/alerts.yaml#L122-L125

I'd like to understand why that's the case. Is this an inconsistency or is there a reason for that?

arjenhuitema commented 10 months ago

Hi @heoelri thanks for raising this issue. I looks like it's is an inconsistency in the documentation. The parameter value from the code is correct (see below).

Threshold: 80

      "threshold": {
        "type": "String",
        "metadata": {
          "displayName": "Threshold",
          "description": "Threshold for the alert"
        },
        "defaultValue": "80" 

Operator: GreaterThan

                          {
                            "name": "SNATPortUtilization",
                            "metricNamespace": "Microsoft.Network/azureFirewalls",
                            "metricName": "SNATPortUtilization",
                            "operator": "GreaterThan",
                            "threshold": "[[parameters('threshold')]",
                            "timeAggregation": "Average",
                            "criterionType": "StaticThresholdCriterion"
                          }

@bzabber can you review the alerts.yaml ?

bzabber commented 10 months ago

@heoelri and @arjenhuitema

This is an interesting issue. If you look on line 63 the visible attribute is set to "true" and on line65 it has a tag value of "alz" which means the AMBA-ALZ team has vetted it and it's part of the solution. The other entry is what was autogenerated by AMBA and is based on what Azure customers that have created this alert tend to set the alert threshold at, in this case 99. You're able to determine this by looking at lines 112 which has the visible attribute to "false" and lines 113 to 115 "auto-generated" and has an "agc-" identifier associated with it which shows how common that alert is amongst other customers.

I'll work with the AMBA team to reconcile this so that we don't have a duplicate appearing but it is interesting to see what other customers set the metric at and what we've determined as a good value too.

arjenhuitema commented 6 months ago

The inconsistency in alerts.yaml with the operator has been fixed in #140.