Azure / azure-monitor-baseline-alerts

Azure Monitor Baseline Alerts
MIT License
135 stars 200 forks source link

[Bug]: Deploy VNetG ExpressRoute CPU Utilization Alert remediation fails #278

Open Greg-Court opened 1 month ago

Greg-Court commented 1 month ago

Check for previous/existing GitHub issues

Description

In the initiative "Deploy Azure Monitor Baseline Alerts for Connectivity", all resources remediate successfully apart from "Deploy VNetG ExpressRoute CPU Utilization Alert", details below:

image

image

Related events:

Deployment creation for policy definition '/providers/Microsoft.Management/managementGroups/es/providers/Microsoft.Authorization/policyDefinitions/Deploy_VnetGw_ExpressRouteCpuUtil_Alert' and assignment '/providers/Microsoft.Management/managementGroups/es-connectivity/providers/Microsoft.Authorization/policyAssignments/Deploy-AMBA-Connectivity' and deployment '/subscriptions/19093fe6-adc6-4d4a-a0b8-58cdcaacdf37/resourceGroups/lz-rg-hub-con-uks-01/providers/Microsoft.Resources/deployments/PolicyDeployment_1915490346730452944' was unsuccessful.

Compliance reason (details)

Compliance state Non-compliant

Last evaluated 15/07/2024, 10:15:06 BST

Definition version (preview) 1.0.0

Initiative version (preview) 1.0.0

Non-compliance message Alerting must be deployed to Azure services.

Reason for non-compliance

No related resources match the effect details in the policy definition. Existence condition

Type Microsoft.Insights/metricAlerts

Last evaluated resource (out of 15) /subscriptions/19093fe6-adc6-4d4a-a0b8-58cdcaacdf37/resourcegroups/lz-rg-hub-con-uks-01/providers/Microsoft.Insights/metricAlerts/lz-vnet-hub-con-uks-01-DDOSAttackAlert

Reason for non-compliance Current value must be equal to the target value.

Field Microsoft.Insights/metricAlerts/criteria.Microsoft-Azure-Monitor-SingleResourceMultipleMetricCriteria.allOf[*].metricNamespace

Path properties.criteria.allOf[*].metricNamespace

Current value "Microsoft.Network/virtualNetworks"

Target value "Microsoft.Network/virtualNetworkGateways"

Brunoga-MS commented 1 month ago

Hello @Greg-Court , thanks for your issue. We will investigate the issue and let you know.

@arjenhuitema , could you please take a look?

Thanks, Bruno.

arjenhuitema commented 1 month ago

Hi @Greg-Court,

Thanks for reporting this issue. I’m looking into it and will get back to you.

Noticed your post on the ALZ repo; just to double-check, did you set it up using the ALZ portal accelerator?

Greg-Court commented 1 month ago

Hey, thanks for looking into this, the landing zone itself was deployed via the terraform caf enterprise scale module, and the azure monitor baseline alerts were deployed via the CLI using the following command:

az deployment mg create --name "amba-GeneralDeployment" --template-uri  https://raw.githubusercontent.com/Greg-Court/azure-monitor-baseline-alerts/main/patterns/alz/alzArm.json --location "uksouth" --management-group-id "_redacted_" --parameters ./alzArm.param.json

A fork of the latest version of the AMBA repo was used.

Please don't hesitate to ask more questions, happy to provide any information that might help resolve the issue.

Greg-Court commented 1 month ago

The ExpressRoute Gateway was deployed via the caf terraform enterprise scale module, using the following config:

            virtual_network_gateway = {
              enabled = true
              config = {
                address_prefix           = "10.94.0.64/26"
                gateway_sku_expressroute = "Standard"
                gateway_sku_vpn          = "VpnGw1AZ"
              }
              tags = var.default_tags
            }
arjenhuitema commented 1 month ago

Hi @Greg-Court,

Found the issue and the fix is now in our dev branch. Please see: Deploy-VNETG-ERGCPUUtilization-Alert.json

The root of the issue was that the severity parameter wasn't fully specified in the deployment settings.

I'll post an update when we plan to merge these updates into the main branch.