hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Additional fields for azurerm_monitor_activity_log_alert "Service Health" alerts #2996

Closed lewinski closed 3 years ago

lewinski commented 5 years ago

Description

It isn't possible to create useful "Service Health" type alerts using Terraform because you cannot provide affected services and regions to be monitored in the activity log. These are needed to replicate the alert that is created by going through the Azure Portal. If these are not provided, your service health alert generates alerts for all services in all regions which is probably not interesting to alert on, since I would guess that most people are only using a subset of services/regions in their applications.

New or Affected Resource(s)

azurerm_monitor_activity_log_alert

Potential Terraform Configuration

resource "azurerm_monitor_activity_log_alert" "service_health" {
  name                = "example-servicehealthalert"
  resource_group_name = azurerm_resource_group.main.name
  scopes              = [data.azurerm_subscription.current.id]
  description         = "Health alerts for specific services and regions."

  criteria {
    category = "ServiceHealth"

    service_health_events = ["Incident", "Maintenance"]

    service_health_regions = [
      "Global",
      "West Europe",
      "East US"
    ]

    service_health_services = [
      "Backup",
      "Load Balancer",
      "Network Infrastructure",
      "Virtual Machines",
      "Virtual Network"
    ]
  }

  action {
    //...
  }
}

service_health_events is an optional array of strings. There are 4 possible values: Incident, Maintenance, Informational, and ActionRequired. In the Azure Portal, these are represented by three choices in the selection menu: Service issue (Incident), Planned maintenance (Maintenance), and Health advisories (Informational and ActionRequired). When the value is omitted, the criterion is left out of the API request, so all types of service health events will be alerted upon.

service_health_regions is an optional array of strings. Each string should be the display name of an Azure location (examples: "East US" or "Australia Central 2") or "Global", which is used for some location-less services. When this value is omitted, all regions of service health events will be alerted upon.

service_health_services is an optional array of strings. Each string should be the display name of an Azure service (examples: "Azure Database for MySQL" or "Key Vault"). It is not obvious where to get this list other than the Azure portal itself, where there are currently 148 services. When this value is omitted, all services will be alerted upon.

Here is some example JSON criteria for the above:

    "condition": {
        "allOf": [
            {
                "field": "category",
                "equals": "ServiceHealth",
                "containsAny": null
            },
            {
                "anyOf": [
                    {
                        "field": "properties.incidentType",
                        "equals": "Incident",
                        "containsAny": null
                    },
                    {
                        "field": "properties.incidentType",
                        "equals": "Maintenance",
                        "containsAny": null
                    }
                ]
            },
            {
                "field": "properties.impactedServices[*].ImpactedRegions[*].RegionName",
                "equals": null,
                "containsAny": [
                    "East US",
                    "West Europe",
                    "Global"
                ]
            },
            {
                "field": "properties.impactedServices[*].ServiceName",
                "equals": null,
                "containsAny": [
                    "Backup",
                    "Load Balancer",
                    "Network Infrastructure",
                    "Virtual Machines",
                    "Virtual Network"
                ]
            }
        ]
    }
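
The mapping from the three proposed lists to this condition can be sketched in Terraform itself. This is only an illustration, assuming Terraform 0.12 or later: the local names (`events`, `regions`, `services`, `condition`) and the output name are hypothetical, while the field paths are taken from the JSON above. Unlike the portal export, it does not emit the null `equals` / `containsAny` keys.

```hcl
locals {
  # Hypothetical inputs, mirroring the proposed arguments.
  events   = ["Incident", "Maintenance"]
  regions  = ["East US", "West Europe", "Global"]
  services = ["Backup", "Load Balancer", "Network Infrastructure", "Virtual Machines", "Virtual Network"]

  condition = {
    allOf = [
      # Always restrict the alert to the service health category.
      { field = "category", equals = "ServiceHealth" },

      # One "equals" clause per event type, combined with anyOf.
      { anyOf = [for e in local.events : { field = "properties.incidentType", equals = e }] },

      # Regions and services are each matched with a single containsAny clause.
      { field = "properties.impactedServices[*].ImpactedRegions[*].RegionName", containsAny = local.regions },
      { field = "properties.impactedServices[*].ServiceName", containsAny = local.services },
    ]
  }
}

# Render the condition as JSON for inspection with `terraform output`.
output "service_health_condition_json" {
  value = jsonencode(local.condition)
}
```

A provider implementation would presumably skip the corresponding clause whenever one of the lists is omitted, which is what gives the "all events / all regions / all services" behaviour described above.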

lewinski commented 5 years ago

I'm happy to work on this functionality, but I want to make sure that the schema makes sense before putting any more time into it. I would also appreciate any recommendations on whether to attempt validation of the region and service lists, and how to go about that if it is recommended.

This does also kind of beg the question of having the ability to do generic "properties.*" alert queries but that seems like a more complicated design to get correct.

rohrerb commented 5 years ago

+1 - Would be great if we had the additional fields.

lewinski commented 5 years ago

I looked at this more today but it doesn't really seem practical until the SDK has support for these fields. I opened up an issue in the SDK repository to track the need.

MMalikKhan commented 5 years ago

Hi @tombuildsstuff @lewinski, off topic: I am exploring how to create Log Signal alerts for Application Insights logs with a query. I could not find a sample showing whether a log alert can be defined with a query. Could you please let me know whether alerts can be created for App Insights logs with a specific query and, if so, how? I would much appreciate your reply.

ghost commented 5 years ago

Hi, any expectations for the availability of this feature?

mariojacobo commented 5 years ago

I couldn't even get this far, for some reason it's not recognizing the fields. I'm using the exact template that OP posted. Any ideas ? I'm using Terraform v0.12.4

Error: Unsupported argument

  on alerts.tf line 31, in resource "azurerm_monitor_activity_log_alert" "service_health":
  31:     service_health_services = [

An argument named "service_health_services" is not expected here.

cholmes12 commented 5 years ago

This feature would be great.

suman5488 commented 4 years ago

Hi, could anyone please let me know when this issue will be resolved? I am unable to add these conditions (specific regions, specific services) to the log alerts.

dhaugli commented 4 years ago

This feature would be great. It's like there's almost no point in even having the ServiceHealth category in the log alert if these features are lacking.

turbut commented 4 years ago

Looking forward to this feature. At least adding region would be beneficial.

Yanunita commented 4 years ago

yes!! it is very important :)

rapster83 commented 4 years ago

Any updates on that? To add the additional attributes like "Regions", etc. would be great.

al-lac commented 4 years ago

Really need this soon. Adding the region alone would be enough for my use case for now.

elsesiy commented 4 years ago

@tombuildsstuff Is anybody actively working on this and do you have an ETA? Right now we have to manually configure those alerts, shell out and use azure cli or similar hacks which is less than ideal.

MaxiPalle commented 4 years ago

Any update on this? Maybe it's already fixed, but haven't yet figured it out ...

nu11modem commented 3 years ago

I also have the need to create complex ResourceHealth alerts. Has there been any progress on this?

julioas09 commented 3 years ago

This is indeed still not solved. The workaround I can think of with terraform is sending Activity Log diagnostics to a Log Analytics workspace and then setting up an alert based on a LA query.

k7faq commented 3 years ago

When will this issue get attention? over 18 months with no action. Why can we not have focused alerting? When you deal with environments of multiple products, hundreds of servers spread across multiple regions with many PaaS offerings incorporated this present alert becomes DISTRACTINGLY ANNOYING WITH ALERTS.

Microsoft staff please seek focus to enhance this. I don't understand why the conditional statements leveraged by the ARM template cannot be leveraged here. Is Azure not using focused APIs to support requests regardless of platform/service? I get the suspicion that MSFT is creating unique APIs for ARM, unique APIs for PowerShell, unique APIs for Azure CLI, unique APIs for portal.azure.com. If so, I am failing to see the benefit. Why could a unified API platform not serve all of the interfaces?

rogerm-chen commented 3 years ago

Hi, if we just use criteria { category = "ServiceHealth" }, I see in the Azure portal that no region is ticked. I'm surprised it didn't fail.

so if no region is selected, does that mean it applies to all regions?

mbfrahry commented 3 years ago

Hey all. This is now fixed with #10978 and should make it into the 2.56.0 release!
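
For readers landing here later, the following is a minimal sketch of what the original request might look like with the released provider. It assumes the fix exposes a nested `service_health` block inside `criteria` with `events`, `locations`, and `services` arguments (rather than the flat `service_health_*` names proposed above); that shape, and the accepted values, should be verified against the provider documentation for your version. The resource group, subscription data source, and action group references are hypothetical.

```hcl
resource "azurerm_monitor_activity_log_alert" "service_health" {
  name                = "example-servicehealthalert"
  resource_group_name = azurerm_resource_group.main.name       # hypothetical resource
  scopes              = [data.azurerm_subscription.current.id] # hypothetical data source
  description         = "Health alerts for specific services and regions."

  criteria {
    category = "ServiceHealth"

    # Assumed block and argument names; check the provider docs before use.
    service_health {
      events    = ["Incident", "Maintenance"]
      locations = ["Global", "West Europe", "East US"]
      services  = ["Backup", "Load Balancer", "Network Infrastructure", "Virtual Machines", "Virtual Network"]
    }
  }

  action {
    action_group_id = azurerm_monitor_action_group.main.id # hypothetical resource
  }
}
```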

ghost commented 3 years ago

This has been released in version 2.56.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.56.0"
}
# ... other configuration ...
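
On Terraform 0.13 and later, the `version` argument inside the `provider` block is deprecated in favour of a `required_providers` entry. A sketch of the equivalent pin, assuming the provider is installed from the public registry under `hashicorp/azurerm`:

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.56.0" # allows 2.56.x patch releases only
    }
  }
}

provider "azurerm" {
  # The 2.x provider requires a features block, even if it is empty.
  features {}
}
```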

ghost commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!