PagerDuty / terraform-provider-pagerduty

Terraform PagerDuty provider
https://www.terraform.io/docs/providers/pagerduty/
Mozilla Public License 2.0

Bug: Updating a service with alert_grouping.type="content_based" leads to an API error "Invalid Input Provided" #867

Closed dkarl-wgs closed 1 month ago

dkarl-wgs commented 2 months ago

Terraform Version

Terraform v1.8.2
on linux_amd64

Provider Version

provider "registry.terraform.io/pagerduty/pagerduty" {
  version     = "3.11.4"
  constraints = ">= 2.14.0, < 4.0.0"
  hashes = [
    "h1:3eywn/Q8UC84cVEPcCT3klxEMnjJoYrNfcCQaAcpoUI=",
    "zh:282e8fa8565996acb2fe56e1ec12abc5098b54042cb3997c3a82d986a05ffbb1",
    "zh:2d36a5626d7ecfdf30c0e377c4e52386eee6bdab3665943f5a801c6887393bdc",
    "zh:41254dffcf89db4ffe1e7ab47cf2eb87989bcee9fd7017b908024a0c39c6b090",
    "zh:4e8d885d03670aca7123a23835cfd7da95079e62fa52d48b08938a71b8d241ee",
    "zh:52e61d557440016fd432c578aeb2e05a7d54541b25a058839b167f4d7f9aa7c1",
    "zh:607550fc7fb65e23ad60e3a36693d4f46d8a0502095a6d9027a51e2b6c405a84",
    "zh:649bc743d11e511d789d72dd543ead68dc4cfd9345208d0588718ed06d3175ca",
    "zh:69c1a3ca78980bd4306ab8888e16720d61740679ba83a07c7361f7a157f1e4b6",
    "zh:7db62df36ccb5322e04751ac0e3588efe0295f96aad1208ccb6b33f682f5f004",
    "zh:97c8e29ff3d9c8b50c51334e7bf76311bd86c9cc1f82f45ae7ac2edcbe2ca20b",
    "zh:bba2e47b71fcb5ca7f431a75bb68f484fc3865acad77bbd08904aef64d25a5c6",
    "zh:bf706a4c5f39228a0571d7840c8e1f8720eabc312f3b3ea1be9ad3d5658f83d2",
    "zh:c0f9da8d2450b372436039e5545c2935af28e8775f15fb37f4f99f6ad97c225d",
  ]
}

Affected Resource(s)

pagerduty_service

Terraform Configuration Files

resource "pagerduty_service" "test_service" {
  name                    = "test-service"
  escalation_policy       = var.escalation_policy

  alert_grouping_parameters {
    type = "content_based"
    config {
      aggregate = "all"
      fields = ["custom_details.alert_name","custom_details.stage"]
      time_window = 300
    }
  }
}

Debug Output

Expected Behavior

Updating the alert_grouping settings of a service in Terraform should not result in an API error.

Creating a service works fine:

$ terraform -chdir=tf/ apply -auto-approve

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # pagerduty_service.test_service will be created
  + resource "pagerduty_service" "test_service" {
      + acknowledgement_timeout = "1800"
      + alert_grouping          = (known after apply)
      + alert_grouping_timeout  = (known after apply)
      + auto_resolve_timeout    = "14400"
      + created_at              = (known after apply)
      + description             = "Managed by Terraform"
      + escalation_policy       = "PBTFT9Y"
      + html_url                = (known after apply)
      + id                      = (known after apply)
      + last_incident_timestamp = (known after apply)
      + name                    = "test-service"
      + response_play           = (known after apply)
      + status                  = (known after apply)
      + type                    = (known after apply)

      + alert_grouping_parameters {
          + type = "content_based"

          + config {
              + aggregate   = "all"
              + fields      = [
                  + "custom_details.alert_name",
                  + "custom_details.stage",
                ]
              + time_window = 300
              + timeout     = 0
            }
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.
pagerduty_service.test_service: Creating...
pagerduty_service.test_service: Creation complete after 1s [id=PLTNDMX]

Actual Behavior

Changing a property in alert_grouping_parameters.config (for example fields or aggregate) makes terraform apply fail with an API error.
According to the PagerDuty API spec (https://developer.pagerduty.com/api-reference/fbc6e9f4ef8eb-update-a-service), alert_grouping_parameters.config.timeout must not be set when type is "content_based", but the provider always adds this property, regardless of the type attribute.
Even when alert_grouping_parameters.config.timeout is set to null, it is not omitted from the API request.

I assume this was introduced by a change or new restriction on the PagerDuty API side, because a few weeks or months ago we were able to apply such a configuration to some of our services via Terraform. Today it no longer works.

$ terraform -chdir=tf/ apply -auto-approve
pagerduty_service.test_service: Refreshing state... [id=PLTNDMX]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # pagerduty_service.test_service will be updated in-place
  ~ resource "pagerduty_service" "test_service" {
        id                      = "PLTNDMX"
        name                    = "test-service"
        # (12 unchanged attributes hidden)

      ~ alert_grouping_parameters {
            # (1 unchanged attribute hidden)

          ~ config {
              ~ fields      = [
                    # (1 unchanged element hidden)
                    "custom_details.stage",
                  + "custom_details.field_3",
                ]
                # (3 unchanged attributes hidden)
            }
        }

        # (2 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
pagerduty_service.test_service: Modifying... [id=PLTNDMX]
╷
│ Error: Error reading: PLTNDMX: PUT API call to https://api.eu.pagerduty.com/services/PLTNDMX failed 400 Bad Request. Code: 2001, Errors: [Internal Server Error], Message: Invalid Input Provided
│ 
│   with pagerduty_service.test_service,
│   on service-test.tf line 1, in resource "pagerduty_service" "test_service":
│    1: resource "pagerduty_service" "test_service" {
│ 
╵

Steps to Reproduce

  1. terraform apply
  2. change the pagerduty_service.test_service.alert_grouping_parameters.config.fields property (add, change or remove an item)
  3. terraform apply

dkarl-wgs commented 1 month ago

Any updates on this? Currently, any change to a service with alert_grouping_parameters.type = "content_based" leads to an API error, so we are unable to update our services. Even a changed dot or comma in the title or description makes terraform apply fail.

fyi @imjaroiswebdev, @cjgajard

NargiT commented 1 month ago

I can confirm this error.

The payload generated by the Terraform provider is wrong (I tried with 3.6.0 and 3.12.0).

Generated payload:

{
  "service": {
    "acknowledgement_timeout": null,
    "alert_creation": "create_alerts_and_incidents",
    "alert_grouping": "rules",
    "alert_grouping_parameters": {
      "type": "content_based",
      "config": {
        "timeout": 0,
        "time_window": 86400,
        "aggregate": "all",
        "fields": [
          "summary"
        ]
      }
    },
    "auto_pause_notifications_parameters": {
      "enabled": false,
      "timeout": null
    },
    "auto_resolve_timeout": 14400,
    "description": "⚡ by Terraform",
    "escalation_policy": {
      "id": "P3UVY7I",
      "type": "escalation_policy_reference"
    },
    "response_play": null,
    "incident_urgency_rule": {
      "type": "constant",
      "urgency": "severity_based"
    },
    "name": "foo|bar|prod"
  }
}

Response received: 400

{
  "error": {
    "message": "Invalid Input Provided",
    "code": 2001,
    "errors": [
      "Internal Server Error"
    ]
  }
}

Expected or working payload

{
  "service": {
    "id": "PWJOVNW",
    "name": "foo|bar|prod",
    "description": "⚡ by Terraform",
    "created_at": "2023-10-10T12:10:39+02:00",
    "updated_at": "2024-04-08T12:37:01+02:00",
    "status": "active",
    "teams": [
      {
        "id": "PDQZ8SJ",
        "type": "team_reference",
        "summary": "squad.marketdatanews",
        "self": "https://api.eu.pagerduty.com/teams/PDQZ8SJ",
        "html_url": "https://swissquote.eu.pagerduty.com/teams/PDQZ8SJ"
      }
    ],
    "alert_creation": "create_alerts_and_incidents",
    "addons": [],
    "scheduled_actions": [],
    "support_hours": null,
    "last_incident_timestamp": null,
    "escalation_policy": {
      "id": "P3UVY7I",
      "type": "escalation_policy_reference",
      "summary": "squad.marketdatanews and it.ops and apr.nightshift contract",
      "self": "https://api.eu.pagerduty.com/escalation_policies/P3UVY7I",
      "html_url": "https://swissquote.eu.pagerduty.com/escalation_policies/P3UVY7I"
    },
    "incident_urgency_rule": {
      "type": "constant",
      "urgency": "severity_based"
    },
    "acknowledgement_timeout": null,
    "auto_resolve_timeout": 14400,
    "alert_grouping": "rules",
    "alert_grouping_timeout": null,
    "alert_grouping_parameters": {
      "type": "content_based",
      "config": {
        "fields": [
          "summary"
        ],
        "aggregate": "all",
        "time_window": 86400,
        "recommended_time_window": 300
      },
      "is_global_configuration": false
    },
    "alert_grouping_rules": {
      "fields": [
        "summary"
      ],
      "aggregate": "all",
      "time_window": 86400,
      "recommended_time_window": 300
    },
    "integrations": [],
    "response_play": null,
    "type": "service",
    "summary": "foo|bar|prod",
    "self": "https://api.eu.pagerduty.com/services/PWJOVNW",
    "html_url": "https://swissquote.eu.pagerduty.com/service-directory/PWJOVNW"
  }
}

I was able to reproduce this using https://developer.pagerduty.com/api-reference.

Be aware that this is a blocker on our side: we have 100 services that cannot be updated because of this server-side bug.

dkarl-wgs commented 1 month ago

I can confirm that this issue has been fixed with v3.12.1 -> https://github.com/PagerDuty/terraform-provider-pagerduty/releases/tag/v3.12.1

(https://github.com/PagerDuty/terraform-provider-pagerduty/pull/871)

Thank you @cjgajard!