magodo / terraform-provider-restful

Terraform provider to manage RESTful resources
https://registry.terraform.io/providers/magodo/restful
Mozilla Public License 2.0
15 stars 5 forks source link

State Gets Stored as created when create_poll fails #108

Closed hongkongkiwi closed 3 months ago

hongkongkiwi commented 3 months ago

I've noticed that the state shows that an item was created even if the create_poll fails.

I think it's because the create command actually returns 200, so maybe it's flagged as passed and it's saved into the created state in terraform.

Subsequent runs will actually try to update it since it thinks that the creation was successful (even though the create poll was clearly failed.

Here's my terraform logs:

# restful_resource.webhook_completed will be created
  + resource "restful_resource" "webhook_completed" {
      + body        = {
          + customProperties = {}
          + description      = "Here is an amazing description"
          + displayName      = "Cool Display Name"
          + events           = {
              + transcriptionCompletion = true
            }
          + properties       = {
              + apiVersion = "v3.2"
            }
          + webUrl           = "https://example.com"
        }
      + delete_path = "$unescape(body.self)"
      + header      = {
          + "Ocp-Apim-Subscription-Key" = (sensitive value)
        }
      + id          = (known after apply)
      + output      = (known after apply)
      + path        = "/"
      + poll_create = {
          + default_delay_sec = 10
          + status            = {
              + pending = [
                  + "Pending",
                ]
              + success = "Succeeded"
            }
          + status_locator    = "body.status"
        }
      + poll_delete = {
          + default_delay_sec = 10
          + status            = {
              + success = "204"
            }
          + status_locator    = "code"
        }
      + poll_update = {
          + default_delay_sec = 10
          + status            = {
              + pending = [
                  + "Pending",
                ]
              + success = "Succeeded"
            }
          + status_locator    = "body.status"
        }
      + read_path   = "$unescape(body.self)"
      + update_path = "$unescape(body.self)"
    }

Here's the log output:

restful_resource.webhook_completed: Creating...

│ Error: Provider returned invalid result object after apply
│
│ After the apply operation, the provider still indicated an unknown value for restful_resource.webhoo_completed.output. All values must
│ be known after apply, so this is always a bug in the provider and should be reported in the provider's own repository. OpenTofu will still save the other
│ known object values in the state.
╵
╷
│ Error: Create: Polling failure
│
│   with restful_resource.webhook_completed,
│   on azure_audio_transcription.tf line 22, in resource "restful_resource" "webhook_completed":
│   22: resource "restful_resource" "webhook_completed" {
│
│ Unexpected status "Failed". Full response: {
│   "displayName": "Cool Display Name",
│   "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/webhooks/01DEEE84-9145-4220-A5E2-4FCA4E88B1BE",
│   "links": {
│     "ping": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/webhooks/01DEEE84-9145-4220-A5E2-4FCA4E88B1BE:ping",
│     "test": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/webhooks/01DEEE84-9145-4220-A5E2-4FCA4E88B1BE:test"
│   },
│   "properties": {
│     "apiVersion": "v3.2",
│     "error": {
│       "code": "Undefined",
│       "message": "The web hook failed completing the challenge/response handshake at 8/7/2024 5:08:41 PM. Challenge response status code was not 200/OK, co
de: Forbidden"
│     }
│   },
│   "webUrl": "https://example.com",
│   "events": {
│     "transcriptionCompletion": true
│   },
│   "description": "Here is an amazing description",
│   "createdDateTime": "2024-08-07T17:08:41Z",
│   "lastActionDateTime": "2024-08-07T17:08:41Z",
│   "status": "Failed"
│ }
magodo commented 3 months ago

@hongkongkiwi This is intentional as once the initial create request succeeded, the resource is recorded in most platforms. But since the polling failed, it makes terraform mark this resource as "tainted". A tainted resource will be replaced in the next apply, which saves the user's effort to import than delete the resource manually to make the existance check happy.

hongkongkiwi commented 3 months ago

I may have the wrong approach here, but the problem I'm having is because the item exists in state but is tainted, then terraform tries to delete it upon the next run because it thinks it exists but needs to be recreated which produces a 404.

Normally, a delete command should return a 204 code which indicates success (that's what my delete_poll value is set to).

╷
│ Error: Delete: Polling failure
│
│ Unexpected status "404". Full response: {
│   "code": "NotFound",
│   "message": "The specified entity cannot be found."
│ }
╵

This error makes sense because the object was never created (as create_poll clearly shows).

I could use 404 as a success value for delete_poll to fix this issue? That could work around this (especially with #109 to allow 204 & 404 as success codes).

I guess I had an expectation that failed creates shouldn't even end up in the state, so that upon next run it would try to create again rather than try to delete a non-existent object.

hongkongkiwi commented 3 months ago

I guess your right the Microsoft API is a bit weird in that it gives a 200 response for an error and you only know it by checking the body, I guess this is an anomaly and not a usual case.

I guess I can work work around it with using delete_poll and success code as 404.

magodo commented 3 months ago

@hongkongkiwi

I may have the wrong approach here, but the problem I'm having is because the item exists in state but is tainted, then terraform tries to delete it upon the next run because it thinks it exists but needs to be recreated which produces a 404.

If you are not suppressing refresh in your apply, then the refresh will do a read first, in which case 404 will cause the provider mark the resource as deleted, and won't do an explicit delete then.

https://github.com/magodo/terraform-provider-restful/blob/53466abc4eff09356ff0cb50b05fd805ed9cf37c/internal/provider/resource.go#L722-L733

I guess I can work work around it with using delete_poll and success code as 404.

Unfortunately, this isn't something I want to adopt as the poll is a predictable operation, whose successful result also anticipates to be a certain one, instead of multiple.

hongkongkiwi commented 3 months ago

If you are not suppressing refresh in your apply, then the refresh will do a read first, in which case 404 will cause the provider mark the resource as deleted, and won't do an explicit delete then.

Makes sense. So my workaround for this (for anybody else who finds it) is to do the following, it seems to work well :)

  poll_delete = {
    status_locator = "code"
    status = {
      success = "404"
      pending = ["204"]
    }
  }