andrewbaxter / terraform-provider-fly

Terraform provider for the Fly.io API
https://registry.terraform.io/providers/andrewbaxter/fly/latest
BSD 3-Clause "New" or "Revised" License
19 stars 2 forks source link

[Bug]: Can't release IP Address #3

Closed kieranbrown closed 6 months ago

kieranbrown commented 6 months ago

Hey @andrewbaxter, awesome project and appreciate your fork! I'm in the early stages of using fly.io and I've noticed that you can't release an IP Address, deleting the resource gives the following error:

│ Error: Error looking up ip address (app [some-random-app-name], addr [])
│ 
│   with fly_ip.this,
│   on main.tf line 20, in resource "fly_ip" "this":
│   20: resource "fly_ip" "this" {
│ 
│ Upstream graphql error: Could not find IPAssignment
│ Graphql path: app.ipAddress

Minimal code to replicate

terraform {
  required_providers {
    fly = {
      source = "andrewbaxter/fly"
      version = "0.1.11"
    }
  }
}

provider "fly" {
  fly_api_token = "<redacted>"
}

resource "fly_app" "this" {
  name = "some-random-app-name"
  org  = "your-org-name"

  assign_shared_ip_address = false
}

resource "fly_ip" "this" {
  app    = fly_app.this.name
  type   = "v4"
  region = "global"
}

If you apply the above example, and run the following commands, you should be able to replicate the issue

terraform apply
terraform destroy -target fly_ip.this
andrewbaxter commented 6 months ago

Thanks for the report and the test case! I'll try to take a look soon.

I know deleting ips was working before IIRC... the graphql definitions have changed slightly vs flyctl but the existing parameters are still there. It looks like it adds a new ip address id which

My plan to test is to run flyctl with graphql logging I hacked in and see what it's sending vs what the provider is sending. If it's using the ip address id instead of the ip itself we could add that field I think.

andrewbaxter commented 6 months ago

FWIW here's the flyctl code: https://github.com/superfly/flyctl/blob/9255529e9f423e6aca6eba64457187b6aead53fa/internal/command/ips/release.go#L15

I think it uses https://github.com/superfly/fly-go which is (relatively) new. The release code is here: https://github.com/superfly/fly-go/blob/main/resource_ip_addresses.go#L112

Trying that library might be an option, although it doesn't look like it's doing anything different.

andrewbaxter commented 6 months ago

Ah, this is failing during the "read" phase before it makes changes, so it isn't releasing the ip yet. That was definitely working before... but the situation is complicated by shared ips even if they're not being used.

Error looking up ip address (app [testreleaseip], addr []) - addr is empty, that seems like it's likely the issue. I think during create maybe we didn't get the ip address?

andrewbaxter commented 6 months ago

From TF_LOG=DEBUG terraform apply... ugly, but:

2024-05-11T15:44:17.326+0900 [DEBUG] provider.terraform-provider-fly: HTTP REQUEST: proto=HTTP/1.1 tf_resource_type=fly_ip @module=fly method=POST tf_req_id=32672630-5b11-78f8-166d-afd00394fabc tf_rpc=ApplyResourceChange tf_provider_addr=registry.terraform.io/fly-apps/fly url=https://api.fly.io/graphql @caller=/.../terraform-provider-fly/utils/clients.go:62 body="{\"query\":\"\nmutation AllocateIpAddress ($app: ID!, $region: String, $addrType: IPAddressType!) {\n\tallocateIpAddress(input: {appId:$app,region:$region,type:$addrType}) {\n\t\tipAddress {\n\t\t\tid\n\t\t\ttype\n\t\t\taddress\n\t\t\tregion\n\t\t}\n\t\tapp {\n\t\t\tsharedIpAddress\n\t\t}\n\t}\n}\n\",\"variables\":{\"app\":\"testreleaseip\",\"region\":\"global\",\"addrType\":\"v4\"},\"operationName\":\"AllocateIpAddress\"}" headers="map[Authorization:[Bearer redacted] Content-Type:[application/json]]" timestamp="2024-05-11T15:44:17.325+0900"
2024-05-11T15:44:17.713+0900 [DEBUG] provider.terraform-provider-fly: HTTP RESPONSE: proto=HTTP/1.1 tf_req_id=32672630-5b11-78f8-166d-afd00394fabc @caller=/.../terraform-provider-fly/utils/clients.go:81 @module=fly body="" code="200 OK" headers="map[Cache-Control:[max-age=0, private, must-revalidate] Content-Type:[application/json; charset=utf-8] Date:[Sat, 11 May 2024 06:44:16 GMT] Etag:[W/\"8ffddaf8b1c2734e80406a12f511f85b\"] Fly-Request-Id:[01HXK6AASJ069B96E7D8178HT3-nrt] Fly-Span-Id:[b8dba11cc944322e] Fly-Trace-Id:[529976f56042725edeb07dc90f2ea6ca] Server:[Fly/00e000b6 (2024-05-10)] Strict-Transport-Security:[max-age=31556952] Vary:[Accept-Encoding] Via:[1.1 fly.io] X-Content-Type-Options:[nosniff] X-Download-Options:[noopen] X-Frame-Options:[sameorigin] X-Permitted-Cross-Domain-Policies:[none] X-Request-Id:[90205e4b-d7f5-4260-9a40-d792f8c18f29] X-Runtime:[0.205880] X-Xss-Protection:[1; mode=block]]" tf_provider_addr=registry.terraform.io/fly-apps/fly tf_resource_type=fly_ip method=POST tf_rpc=ApplyResourceChange url=https://api.fly.io/graphql timestamp="2024-05-11T15:44:17.712+0900"

body="" code="200 OK"

Body is empty, which I guess isn't an issue for graphql?

The response type (graphql) is

# Autogenerated return type of AllocateIPAddress
type AllocateIPAddressPayload {
  app: App!

  # A unique identifier for the client performing the mutation.
  clientMutationId: String
  ipAddress: IPAddress!
}

... ah yup, upstream it says

"""
Autogenerated return type of AllocateIPAddress.
"""
type AllocateIPAddressPayload {
  app: App!

  """
  A unique identifier for the client performing the mutation.
  """
  clientMutationId: String
  ipAddress: IPAddress
}

I guess ipAddress is now nullable? I'm not sure that's the issue here, but if that's the case how is one to identify the ip address that's just been allocated?

andrewbaxter commented 6 months ago

Here's what I got from a slightly old flyctl:

2024/05/11 16:07:26 GRAPHQL -> POST HTTP/1.1 https://api.fly.io/graphql
Headers: http.Header{"Accept":[]string{"application/json; charset=utf-8"}, "Authorization":[]string{"Bearer xyz"}, "Content-Type":[]string{"application/json; charset=utf-8"}, "User-Agent":[]string{"flyctl/2023.12.12-master.1702393204"}}
Body:
{"query":"mutation($input: AllocateIPAddressInput!) { allocateIpAddress(input: $input) { ipAddress { id address type region createdAt } } }","variables":{"input":{"appId":"testreleaseip","type":"v4","region":"global"}}}
2024/05/11 16:07:26 GRAPHQL <- POST HTTP/1.1 https://api.fly.io/graphql = 200 OK
Headers: http.Header{"Cache-Control":[]string{"max-age=0, private, must-revalidate"}, "Content-Type":[]string{"application/json; charset=utf-8"}, "Date":[]string{"Sat, 11 May 2024 07:07:26 GMT"}, "Etag":[]string{"W/\"a38248895ce60763c07d8b540371405b\""}, "Fly-Request-Id":[]string{"01HXK7MQMW5TDYQD04DHJDDC90-nrt"}, "Fly-Span-Id":[]string{"c9e8e1e089d747e8"}, "Fly-Trace-Id":[]string{"c1d972c288abd3af3c90b98ceea91ef9"}, "Server":[]string{"Fly/00e000b6 (2024-05-10)"}, "Strict-Transport-Security":[]string{"max-age=31556952"}, "Vary":[]string{"Accept", "Accept-Encoding"}, "Via":[]string{"1.1 fly.io"}, "X-Content-Type-Options":[]string{"nosniff"}, "X-Download-Options":[]string{"noopen"}, "X-Frame-Options":[]string{"sameorigin"}, "X-Permitted-Cross-Domain-Policies":[]string{"none"}, "X-Request-Id":[]string{"2246e232-12b2-4eb3-a229-b4d40f7b3ad8"}, "X-Runtime":[]string{"0.063613"}, "X-Xss-Protection":[]string{"1; mode=block"}}
Body:
{"data":{"allocateIpAddress":{"ipAddress":{"id":"ip_ly4m2nlnkdk9k3oe","address":"66.51.120.222","type":"v4","region":"global","createdAt":"2024-05-11T07:07:26Z"}}}}

~I see appId here vs app above.~ Oh no, it's just slightly indirect above.

All the parameters look identical otherwise... there are differences in the graphql query (app { sharedIpAddress } etc)

andrewbaxter commented 6 months ago
{"query":"
mutation AllocateIpAddress ($app: ID!, $region: String, $addrType: IPAddressType!) {
    allocateIpAddress(input: {appId:$app,region:$region,type:$addrType}) {
        ipAddress {
            id
            type
            address
            region
        }
        app {
            sharedIpAddress
        }
    }
}
","variables":{"app":"testreleaseip","region":"global","addrType":"v4"},"operationName":"AllocateIpAddress"}

vs

{"query":"mutation($input: AllocateIPAddressInput!) { allocateIpAddress(input: $input) { ipAddress { id address type region createdAt } } }","variables":{"input":{"appId":"testreleaseip","type":"v4","region":"global"}}}

My gut says operationName is unrelated.

I'm not so familiar with graphql but I think the syntax is something like: mutation input {operation(input) { output } }. AllocateIPAddressInput must be relying on some known mapping, but I imagine the input ids must match up so appId is sent as appId etc.

I think that means the inputs are the same: appId, type, region. The main difference seems to be output - the terraform provider wants app { sharedIpAddress } in addition to the other fields, while the flyctl one doesn't have that but instead has createdAt.

IIRC we need sharedIpAddress because when you have type v4_shared it creates the ip address as a property of the app not as a new ip address object, but that seems like the most significant difference here.

andrewbaxter commented 6 months ago

Tried it with

mutation AllocateIpAddress ($input: AllocateIPAddressInput!) {
    allocateIpAddress(input: $input) {
        ipAddress {
            id
            address
            type
            region
            createdAt
        }
    }
}
","variables":{"input":{"clientMutationId":"","appId":"testreleaseip","type":"v4","region":"global"}},"operationName":"AllocateIpAddress"}

but no change (empty body in response). That was copied/pasted from upstream, I just added AllocateIpAddress after mutate because otherwise it was complaining about anonymous things (I think that's just a consequence of having all the operations in one file - each one needs a name).

andrewbaxter commented 6 months ago

curl -X POST https://api.fly.io/graphql -H 'Content-Type: application/json' -H "Authorization: Bearer abcd" --data '{"query":"mutation AllocateIpAddress ($input: AllocateIPAddressInput!) { allocateIpAddress(input: $input) { ipAddress { id address type region createdAt } } }","variables":{"input":{"clientMutationId":"","appId":"testreleaseip","type":"v4","region":"global"}},"operationName":"AllocateIpAddress"}' works! I got a non-empty body with the ip address id, etc!

It's the same data as above, but I removed all the new lines from the json string... I'm not sure how the graphql library would mis-encode a json string though, I assumed that was escaping issues with how terraform logs things.

andrewbaxter commented 6 months ago

It seems to be a common violation in graphql-land: https://github.com/quarkusio/quarkus/issues/17667 https://stackoverflow.com/questions/53469321/correct-format-for-multiline-data-in-curl-query-to-graphql-endpoint

andrewbaxter commented 6 months ago

...query\":\"\n certainly looks wrong (newline right after opening quote). My guess is that I'm not just misreading this, it's actually sending mis-escaped json, and the upstream decoder accepted this. Then they replaced the upstream decoder with something more strict, and now it's refusing the payload but there's an issue in the pipeline and the errors aren't being sent back to the caller.

I looked at the graphqlient code though and Body is string and it gets passed through json.Marshal which shouldn't have any way to get around the escaping.

andrewbaxter commented 6 months ago

I dumped the json directly working around TF's logging, and TF was just messing up the quoting it seems:

{"query":"\nmutation AllocateIpAddress ($input: AllocateIPAddressInput!) {\n\tallocateIpAddress(input: $input) {\n\t\tipAddress {\n\t\t\tid\n\t\t\taddress\n\t\t\ttype\n\t\t\tregion\n\t\t\tcreatedAt\n\t\t}\n\t}\n}\n","variables":{"input":{"clientMutationId":"","appId":"testreleaseip","type":"v4","region":"global"}},"operationName":"AllocateIpAddress"}

That looks valid to me.

And running

curl -X POST https://api.fly.io/graphql -H 'Content-Type: application/json' -H "Authorization: Bearer ." --data '{"query":"\nmutation AllocateIpAddress ($input: AllocateIPAddressInput!) {\n\tallocateIpAddress(input: $input) {\n\t\tipAddress {\n\t\t\tid\n\t\t\taddress\n\t\t\ttype\n\t\t\tregion\n\t\t\tcreatedAt\n\t\t}\n\t}\n}\n","variables":{"input":{"clientMutationId":"","appId":"testreleaseip","type":"v4","region":"global"}},"operationName":"AllocateIpAddress"}'

works (exact payload pasted into curl).

I'm stumped, timing issue?

andrewbaxter commented 6 months ago

Ahh... I was logging the request body twice instead of the response body the 2nd time :facepalm: ...

2024-05-11T17:27:37.877+0900 [DEBUG] provider.terraform-provider-fly: HTTP RESPONSE: @caller=/.../terraform-provider-fly/utils/clients.go:81 tf_resource_type=fly_ip headers="map[Cache-Control:[max-age=0, private, must-revalidate] Content-Type:[application/json; charset=utf-8] Date:[Sat, 11 May 2024 08:27:37 GMT] Etag:[W/\"5080c85f49f44abdc054faf9df9a4740\"] Fly-Request-Id:[01HXKC7HNP7Z30GC3HPKGJWNS1-nrt] Fly-Span-Id:[c0e6da88ffabfab4] Fly-Trace-Id:[d30cdff2b87deea0abda66d86a41b1e9] Server:[Fly/00e000b6 (2024-05-10)] Strict-Transport-Security:[max-age=31556952] Vary:[Accept-Encoding] Via:[1.1 fly.io] X-Content-Type-Options:[nosniff] X-Download-Options:[noopen] X-Frame-Options:[sameorigin] X-Permitted-Cross-Domain-Policies:[none] X-Request-Id:[919f9678-84bb-4122-b298-1c539276b46c] X-Runtime:[0.172164] X-Xss-Protection:[1; mode=block]]" method=POST tf_provider_addr=registry.terraform.io/fly-apps/fly url=https://api.fly.io/graphql body="{\"data\":{\"allocateIpAddress\":{\"ipAddress\":{\"id\":\"ip_woml9zy03ny20e6y\",\"address\":\"213.188.207.240\",\"type\":\"v4\",\"region\":\"global\",\"createdAt\":\"2024-05-11T08:27:37Z\"}}}}" proto=HTTP/1.1 @module=fly code="200 OK" tf_req_id=cbe092f4-65c0-7983-abb2-e2aeaa89e61f tf_rpc=ApplyResourceChange timestamp="2024-05-11T17:27:37.877+0900"
andrewbaxter commented 6 months ago

And yep, it just wasn't storing the address right during the create operation. I just re-tested and create and delete work fine.

andrewbaxter commented 6 months ago

I fixed it here: https://github.com/andrewbaxter/terraform-provider-fly/commit/867b3722338982361a164eaec3a7323dd8543e12

And pushed it in https://github.com/andrewbaxter/terraform-provider-fly/releases/tag/v0.1.12 - can you try that and let me know if it fixes the issue?

kieranbrown commented 6 months ago

I fixed it here: 867b372

And pushed it in v0.1.12 (release) - can you try that and let me know if it fixes the issue?

Awesome! I submitted this bug report just before I went to sleep, I never expected to wake up to this resolved 😂 I need to run a couple of errands this morning but I'll check it out in a few hours. Thanks for your hard work!

andrewbaxter commented 6 months ago

Oh yeah, great! And no rush! It felt like a rather core behavior, and once I started poking at it I couldn't stop...

dirien commented 5 months ago

I still get the same error:

terraform init -upgrade         

Initializing the backend...

Initializing provider plugins...
- Finding andrewbaxter/fly versions matching "0.1.13"...
- Installing andrewbaxter/fly v0.1.13...
- Installed andrewbaxter/fly v0.1.13 (self-signed, key ID E3B71A86562F4FC3)

...
terraform destroy --auto-approve
fly_app.this: Refreshing state... [id=my-dirien-app2]
fly_ip.this: Refreshing state... [id=ip_3jwv94dn4459p506]
╷
│ Error: Error looking up ip address (app [my-dirien-app2], addr [])
│ 
│   with fly_ip.this,
│   on main.tf line 17, in resource "fly_ip" "this":
│   17: resource "fly_ip" "this" {
│ 
│ Upstream graphql error: Could not find IPAssignment
│ Graphql path: app.ipAddress
╵

@kieranbrown did it worked for you?

andrewbaxter commented 5 months ago

@dirien I can dig in more in a bit, but can you check your state file and see if the "address" field in the ip resource is properly populated? IIRC this was actually a bug in create rather than delete -- that is, when the resource was created it didn't properly store the address.

So I think for existing resources you'll need to manually fix up the state file with the ip address... If that's correct, sorry, I should have written a TLDR after my stream of consciousness dump above.

dirien commented 5 months ago

@andrewbaxter thanks for the feedback. You are right, if the resource is created by a previous version you need to do some state surgery. Thanks that helped a lot! And sorry for "resurrecting" this issue.

andrewbaxter commented 5 months ago

No not at all! That was important information I forgot to add, so thanks!