confluentinc / terraform-provider-confluent

Terraform Provider for Confluent
Apache License 2.0

Plugin crashed when trying to create an ACL using Cloud API Key instead of Kafka API Key #120

Closed suraj2410 closed 2 years ago

suraj2410 commented 2 years ago

Version: 1.8.0

Terraform provider fails on kafka acl apply with the following message:

confluent_kafka_acl.test-service: Creating...
╷
│ Error: Request cancelled
│
│   with confluent_kafka_acl.test-service,
│   on test-service.tf line 1, in resource "confluent_kafka_acl" "test-service":
│    1: resource "confluent_kafka_acl" "test-service" {
│
│ The plugin.(*GRPCProvider).ApplyResourceChange request was cancelled.
╵

Stack trace from the terraform-provider-confluent_1.8.0 plugin:

panic: reflect: call of reflect.Value.FieldByName on ptr Value

goroutine 27 [running]:
reflect.flag.mustBe(...)
    /home/semaphore/.goenv/versions/1.18.0/src/reflect/value.go:223
reflect.Value.FieldByName({0x19e2a40?, 0xc0004b8128?, 0xc000118300?}, {0x1b69c5e?, 0x1a002e0?})
    /home/semaphore/.goenv/versions/1.18.0/src/reflect/value.go:1297 +0x1d6
github.com/confluentinc/terraform-provider-confluent/internal/provider.createDescriptiveError({0x1ceee60, 0xc000118300})
    src/github.com/confluentinc/terraform-provider-confluent/internal/provider/utils.go:347 +0x2be
github.com/confluentinc/terraform-provider-confluent/internal/provider.kafkaAclCreate({0x1cf2f50, 0xc00046d320}, 0xc00064bb00, {0x1b0b500?, 0xc000408c30?})
    src/github.com/confluentinc/terraform-provider-confluent/internal/provider/resource_kafka_acl.go:167 +0x41b
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).create(0xc0001d61c0, {0x1cf2f88, 0xc000565bf0}, 0xd?, {0x1b0b500, 0xc000408c30})
    pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.16.0/helper/schema/resource.go:707 +0x12e
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).Apply(0xc0001d61c0, {0x1cf2f88, 0xc000565bf0}, 0xc0001241a0, 0xc00064b980, {0x1b0b500, 0xc000408c30})
    pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.16.0/helper/schema/resource.go:837 +0xa7a
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ApplyResourceChange(0xc000305200, {0x1cf2ee0?, 0xc000473a40?}, 0xc0003d97c0)
    pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.16.0/helper/schema/grpc_provider.go:1021 +0xe3c
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ApplyResourceChange(0xc000244640, {0x1cf2f88?, 0xc000565410?}, 0xc000544a80)
    pkg/mod/github.com/hashicorp/terraform-plugin-go@v0.9.0/tfprotov5/tf5server/server.go:812 +0x515
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ApplyResourceChange_Handler({0x1b2b720?, 0xc000244640}, {0x1cf2f88, 0xc000565410}, 0xc00046cc00, 0x0)
    pkg/mod/github.com/hashicorp/terraform-plugin-go@v0.9.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:385 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001368c0, {0x1cf5d50, 0xc0000fc9c0}, 0xc00056ab40, 0xc000115140, 0x232aac0, 0x0)
    pkg/mod/google.golang.org/grpc@v1.45.0/server.go:1282 +0xccf
google.golang.org/grpc.(*Server).handleStream(0xc0001368c0, {0x1cf5d50, 0xc0000fc9c0}, 0xc00056ab40, 0x0)
    pkg/mod/google.golang.org/grpc@v1.45.0/server.go:1619 +0xa1b
google.golang.org/grpc.(*Server).serveStreams.func1.2()
    pkg/mod/google.golang.org/grpc@v1.45.0/server.go:921 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
    pkg/mod/google.golang.org/grpc@v1.45.0/server.go:919 +0x28a

Error: The terraform-provider-confluent_1.8.0 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

ERRO[0007] 1 error occurred:
    * exit status 1

resource "confluent_kafka_acl" "test-service" {
  kafka_cluster {
    id = var.cluster_id
  }
  resource_type = "TOPIC"
  resource_name = "test-service"
  pattern_type  = "LITERAL"
  principal     = "User:u-123456"
  host          = "*"
  operation     = "WRITE"
  permission    = "ALLOW"
}

I tried using curl locally with the same key and secret; it returns 201 but doesn't create any ACL in the cluster view:

< HTTP/1.1 201 Created
< Date: Mon, 17 Oct 2022 12:12:23 GMT
< Location: https://cluster.confluent.cloud/kafka/v3/clusters/abc-58h7q/acls?resource_type=TOPIC&resource_name=test-servic&pattern_type=LITERAL&principal=User%3Asa-5d415d&host=*&operation=WRITE&permission=ALLOW
< Vary: Accept-Encoding, User-Agent
< Content-Length: 0

linouk23 commented 2 years ago

@suraj2410 thanks for reporting the issue!

I might have to take a look to understand why we can see an issue for confluent_kafka_acl.test-service.

That said, sending a curl request was a very smart idea. To fix the curl request, use the integer ID of the service account (User:54321) instead of its resource ID (User:sa-abc123). See this note to find out how to convert a resource ID to an integer ID.

Let me know if it helps!

suraj2410 commented 2 years ago

@linouk23 thanks for the comment!

I tried using both User:54321 and User:sa-abc123, and curl returns 201, but the ACL is not created in the Confluent Cloud Console.

I can confirm that the API key and secret are working: if I change them slightly in the curl request, it returns an unauthorized error. This was just to make sure that we are not looking at an authorization issue, since the account has org admin permissions.

Here are some additional logs I could get with DEBUG mode on:

2022-10-17T12:35:49.160+0200 [DEBUG] confluent_kafka_acl.test-service: applying the planned Create change
2022-10-17T12:35:49.161+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: 2022/10/17 12:35:49 [DEBUG] GET https://api.confluent.cloud/service_accounts
2022-10-17T12:35:49.949+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: panic: reflect: call of reflect.Value.FieldByName on ptr Value
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: goroutine 30 [running]:
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: reflect.flag.mustBe(...)
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   /home/semaphore/.goenv/versions/1.18.0/src/reflect/value.go:223
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: reflect.Value.FieldByName({0x19e2a40?, 0xc000334b48?, 0xc0000a2680?}, {0x1b69c5e?, 0x1a002e0?})
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   /home/semaphore/.goenv/versions/1.18.0/src/reflect/value.go:1297 +0x1d6
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: github.com/confluentinc/terraform-provider-confluent/internal/provider.createDescriptiveError({0x1ceee60, 0xc0000a2680})
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   src/github.com/confluentinc/terraform-provider-confluent/internal/provider/utils.go:347 +0x2be
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: github.com/confluentinc/terraform-provider-confluent/internal/provider.kafkaAclCreate({0x1cf2f50, 0xc00005b5c0}, 0xc0000f6400, {0x1b0b500?, 0xc000452a90?})
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   src/github.com/confluentinc/terraform-provider-confluent/internal/provider/resource_kafka_acl.go:167 +0x41b
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).create(0xc0001c21c0, {0x1cf2f88, 0xc000253dd0}, 0xd?, {0x1b0b500, 0xc000452a90})
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.16.0/helper/schema/resource.go:707 +0x12e
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).Apply(0xc0001c21c0, {0x1cf2f88, 0xc000253dd0}, 0xc000651040, 0xc0000f6180, {0x1b0b500, 0xc000452a90})
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.16.0/helper/schema/resource.go:837 +0xa7a
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ApplyResourceChange(0xc00000d1e8, {0x1cf2ee0?, 0xc0000a32c0?}, 0xc000428a50)
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.16.0/helper/schema/grpc_provider.go:1021 +0xe3c
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ApplyResourceChange(0xc0005940a0, {0x1cf2f88?, 0xc0002535f0?}, 0xc0000de850)
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/github.com/hashicorp/terraform-plugin-go@v0.9.0/tfprotov5/tf5server/server.go:812 +0x515
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ApplyResourceChange_Handler({0x1b2b720?, 0xc0005940a0}, {0x1cf2f88, 0xc0002535f0}, 0xc00005aea0, 0x0)
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/github.com/hashicorp/terraform-plugin-go@v0.9.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:385 +0x170
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001368c0, {0x1cf5d50, 0xc000582d00}, 0xc00044c5a0, 0xc00010f4a0, 0x232aac0, 0x0)
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/google.golang.org/grpc@v1.45.0/server.go:1282 +0xccf
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: google.golang.org/grpc.(*Server).handleStream(0xc0001368c0, {0x1cf5d50, 0xc000582d00}, 0xc00044c5a0, 0x0)
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/google.golang.org/grpc@v1.45.0/server.go:1619 +0xa1b
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: google.golang.org/grpc.(*Server).serveStreams.func1.2()
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/google.golang.org/grpc@v1.45.0/server.go:921 +0x98
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: created by google.golang.org/grpc.(*Server).serveStreams.func1
2022-10-17T12:35:49.950+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0:   pkg/mod/google.golang.org/grpc@v1.45.0/server.go:919 +0x28a
2022-10-17T12:35:49.951+0200 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/confluentinc/confluent/1.8.0/darwin_amd64/terraform-provider-confluent_1.8.0 pid=42624 error="exit status 2"
2022-10-17T12:35:49.951+0200 [ERROR] plugin.(*GRPCProvider).ApplyResourceChange: error="rpc error: code = Unavailable desc = transport is closing"
2022-10-17T12:35:49.951+0200 [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = transport is closing"
2022-10-17T12:35:49.951+0200 [ERROR] vertex "confluent_kafka_acl.test-service" error: Plugin did not respond

I was able to create a topic with TF; this issue occurs with the ACL only.

Hope that helps you!

linouk23 commented 2 years ago

@suraj2410 thanks for sharing these details!

For a second, let's forget the TF issue and make sure your curl works as expected (i.e., creates an ACL).

It's a well-known issue that the Kafka REST API returns 201 as long as the request has no authz problems and is not malformed, so it might be a great idea to double-check the request for typos.

  1. Could you confirm you can find your sa-abc123 when running confluent iam service-account list?

  2. Could you confirm you can find a corresponding intID by running the following curl from the log you shared:

    curl --request GET --url "https://api.confluent.cloud/service_accounts" --header 'Authorization: Basic <TOKEN>'

    where TOKEN is generated from Cloud API Key.
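For reference, the TOKEN in an Authorization: Basic header is the Base64 encoding of key:secret, as in standard HTTP Basic authentication. Below is a minimal Go sketch of that step. The helper name basicToken and the placeholder credentials are assumptions made for illustration; they do not come from the provider or the Confluent CLI.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicToken returns the value that follows "Basic " in the Authorization
// header: base64("<key>:<secret>"), per HTTP Basic authentication.
// The function name is hypothetical and used only for this sketch.
func basicToken(key, secret string) string {
	return base64.StdEncoding.EncodeToString([]byte(key + ":" + secret))
}

func main() {
	// Placeholder values; substitute a real Cloud API Key and secret.
	token := basicToken("CLOUD_API_KEY", "CLOUD_API_SECRET")
	fmt.Println("Authorization: Basic " + token)
}
```

Note that the colon separator is part of the encoded input, so a key or secret pasted with stray whitespace yields a different token.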

Once you verify that the intID is accurate, it might be a good idea to verify the other inputs: for example, I can see resource_name=test-servic, is that expected?

Another useful thing you could try would be to list ACLs and look at the output carefully.
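One way to prepare such a list request is sketched below in Go. It assumes that ACLs can be listed with a GET on the same /kafka/v3/clusters/<cluster>/acls path that appears in the Location header earlier in this thread, and that the same query parameters act as filters; the helper name aclListURL and the example endpoint and IDs are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// aclListURL builds the URL for a GET request against the ACLs collection of
// a cluster. The path mirrors the Location header shown earlier in this
// thread; using it for listing is an assumption of this sketch.
func aclListURL(restEndpoint, clusterID string, filters map[string]string) string {
	query := url.Values{}
	for name, value := range filters {
		query.Set(name, value)
	}
	base := fmt.Sprintf("%s/kafka/v3/clusters/%s/acls", restEndpoint, clusterID)
	if len(query) == 0 {
		return base
	}
	// Encode sorts parameters by name and percent-escapes their values.
	return base + "?" + query.Encode()
}

func main() {
	// Placeholder endpoint, cluster ID, and principal.
	fmt.Println(aclListURL(
		"https://pkc-012345.us-central1.gcp.confluent.cloud:443",
		"lkc-abc123",
		map[string]string{"resource_type": "TOPIC", "principal": "User:12345"},
	))
}
```

Comparing the listed entries field by field with the create request makes typos such as a truncated resource name easier to spot.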

Let me know if that helps.

suraj2410 commented 2 years ago

@linouk23 update: the curl works with User:12345. Sorry, my bad! I had a typo there, so the only remaining issue is the resource with TF.

linouk23 commented 2 years ago

That sounds promising! Next it might be a good idea to try the following:

  1. Double check that var.cluster_id is in the form of lkc-abc123.
  2. Double check that kafka_rest_endpoint is in the form of https://pkc-012345.us-central1.gcp.confluent.cloud:443.
  3. Could you switch from managing-single-cluster to something like

    terraform {
      required_providers {
        confluent = {
          source  = "confluentinc/confluent"
          version = "1.8.0"
        }
      }
    }

    provider "confluent" {
      cloud_api_key    = var.confluent_cloud_api_key
      cloud_api_secret = var.confluent_cloud_api_secret
    }

    resource "confluent_kafka_acl" "app-producer-write-on-topic" {
      kafka_cluster {
        id = confluent_kafka_cluster.basic.id
      }
      resource_type = "TOPIC"
      resource_name = "test-service"
      pattern_type  = "LITERAL"
      principal     = "User:sa-abc123"
      host          = "*"
      operation     = "WRITE"
      permission    = "ALLOW"
      rest_endpoint = confluent_kafka_cluster.basic.rest_endpoint
      credentials {
        key    = confluent_api_key.app-manager-kafka-api-key.id
        secret = confluent_api_key.app-manager-kafka-api-key.secret
      }
    }

    and rerun terraform apply?

suraj2410 commented 2 years ago

@linouk23 thanks for the reply, it works now! There was some issue with the Confluent key and secret, for which curl perhaps didn't return the expected result and just gave 201, so it was hard to figure out. What gave me a clue were these debug lines:

2022-10-17T12:35:49.160+0200 [DEBUG] confluent_kafka_acl.test-service: applying the planned Create change
2022-10-17T12:35:49.161+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: 2022/10/17 12:35:49 [DEBUG] GET https://api.confluent.cloud/service_accounts
2022-10-17T12:35:49.949+0200 [DEBUG] provider.terraform-provider-confluent_1.8.0: panic: reflect: call of reflect.Value.FieldByName on ptr Value

Maybe the plugin could give better feedback when this happens; this time, the creds that I had weren't able to fetch the service accounts.

linouk23 commented 2 years ago

@suraj2410 that's great to hear! Could you specify what exactly you updated to make it work? Was the provided Cloud (or Kafka?) API Key malformed or something, so that the backend likely returned a 401 that the TF Provider couldn't process for some reason?

I'm wondering whether we can get to the API error that is being returned by the backend.

linouk23 commented 2 years ago

cc @suraj2410 ⬆️

suraj2410 commented 2 years ago

@linouk23 I had to change to use the Cloud API Key instead of the Kafka API Key, but I was using the same keys for both! But yeah, the plugin somehow couldn't process this for the malformed Cloud API Key.

suraj2410 commented 2 years ago

@linouk23 also, we now have a new issue with import. I'm not sure what I am missing with the ACL here: it fails with the User ID (u- format) but works with the sa format.

confluent_kafka_acl.test_acl: Importing from ID "abc-12345/TOPIC#test_topic#LITERAL#User:u-123456#*#WRITE#ALLOW"...
╷
│ Error: error importing Kafka ACLs "abc-12345/TOPIC#test_topic#LITERAL#User:u-123456#*#WRITE#ALLOW": the user with resource ID=u-123456 was not found
│
│

But this works if I specify User:sa-123456.

Any ideas?
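The import IDs in this thread follow the shape cluster/RESOURCE_TYPE#resource_name#PATTERN_TYPE#principal#host#OPERATION#PERMISSION. A small Go sketch that splits such an ID into its fields can help check which principal is actually being passed. The type and function names are hypothetical, this is not the provider's parser, and it assumes the resource name contains no '#' character.

```go
package main

import (
	"fmt"
	"strings"
)

// aclImportID holds the fields of an ACL import ID as seen in this thread.
// The type is hypothetical and exists only for this sketch.
type aclImportID struct {
	ClusterID    string
	ResourceType string
	ResourceName string
	PatternType  string
	Principal    string
	Host         string
	Operation    string
	Permission   string
}

// parseACLImportID splits "<cluster>/<type>#<name>#<pattern>#<principal>#<host>#<operation>#<permission>".
// It assumes no field contains '#'.
func parseACLImportID(id string) (aclImportID, error) {
	clusterID, rest, found := strings.Cut(id, "/")
	if !found {
		return aclImportID{}, fmt.Errorf("missing '/' in import ID %q", id)
	}
	parts := strings.Split(rest, "#")
	if len(parts) != 7 {
		return aclImportID{}, fmt.Errorf("expected 7 '#'-separated fields, got %d", len(parts))
	}
	return aclImportID{
		ClusterID:    clusterID,
		ResourceType: parts[0],
		ResourceName: parts[1],
		PatternType:  parts[2],
		Principal:    parts[3],
		Host:         parts[4],
		Operation:    parts[5],
		Permission:   parts[6],
	}, nil
}

func main() {
	id, err := parseACLImportID("abc-12345/TOPIC#test_topic#LITERAL#User:u-123456#*#WRITE#ALLOW")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("cluster=%s principal=%s\n", id.ClusterID, id.Principal)
}
```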

linouk23 commented 2 years ago

Opened https://github.com/confluentinc/terraform-provider-confluent/issues/123 for your last message.

> i had to change to use cloud api key instead of the kafka api key but i was using same keys for both! but yea the plugin couldnt somehow process this for the malformed cloud api key

That's great to hear! I'll try to reproduce it by using a Cloud API Key instead of a Kafka API Key.

linouk23 commented 2 years ago

> also now we have a new issue with import so not sure what i am missing with ACL here with the User id and works with the sa format

@suraj2410 could you double check you can see a user with such an ID on the Confluent Cloud UI or when running confluent iam user list using Confluent CLI?

I'm not able to reproduce this issue:

terraform import confluent_kafka_acl.app-consumer-read-on-group-user "lkc-v7dx9z/GROUP#confluent_cli_consumer_#PREFIXED#User:u-7yx6mp#*#READ#ALLOW"
confluent_kafka_acl.app-consumer-read-on-group-user: Importing from ID "lkc-v7dx9z/GROUP#confluent_cli_consumer_#PREFIXED#User:u-7yx6mp#*#READ#ALLOW"...
confluent_kafka_acl.app-consumer-read-on-group-user: Import prepared!
  Prepared confluent_kafka_acl for import
confluent_kafka_acl.app-consumer-read-on-group-user: Refreshing state... [id=lkc-v7dx9z/GROUP#confluent_cli_consumer_#PREFIXED#User:u-7yx6mp#*#READ#ALLOW]

Import successful!

The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.

$ terraform plan
...
No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

linouk23 commented 2 years ago

@suraj2410 could you confirm that your var.cluster_id starts with lkc- and that you're using version 1.8.0 of TF Provider for Confluent?

I'm trying to reproduce the issue with no success:

confluent_kafka_topic.orders: Creating...
╷
│ Error: error creating Kafka Topic: 401 Unauthorized: Unauthorized
│ 
│   with confluent_kafka_topic.orders,
│   on main.tf line 16, in resource "confluent_kafka_topic" "orders":
│   16: resource "confluent_kafka_topic" "orders" {

I also tried using a malformed Kafka REST endpoint but didn't manage to crash TF Provider 🤔 :

│ Error: error creating Kafka Topic: Post "https://pkc-ojvxy.us-east-1.aw.stag.cpdev.cloud:443/kafka/v3/clusters/lkc-n0xw3/topics": POST https://pkc-ojvxy.us-east-1.aw.stag.cpdev.cloud:443/kafka/v3/clusters/lkc-n0xw3/topics giving up after 5 attempt(s): Post "https://pkc-ojvxy.us-east-1.aw.stag.cpdev.cloud:443/kafka/v3/clusters/lkc-n0xw3/topics": dial tcp: lookup pkc-ojvxy.us-east-1.aw.stag.cpdev.cloud on 127.0.2.2:53: no such host

│ Error: error creating Kafka Topic: Post "http://pkc-ojvxy.us-east-1.aws.stag.cpdev.cloud:443/kafka/v3/clusters/lkc-n0xw3/topics": POST http://pkc-ojvxy.us-east-1.aws.stag.cpdev.cloud:443/kafka/v3/clusters/lkc-n0xw3/topics giving up after 5 attempt(s): Post "http://pkc-ojvxy.us-east-1.aws.stag.cpdev.cloud:443/kafka/v3/clusters/lkc-n0xw3/topics": net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x03\x00\x02\x02P"

linouk23 commented 2 years ago

We've just released version 1.9.0 of TF Provider for Confluent, which should fix this problem. Feel free to reopen the issue if you manage to reproduce it again.