confluentinc / terraform-provider-confluentcloud

Confluent Cloud Terraform Provider is deprecated in favor of Confluent Terraform Provider
https://registry.terraform.io/providers/confluentinc/confluentcloud/latest/docs

Terraform script fails, error indicates plugin crashed #53

Closed VipulZopSmart closed 2 years ago

VipulZopSmart commented 2 years ago

Getting this error while running the Terraform script below to provision a topic in a Confluent Kafka cluster, passing the cluster_id and secrets as inputs.

╷
│ Error: Plugin did not respond
│ 
│   with confluentcloud_kafka_topic.topics["confluent-test-topic"],
│   on topics.tf line 1, in resource "confluentcloud_kafka_topic" "topics":
│    1: resource "confluentcloud_kafka_topic" "topics" {
│ 
│ The plugin encountered an error, and failed to respond to the
│ plugin.(*GRPCProvider).ApplyResourceChange call. The plugin logs may
│ contain more details.
╵
Releasing state lock. This may take a few moments...

Stack trace from the terraform-provider-confluentcloud_0.5.0 plugin:

panic: reflect: call of reflect.Value.FieldByName on zero Value

goroutine 67 [running]:
reflect.flag.mustBe(...)
        /usr/local/golang/1.16/go/src/reflect/value.go:221
reflect.Value.FieldByName(0x0, 0x0, 0x0, 0x104ad1e0e, 0x6, 0x0, 0x1b6, 0x0)
        /usr/local/golang/1.16/go/src/reflect/value.go:903 +0x190
github.com/confluentinc/terraform-provider-ccloud/internal/provider.createDiagnosticsWithDetails(0x104d8bcb8, 0x14000332780, 0x1400008f588, 0x3, 0x3)
        src/github.com/confluentinc/terraform-provider-confluentcloud/internal/provider/utils.go:304 +0x240
github.com/confluentinc/terraform-provider-ccloud/internal/provider.kafkaTopicCreate(0x104d9b188, 0x1400009d020, 0x14000689480, 0x104cc3ea0, 0x14000182540, 0x140006cea80, 0x14000689300, 0x10482b700)
        src/github.com/confluentinc/terraform-provider-confluentcloud/internal/provider/resource_kafka_topic.go:141 +0x374
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).create(0x14000181500, 0x104d9b118, 0x14000416880, 0x14000689480, 0x104cc3ea0, 0x14000182540, 0x0, 0x0, 0x0)
        pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.10.1/helper/schema/resource.go:341 +0x118
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).Apply(0x14000181500, 0x104d9b118, 0x14000416880, 0x14000328680, 0x14000689300, 0x104cc3ea0, 0x14000182540, 0x0, 0x0, 0x0, ...)
        pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.10.1/helper/schema/resource.go:467 +0x4ec
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ApplyResourceChange(0x1400000d470, 0x104d9b118, 0x14000416880, 0x14000392550, 0x104adad89, 0x12, 0x0)
        pkg/mod/github.com/hashicorp/terraform-plugin-sdk/v2@v2.10.1/helper/schema/grpc_provider.go:977 +0x870
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ApplyResourceChange(0x14000688200, 0x104d9b1c0, 0x14000416880, 0x14000198000, 0x0, 0x0, 0x0)
        pkg/mod/github.com/hashicorp/terraform-plugin-go@v0.5.0/tfprotov5/tf5server/server.go:603 +0x338
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ApplyResourceChange_Handler(0x104d3ef20, 0x14000688200, 0x104d9b1c0, 0x140005cc8a0, 0x1400009c6c0, 0x0, 0x104d9b1c0, 0x140005cc8a0, 0x1400066a600, 0x2e0)
        pkg/mod/github.com/hashicorp/terraform-plugin-go@v0.5.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:380 +0x1c8
google.golang.org/grpc.(*Server).processUnaryRPC(0x140002be8c0, 0x104da2c38, 0x14000092d80, 0x140006de100, 0x140006865d0, 0x105235900, 0x0, 0x0, 0x0)
        pkg/mod/google.golang.org/grpc@v1.33.2/server.go:1210 +0x3e8
google.golang.org/grpc.(*Server).handleStream(0x140002be8c0, 0x104da2c38, 0x14000092d80, 0x140006de100, 0x0)
        pkg/mod/google.golang.org/grpc@v1.33.2/server.go:1533 +0xa50
google.golang.org/grpc.(*Server).serveStreams.func1.2(0x140003021b0, 0x140002be8c0, 0x104da2c38, 0x14000092d80, 0x140006de100)
        pkg/mod/google.golang.org/grpc@v1.33.2/server.go:871 +0x94
created by google.golang.org/grpc.(*Server).serveStreams.func1
        pkg/mod/google.golang.org/grpc@v1.33.2/server.go:869 +0x1f8

Error: The terraform-provider-confluentcloud_0.5.0 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.
VipulZopSmart commented 2 years ago

using this Terraform script -

resource "confluentcloud_kafka_topic" "topics" {

  for_each         = { for topic in var.kafka_topics : topic.topic_name => topic }
  kafka_cluster    = var.cluster_id
  topic_name       = each.key
  partitions_count = each.value.partitions_count == 0 ? 2 : each.value.partitions_count
  http_endpoint    = var.http_endpoint
  config = {
    "cleanup.policy"    = "compact"
    "max.message.bytes" = "12345"
    "retention.ms"      = "67890"
  }
  credentials {
    key    = var.kafka_api_key
    secret = var.kafka_secret_key
  }
}
linouk23 commented 2 years ago

@VipulZopSmart

Thanks for providing a sample output! The "good" news is that even though the TF provider crashed (which we'll fix in our next release), the crash is the result of an underlying HTTP error from the Kafka REST API. More specifically, I suspect the Kafka REST API returned a 400 / 500 error:

curl --request POST \
  --url 'https://pkc-***.{region}.{cloud}.confluent.cloud:443/kafka/v3/clusters/lkc-abc123/topics' \
  --header 'Authorization: Basic <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{"topic_name":"kostya-test-topic 1112","partitions_count":"2","replication_factor":"3"}'
{"error_code":500,"message":"Internal Server Error"} # e.g., the topic name cannot contain whitespace -- the error is definitely not super descriptive

where <TOKEN> is:

$ echo -n "ABCDEFGH123456789:XNCIW93I2L1SQPJSJ823K1LS902KLDFMCZPWEO" | base64
# ABCDEFGH123456789 -- Cluster API Key
# XNCIW93I2L1SQPJSJ823K1LS902KLDFMCZPWEO -- Cluster API Secret
# https://docs.confluent.io/cloud/current/api.html#section/Authentication

There are two ways you could find a more descriptive message:

  1. Downgrade to 0.4.0 and rerun your TF configuration (it should show the API error without crashing).
  2. Use the Kafka REST API directly to create a topic that represents your TF configuration.

Let us know about the results & the exact error you run into!

VipulZopSmart commented 2 years ago

@linouk23 After downgrading to v0.4.0, I got this error.

confluentcloud_kafka_topic.topics["confluent-test-topic"]: Creating...
╷
│ Error: 403 CONNECTnotallowed
│ 
│   with confluentcloud_kafka_topic.topics["confluent-test-topic"],
│   on topics.tf line 1, in resource "confluentcloud_kafka_topic" "topics":
│    1: resource "confluentcloud_kafka_topic" "topics" {
│ 
╵
Releasing state lock. This may take a few moments...
linouk23 commented 2 years ago

@VipulZopSmart could you confirm http_endpoint starts with https:// and your Kafka cluster is Basic, Standard or Dedicated Kafka cluster that is accessible over the public internet (not Private Link / VPC peering)?

VipulZopSmart commented 2 years ago

@linouk23 The cluster is Basic and the endpoint is accessible over the public network. http_endpoint starts with https://.

linouk23 commented 2 years ago

Interesting, thanks for trying it out! It definitely should work for a Basic Kafka cluster. Could you also try sending a curl to your http_endpoint as described ⬆️ @VipulZopSmart? As an unrelated note, you could remove the placeholder config as well:

config = {
  "cleanup.policy"    = "compact"
  "max.message.bytes" = "12345"
  "retention.ms"      = "67890"
}
VipulZopSmart commented 2 years ago

@linouk23

The 'Authorization: Basic <TOKEN>' value should be the kafka-secret-key, right?

linouk23 commented 2 years ago

@VipulZopSmart correct, it should be the output of the following command:

$ echo -n "ABCDEFGH123456789:XNCIW93I2L1SQPJSJ823K1LS902KLDFMCZPWEO" | base64
# in other words, echo -n "${KAFKA_API_KEY}:${KAFKA_API_SECRET}" | base64
QUJDREVGR0gxMjM0NTY3OD... # = TOKEN => Authorization: Basic QUJDREVGR0gxMjM0NTY3OD...

where

# ABCDEFGH123456789 -- Cluster API Key
# XNCIW93I2L1SQPJSJ823K1LS902KLDFMCZPWEO -- Cluster API Secret
# https://docs.confluent.io/cloud/current/api.html#section/Authentication
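Putting the two steps above together, a minimal sketch (the key/secret values here are placeholders for illustration, not real credentials):

```shell
#!/bin/sh
# Placeholder credentials -- substitute your Cluster API key and secret.
KAFKA_API_KEY="key"
KAFKA_API_SECRET="secret"

# printf avoids the trailing newline that a bare `echo` (without -n) would
# add; an encoded newline makes the server reject the token with a 401.
TOKEN=$(printf '%s:%s' "$KAFKA_API_KEY" "$KAFKA_API_SECRET" | base64)

echo "Authorization: Basic $TOKEN"   # Authorization: Basic a2V5OnNlY3JldA==
```

The resulting header can then be passed to curl via `--header` when calling the `/kafka/v3/clusters/{cluster_id}/topics` endpoint shown earlier.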


VipulZopSmart commented 2 years ago

After hitting the endpoint in Postman, it returns a 401 Unauthorized error in the Postman console. How is that possible?

It shouldn't give this error, as I've used the API key and secret key of the cluster.

linouk23 commented 2 years ago

@VipulZopSmart 401 means that the token was not accepted (e.g., there's a typo somewhere). Could you confirm you replaced lkc-abc123 with your cluster ID?

Could you use the curl command instead of Postman?

curl --request POST \
  --url 'https://pkc-***.{region}.{cloud}.confluent.cloud:443/kafka/v3/clusters/lkc-abc123/topics' \
  --header 'Authorization: Basic <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{"topic_name":"curl-test-topic","partitions_count":"2","replication_factor":"3"}'

You could also run a command to list topics to test out your token:

curl --request GET \
  --url 'https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/{cluster_id}/topics' \
  --header 'Authorization: Basic <TOKEN>'
VipulZopSmart commented 2 years ago

@linouk23 It's working fine, I got the response.

linouk23 commented 2 years ago

Thanks for the confirmation @VipulZopSmart! Could you inspect / copy the redacted TF logs (they may show the exact error) as the next step? Namely, you should run

export TF_LOG_PATH="./tf_logs.txt"
export TF_LOG=trace
terraform apply --auto-approve
grep -E '401|403' tf_logs.txt

And then search for 401 / 403 errors in the log file (or remove creds and send it to this email) and share your findings with us.
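If the plain grep comes back empty, context lines can help surface the failing request and response together. A sketch on a synthetic log (the sample log line here is fabricated for illustration; your real tf_logs.txt comes from the TF_LOG=trace run above):

```shell
# Fabricated one-line sample of a trace log; in practice tf_logs.txt is
# produced by `terraform apply` with TF_LOG=trace set as shown above.
cat > tf_logs.txt <<'EOF'
2022-03-15T09:44:09.280+0530 [DEBUG] ... HTTP/1.1 403 Forbidden ...
EOF

# -E enables extended regexes, -n prints line numbers, and -C 5 prints
# 5 lines of context so the request URL and response body appear together.
grep -n -E -C 5 '401|403|500' tf_logs.txt
```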

VipulZopSmart commented 2 years ago

There are no 401/ 403 errors in the log file.

linouk23 commented 2 years ago

@VipulZopSmart interesting, could you search for grep -i -E 'warn|error' tf_logs.txt instead and eyeball the last 100 lines of that file or something to find the failed requests?

VipulZopSmart commented 2 years ago

There are some logs related to warn|error -

2022-03-15T09:44:09.280+0530 [TRACE] providercache.fillMetaCache: error while scanning directory .terraform/providers: cannot search .terraform/providers: lstat .terraform/providers: no such file or directory
2022-03-15T09:44:09.280+0530 [TRACE] providercache.fillMetaCache: error while scanning directory .terraform/providers: cannot search .terraform/providers: lstat .terraform/providers: no such file or directory
2022-03-15T09:44:55.110+0530 [TRACE] providercache.fillMetaCache: error while scanning directory .terraform/providers: cannot search .terraform/providers: lstat .terraform/providers: no such file or directory
2022-03-15T09:44:55.110+0530 [TRACE] providercache.fillMetaCache: error while scanning directory .terraform/providers: cannot search .terraform/providers: lstat .terraform/providers: no such file or directory
linouk23 commented 2 years ago

@VipulZopSmart thanks for sending the logs! Yeah these 4 lines don't seem too relevant to your error so it might be a good idea to inspect that file a bit further (or remove credentials and send it to this email so you won't have to waste too much of your time).

VipulZopSmart commented 2 years ago

Sorry, but I can't see this email.

linouk23 commented 2 years ago

@VipulZopSmart sure thing, let me paste it here: cflt-tf-access@confluent.io

VipulZopSmart commented 2 years ago

Issue got resolved. Thanks @linouk23 for helping out. The issue was with the http_endpoint of the cluster.

linouk23 commented 2 years ago

@VipulZopSmart we're very excited to let you know we've just published a new version of the TF Provider that includes a fix for this issue, among other very exciting improvements: it enables fully automated provisioning of our key Kafka workflows (see the demo) with no more manual intervention, making this our biggest and most impactful release.

The only gotcha is that we've renamed it from confluentinc/confluentcloud to confluentinc/confluent, but we published a migration guide so the switch should be fairly straightforward. The existing confluentinc/confluentcloud provider will be deprecated soon, so we'd recommend switching as soon as possible.

The new confluentinc/confluent provider also includes a lot of sample configurations so you won't need to write them from scratch. You can find them here; a full list of changes is here.