confluentinc / terraform-provider-confluent

Terraform Provider for Confluent
Apache License 2.0
31 stars 64 forks source link

Expose numeric id of confluent_service_account resource #242

Closed ilian closed 1 year ago

ilian commented 1 year ago

Context

Creating a service account with Terraform exposes an alphanumeric id with the format sa-xxxxxx. Creating a new ACL through the vendor-neutral Kafka API with principal User:sa-xxxxxx succeeds. Reading ACLs through the vendor-neutral Kafka API yields ACLs with a numeric service account id with format 1234567. From this, we can infer that service account ids with format sa-* are converted by the Kafka API to numeric ids.

Issue description

This numeric id is not exposed by the terraform provider, but this information is required to avoid Terraform from assuming that the ACL has been removed when using a generic Kafka Terraform provider like Mongey/kafka. Please expose this numeric id of confluent_service_account resources.

linouk23 commented 1 year ago

👋 @ilian thanks for creating this issue!

Unfortunately we're in the process of deprecating int ID which is one of the reasons this TF Provider for Confluent doesn't store it in TF state.

You might want to take a look at Experimental Resource Importer for Confluent Terraform Provider if you're looking to import your resources to TF Provider for Confluent.

ilian commented 1 year ago

Hi @linouk23 :wave: Thanks for your quick reply. Is it planned to get rid of int IDs when creating ACLs that reference a service account? What format will likely be used in the future for referencing to the service account in an ACL principal? Will it use the User:sa-xxxxxx format instead? If so, that would solve this issue altogether as it would allow Terraform to track created ACLs without any transformations of IDs by the Kafka API of Confluent :slightly_smiling_face: Do you have a rough estimate when the int ID will be deprecated? This info would help me a lot to plan the work needed for our migration to Confluent Cloud.

linouk23 commented 1 year ago

@ilian the plan is to allow the backend (Kafka REST API) to support both intID and sa-xxxxx format.

If so, that would solve this issue altogether as it would allow Terraform to track created ACLs without any transformations of IDs by the Kafka API of Confluent 🙂

That's right!

Do you have a rough estimate when the int ID will be deprecated?

Well deprecation is a long process but instead I can tell you when Kafka REST API will start supporting sa-xxxx format: again, I can't promising anything, but it could happen in the near future (hopefully in a month or something).

ilian commented 1 year ago

Hi @linouk23 Do you have an update on whether the backend supports the User:sa-xxxxxx as a principal when creating and reading ACLs? Will the backend also return this sa-xxxxxx format or will it only provide the legacy number service account id?

ilian commented 1 year ago

I just checked the behavior on our cluster on Confluent Cloud and I confirm that the issue is still present. I just tried to create a new ACL with principal User:sa-8gpzkq but when reading it from the Kafka API, it returns User:1289320 which results in Terraform trying to re-create the ACL.

Could you let me know once this issue is addressed? Thanks :slightly_smiling_face:

linouk23 commented 1 year ago

@ilian hmm, we're in progress but we shouldn't see any changes in terraform that result in ACL recreation.

Could you share the logs / lkcID? Thank you! cc @ilian

ilian commented 1 year ago

@linouk23 Sure, here are the steps I used to reproduce the issue:

  1. Define a confluent_service_account account with the Confluent terraform provider
  2. Define a kafka_acl resource using Mongey's Terraform provider, setting acl_principal = "User:${confluent_service_account.test.id}".
  3. Run terraform apply

    
    Terraform will perform the following actions:
    
    # kafka_acl.write_topic_acl["testTopic"] will be created
    + resource "kafka_acl" "write_topic_acl" {
      + acl_host                     = "*"
      + acl_operation                = "Write"
      + acl_permission_type          = "Allow"
      + acl_principal                = "User:sa-8gpzkq"
      + id                           = (known after apply)
      + resource_name                = "testTopic"
      + resource_pattern_type_filter = "Literal"
      + resource_type                = "Topic"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

4. Observe through the Kafka API (e.g. via [Redpanda Console](https://github.com/redpanda-data/console)) that a new ACL is created. You will see that the ACL principal is of the form `User:123456`
5. Run `terraform apply` again, expecting to get no changes. Instead, I am asked to create the ACL again:

Terraform will perform the following actions:

kafka_acl.write_topic_acl["testTracker"] will be created

Plan: 1 to add, 0 to change, 0 to destroy.


I think this is because the Kafka API does not yield the same ACL that it created since it modifies the format of the principal id.
Note that I did not attempt to re-create the ACL after Terraform showed its plan to create one.
lkcId used for this test: `lkc-3rz5wm`
I hope this helps.
Please let me know if you need any additional information.
linouk23 commented 1 year ago

@ilian thanks for attaching this list!

Could you confirm you're able to observe the same issue when using confluent_kafka_acl resource too?

The thing is Mongey's TF Provider uses Kafka Admin Client whereas TF Provider for Confluent uses Kafka REST API which is a little bit different so I would recommend using ours for everything instead of avoid any compatibility issues.

ilian commented 1 year ago

@linouk23 Thanks for your quick reply. I haven't tried using the confluent_kafka_acl yet, but I would imagine that this would work since the official example uses the User:sa-xxxxxx format for the ACL principal.

As we have to deploy our topics to Kafka clusters that aren't necessarily managed by Confluent, we would like to avoid having to rewrite our modules just to support deploying our ACLs to Confluent Cloud. Is there anything planned in the near future to make the Kafka Admin Client receive the proper sa-xxxxxx principal ID format when querying ACLs to make the migration to Confluent Cloud easier for us? I can imagine that a fix for this would also make it easier to figure out which service account is associated with a given ACL principal when using third party tooling like Redpanda Console.

linouk23 commented 1 year ago

That's a great question! Could you expand about your use case:

As we have to deploy our topics to Kafka clusters that aren't necessarily managed by Confluent, we would like to avoid having to rewrite our modules just to support deploying our ACLs to Confluent Cloud.

As it is right now, I'd probably have to say that our TF Provider is mostly suitable for managing Kafka clusters with REST API enabled since there was a design decision to use REST API and not Kafka Admin Client (the idea is REST API uses https and has got a ton of nice features).

linouk23 commented 1 year ago

cc @ilian

ilian commented 1 year ago

That's a great question! Could you expand about your use case:

As we have to deploy our topics to Kafka clusters that aren't necessarily managed by Confluent, we would like to avoid having to rewrite our modules just to support deploying our ACLs to Confluent Cloud.

We currently have an internal module developed that uses the vendor-neutral API of Kafka by using kafka_acl resource of Mongey's Terraform provider. This module communicates directly to the Kafka broker without any SaaS-specific APIs. We would like to avoid having to maintain a separate module to set up the ACLs for Confluent Cloud to keep our module structure simple on our end. The only reason why we would need to rewrite our configuration is the fact that there is a difference in the principal id format when writing and reading ACLs. It would be very helpful if the service account id format and principal id format are consistent. This could be achieved by implementing either of the following:

  1. The Terraform provider would expose the numeric id of the service account (the same format when querying via the AdminClient API)
  2. Let Confluent Cloud's AdminClient API expose the principal id with format User:sa-xxxxxx.

    I can imagine that this would also be valuable for new customers looking to migrate from other Kafka services to Confluent Cloud. Please let me know if either option would be implemented in the near future.

As it is right now, I'd probably have to say that our TF Provider is mostly suitable for managing Kafka clusters with REST API enabled since there was a design decision to use REST API and not Kafka Admin Client (the idea is REST API uses https and has got a ton of nice features).

Yes, I understand that Confluent's Terraform provider provides additional features by calling the REST API. We are currently interested in migrating our current configuration that only depends on the AdminClient API.

linouk23 commented 1 year ago

Thanks for sharing all these details!

The Terraform provider would expose the numeric id of the service account (the same format when querying via the AdminClient API)

Let Confluent Cloud's AdminClient API expose the principal id with format User:sa-xxxxxx.

This might be great long term solutions but I'm not sure whether they'll be added in the near future.

linouk23 commented 1 year ago

I can imagine that this would also be valuable for new customers looking to migrate from other Kafka services to Confluent Cloud.

That's a great idea! We might add a separate script that'd be able to generate a TF config based on existing Kafka deployment. It'd be use Kafka Admin Client to fetch ACLs / topics and then generate a TF config based on that data.