Closed: komal-SkyNET closed this issue 2 years ago.
@komal-SkyNET thanks for reporting the issue!
We're thinking about supporting a configuration where these attributes are moved to a provider block (alongside the credentials block), for example:
provider "confluent" {
cloud_api_key = var.confluent_cloud_api_key
cloud_api_secret = var.confluent_cloud_api_secret
kafka_http_endpoint = ...
kafka_api_key = ...
kafka_api_secret = ...
}
such that a user won't need to provide http_endpoint and credentials for confluent_kafka_topic, confluent_kafka_acl:
# before
resource "confluent_kafka_topic" "orders" {
  kafka_cluster {
    id = confluent_kafka_cluster.basic-cluster.id
  }
  topic_name       = "orders"
  partitions_count = 4
  http_endpoint    = confluent_kafka_cluster.basic-cluster.http_endpoint
  config = {
    "cleanup.policy"    = "compact"
    "max.message.bytes" = "12345"
    "retention.ms"      = "67890"
  }
  credentials {
    key    = confluent_api_key.app-manager-kafka-api-key.id
    secret = confluent_api_key.app-manager-kafka-api-key.secret
  }
}

# after
resource "confluent_kafka_topic" "orders" {
  kafka_cluster {
    id = confluent_kafka_cluster.basic-cluster.id
  }
  topic_name       = "orders"
  partitions_count = 4
  config = {
    "cleanup.policy"    = "compact"
    "max.message.bytes" = "12345"
    "retention.ms"      = "67890"
  }
}
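For completeness, a hedged sketch of what the same simplification might look like for confluent_kafka_acl under this proposal; the ACL values (principal, operation, etc.) and the service account reference are illustrative only:

# after (sketch): confluent_kafka_acl without http_endpoint / credentials
resource "confluent_kafka_acl" "app-consumer-read-orders" {
  kafka_cluster {
    id = confluent_kafka_cluster.basic-cluster.id
  }
  resource_type = "TOPIC"
  resource_name = confluent_kafka_topic.orders.topic_name
  pattern_type  = "LITERAL"
  principal     = "User:${confluent_service_account.app-consumer.id}"
  host          = "*"
  operation     = "READ"
  permission    = "ALLOW"
}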
Do you think it'd be useful?
@linouk23 as you have just introduced a fix to make them available as they are, and customers are now using the code as it is, moving them would be a breaking change. Could I suggest that we have a period of stability, concentrating on other issues (e.g. schema registry), before changing what does now work (but may not have rolled out to every region yet)?
@Marcus-James-Adams no worries, we'll continue to support current configurations too!
@linouk23 Moving these keys to the provider block definitely makes sense. With respect to the naming, my suggestion would be to call kafka_api_key and kafka_api_secret something different, perhaps admin_api_key or manager_api_key, to give them more context and distinction.
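As a purely hypothetical sketch of that naming suggestion (these attribute names do not exist in the provider today; they only illustrate the proposed rename):

provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key
  cloud_api_secret = var.confluent_cloud_api_secret

  kafka_http_endpoint = var.kafka_http_endpoint
  # hypothetical renames of the proposed kafka_api_key / kafka_api_secret
  manager_api_key    = var.kafka_manager_api_key
  manager_api_secret = var.kafka_manager_api_secret
}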
The main issue brought up here, however, is the bug wherein resources incorrectly expect http_endpoint as a mandatory parameter, e.g. the confluent_kafka_acl resource.
With respect to the code block refactoring, I agree with @Marcus-James-Adams that features such as support for schema registry are a higher priority (we need it too). But this should not discount the defect mentioned in point 2.
@komal-SkyNET thanks for the comment!
We're also considering similar attributes for other components (e.g., sr_api_key / ksql_api_key), so admin_api_key might not have great context for new users. Let me know if that makes sense!
Thanks.
Perhaps a prefix then (admin or master or manager to reflect the higher privilege), e.g. admin_sr_api_key. Otherwise, it's a bit ambiguous what keys these are referring to, since a user could have multiple api_key resources in code for these very same objects.
I see; well, theoretically, a user could pass any kind of Kafka API Key there -- even one without any kind of privileges -- so calling it an admin might be a little bit ambitious.
@linouk23 Right, my understanding was that these were intended to be privileged keys. Quite sure I read something along those lines in the doco too (will try to find that). The way I understood it: a user could create multiple confluent_service_account and confluent_api_key resources (say at DeveloperRead level) in their TF code but would ideally supply the higher-privileged key for creating resources such as acls, service_accounts, etc.
@komal-SkyNET thanks for reporting the issue!
We're thinking about supporting a configuration where these attributes are moved to a provider block (alongside the credentials block), for example:
provider "confluent" {
  cloud_api_key       = var.confluent_cloud_api_key
  cloud_api_secret    = var.confluent_cloud_api_secret
  kafka_http_endpoint = ...
  kafka_api_key       = ...
  kafka_api_secret    = ...
}
This, I think, introduces a loop when someone wants to create a cluster, admin, and topic in one Terraform module, since you cannot have kafka_api_key defined until you create the cluster, the admin user, and the key. I think a clearer option would be to go the way Databricks did with two different providers:
provider "confluent" {
alias = "kafka"
kafka_http_endpoint = confluent_kafka_cluster_v2.standard.rest_endpoint
kafka_api_key = confluent_api_key_v2.app-manager-kafka-api-key.id
kafka_api_secret = confluent_api_key_v2.app-manager-kafka-api-key.secret
}
Here cloud_* would be optional, and you could have a separate Terraform module that does not have access to cloud_* credentials and still works. The example above will not work as written, because Terraform cannot evaluate those references at the time it instantiates the provider, but those are the values you'd want. You then use that provider for Kafka resources:
resource "confluent_kafka_topic" "orders" {
provider = confluent.kafka
kafka_cluster {
id = confluent_kafka_cluster.basic-cluster.id
}
topic_name = "orders"
partitions_count = 4
config = {
"cleanup.policy" = "compact"
"max.message.bytes" = "12345"
"retention.ms" = "67890"
}
}
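For instance (a minimal sketch; the module name, layout, and input variable are hypothetical), the Kafka-only resources could live in a child module that never receives the cloud_* credentials, with the aliased provider passed in explicitly:

# Hypothetical child module that manages only cluster-level resources
# (topics, ACLs) and therefore never needs cloud_* credentials.
module "kafka_resources" {
  source = "./modules/kafka-resources"

  providers = {
    confluent = confluent.kafka
  }

  cluster_id = confluent_kafka_cluster.basic-cluster.id
}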
Do you think it'd be useful?
@muresan thanks for such a detailed reply!
Could you elaborate on
"I think a clearer option would be to go the way Databricks did with two different providers"
I can find just one Databricks provider: https://registry.terraform.io/namespaces/databrickslabs
Regarding
provider "confluent" {
alias = "kafka"
kafka_http_endpoint = confluent_kafka_cluster_v2.standard.rest_endpoint
kafka_api_key = confluent_api_key_v2.app-manager-kafka-api-key.id
kafka_api_secret = confluent_api_key_v2.app-manager-kafka-api-key.secret
}
that's actually a really neat idea. It might not work right now because of some technical limitations: we do a hacky conversion for confluent_kafka_acl_v3.principal service account IDs to the integer service account IDs that our Kafka REST API supports, by calling the Cloud API (so a Cloud API Key is required for now).
We did make a design decision to have a unified TF provider (so that users won't have to fix versioning issues), but having a oneOf for (Kafka creds / Cloud creds) sounds very reasonable!
This I think introduces a loop when someone wants to create a cluster, admin and topic in one terraform module as you cannot have the kafka_api_key defined until you create the cluster and the admin user and the key.
What do you think about our newest examples:
@muresan thanks for such a detailed reply!
Could you elaborate on
"I think a clearer option would be to go the way Databricks did with two different providers"
I can find just one Databricks provider: https://registry.terraform.io/namespaces/databrickslabs
Yes, that's the one. If you look in any of the examples you can see that the databricks_mws_* resources have provider = databricks.mws set. Similar to how you offer an API to manage Kafka clusters and another API to manage One Kafka cluster, Databricks have one API to manage workspaces and another API to manage One workspace. Multiple "personalities" on the same provider would help by avoiding a split into multiple providers (for example, Elastic Cloud has ec to manage Elastic Cloud Elasticsearch clusters and elasticstack to manage individual clusters).
What do you think about our newest examples:
Those examples fit cases with two teams; there are cases with three tiers as well: networking, backing services/Kafka admin, and application/product team. I know it's more complicated, but think also about the lifecycle of the various resources:
- longest living: VPC peering / environments
- next in order: Kafka cluster / admin credentials
- next in order: Kafka topics / credentials
I think in enterprise most of the time there will be a networking team, a services team and an application team that each need specific access to parts of this setup. The networking team provisions the VPC resources (in Confluent and other clouds) and peerings, the services team brings up the Kafka cluster and manages it and creates accounts and ACLs, and the application team connects to the cluster and manages resources from the cluster (topics, cluster accounts, auth, etc.). And this also means 3 different deploy pipelines and life-cycles with 3 different access levels. Whatever can be done to support such a configuration would be best.
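A rough sketch of what that split might look like, with one provider configuration per team's workspace (variable names are illustrative, and the application-team configuration assumes the Kafka-creds-only / oneOf behaviour discussed above):

# networking/main.tf -- Cloud credentials scoped to a network-admin service account
provider "confluent" {
  cloud_api_key    = var.networking_cloud_api_key
  cloud_api_secret = var.networking_cloud_api_secret
}

# services/main.tf -- Cloud credentials scoped to an environment-admin service account
provider "confluent" {
  cloud_api_key    = var.services_cloud_api_key
  cloud_api_secret = var.services_cloud_api_secret
}

# application/main.tf -- cluster-level credentials only, no cloud_* keys at all
provider "confluent" {
  kafka_rest_endpoint = var.kafka_rest_endpoint
  kafka_api_key       = var.app_kafka_api_key
  kafka_api_secret    = var.app_kafka_api_secret
}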
@muresan
Similar to how you offer an API to manage Kafka clusters and another API to manage One Kafka cluster, Databricks have one API to manage workspaces and another API to manage One workspace.
That's actually super insightful! I wonder what's more convenient:
# DB API #1 (one workspace)?
resource "databricks_mws_private_access_settings" "pas" {
  provider                     = databricks.mws
  account_id                   = var.databricks_account_id
  private_access_settings_name = "Private Access Settings for ${local.prefix}"
  region                       = var.region
  public_access_enabled        = true
}

# DB API #2 (many workspaces)?
resource "databricks_cluster" "shared_autoscaling" {
  cluster_name            = "Shared Autoscaling"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 50
  }
}

vs

provider "confluent" {
  alias               = "one"
  kafka_rest_endpoint = "..."
  kafka_api_key       = "..."
  kafka_api_secret    = "..."
}

resource "confluent_kafka_topic" "orders" {
  provider = confluent.one
  kafka_cluster {
    id = confluent_kafka_cluster.basic-cluster.id
  }
  topic_name       = "orders"
  partitions_count = 4
  config = {
    "cleanup.policy"    = "compact"
    "max.message.bytes" = "12345"
    "retention.ms"      = "67890"
  }
}

# and

provider "confluent" {
  alias            = "many"
  cloud_api_key    = "..."
  cloud_api_secret = "..."
}

resource "confluent_environment_v2" "prod" {
  provider     = confluent.many
  display_name = "Production"
}
Multiple "personalities" on the same provider would help by avoiding to split in multiple providers (like for example Elastic Cloud has ec to manage Elastic Cloud Elasticsearch clusters and elasticstack to manage individual clusters.)
I don't think we'll follow their path (the biggest reason was multiple customers mentioned potential versioning issues) but is that something you'd prefer over aliasing?
I think in enterprise most of the time there will be a networking team, a services team and an application team that each need specific access to parts of this setup.
That's super helpful to hear! We'll try to come up with a few more examples for a setup with 3 teams that would feature the NetworkAdmin role.
Those examples fit cases with two teams; there are cases with three tiers as well: networking, backing services/Kafka admin, and application/product team. I know it's more complicated, but think also about the lifecycle of the various resources: longest living: VPC peering / environments; next in order: Kafka cluster / admin credentials; next in order: Kafka topics / credentials. I think in enterprise most of the time there will be a networking team, a services team and an application team that each need specific access to parts of this setup. The networking team provisions the VPC resources (in Confluent and other clouds) and peerings, the services team brings up the Kafka cluster and manages it and creates accounts and ACLs, and the application team connects to the cluster and manages resources from the cluster (topics, cluster accounts, auth, etc.). And this also means 3 different deploy pipelines and life-cycles with 3 different access levels. Whatever can be done to support such a configuration would be best.
@muresan by the way, could you list the expected roles for each team? For example, it looks like:
- Kafka Ops team (manage environments -- OrganizationAdmin role)
- Networking team (manage networks across all environments -- NetworkAdmin role)
- Services team (manage a specific environment -- EnvironmentAdmin role)
- Application team (manage a specific cluster -- CloudClusterAdmin role)
In my example Kafka Ops and Services are one team: they engage with the vendor (Confluent), manage the organization, approve Kafka cluster creation, etc., then pass on credentials to the Application/product team and the Networking team. The Networking team only cares about the connectivity between the services, in our specific case GCP peering/routes/DNS (not needed for GCP since private service connect is not supported). I think 4 teams is not realistic.
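As a hedged sketch of how that role split could be wired up with confluent_role_binding resources (the service account, environment, and cluster references are hypothetical):

# Services team: EnvironmentAdmin on its environment
resource "confluent_role_binding" "services-env-admin" {
  principal   = "User:${confluent_service_account.services-team.id}"
  role_name   = "EnvironmentAdmin"
  crn_pattern = confluent_environment.staging.resource_name
}

# Application team: CloudClusterAdmin on its cluster
resource "confluent_role_binding" "app-cluster-admin" {
  principal   = "User:${confluent_service_account.app-team.id}"
  role_name   = "CloudClusterAdmin"
  crn_pattern = confluent_kafka_cluster.basic-cluster.rbac_crn
}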
@muresan
Similar to how you offer an API to manage Kafka clusters and another API to manage One Kafka cluster, Databricks have one API to manage workspaces and another API to manage One workspace.
That's actually super insightful! I wonder what's more convenient:
It's one option, and in the end it depends on how easy it is to implement and maintain, and how well Terraform supports this.
Multiple "personalities" on the same provider would help by avoiding to split in multiple providers (like for example Elastic Cloud has ec to manage Elastic Cloud Elasticsearch clusters and elasticstack to manage individual clusters.)
I don't think we'll follow their path (the biggest reason was multiple customers mentioned potential versioning issues) but is that something you'd prefer over aliasing?
I don't have a specific preference; it's up to Confluent to support the provider. If there is a single API endpoint for both Confluent resources and Kafka resources, then it makes sense to go the single-provider route with "multiple personalities".
I think it makes sense for Elastic to split because you can use the elasticstack provider to manage an Elasticsearch instance which is not running in Elastic Cloud; you can manage an on-premises instance. I did not go deep enough into the Confluent docs to find out whether this is the case with Confluent Cloud Kafka clusters versus non-cloud Confluent Kafka clusters or even pure Apache Kafka clusters.
@komal-SkyNET speaking about the original issue, we continued the conversation in https://github.com/confluentinc/terraform-provider-confluent/issues/14#issuecomment-1157141796
closing the issue since our backend team rolled out a fix for
│ Error: error fetching Kafka Cluster "abc-cluster"'s "http_endpoint" attribute: http_endpoint is nil or empty for Kafka Cluster "abc-cluster"
│   8: resource "confluent_api_key" "cluster-api-key" {
Also check out our latest 1.0.0 release: the Confluent Terraform Provider is now Generally Available and recommended for use in production workflows.
Now the bug is fixed and rolling through the regions, why break what's working?
At the moment I can create clusters, accounts, and topics in a single layer of code with no complexity.
This provider has been a long time coming, but now it's here it's working well.
Don't change a thing on this side of things; finish rolling the fix for the http bug through the regions.
@Marcus-James-Adams sure thing, we don't want to break anything!
For context, we did add support for both scenarios:
# Option #1 when managing multiple clusters in the same Terraform workspace
provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key    # optionally use CONFLUENT_CLOUD_API_KEY env var
  cloud_api_secret = var.confluent_cloud_api_secret # optionally use CONFLUENT_CLOUD_API_SECRET env var
}

# Option #2 when managing a single cluster in the same Terraform workspace
# See https://github.com/confluentinc/terraform-provider-confluent/tree/master/examples/configurations/managing-single-cluster for more details
provider "confluent" {
  cloud_api_key       = var.confluent_cloud_api_key    # optionally use CONFLUENT_CLOUD_API_KEY env var
  cloud_api_secret    = var.confluent_cloud_api_secret # optionally use CONFLUENT_CLOUD_API_SECRET env var
  kafka_rest_endpoint = var.kafka_rest_endpoint        # optionally use KAFKA_REST_ENDPOINT env var
  kafka_api_key       = var.kafka_api_key              # optionally use KAFKA_API_KEY env var
  kafka_api_secret    = var.kafka_api_secret           # optionally use KAFKA_API_SECRET env var
}
as described in our docs.
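And since these are ordinary provider configurations, both options could even coexist in one workspace via Terraform's standard provider aliasing (a sketch; variable names are illustrative):

# Cloud-level management (Option #1 style)
provider "confluent" {
  alias            = "cloud"
  cloud_api_key    = var.confluent_cloud_api_key
  cloud_api_secret = var.confluent_cloud_api_secret
}

# Cluster-level management (Option #2 style) for one specific cluster
provider "confluent" {
  alias               = "cluster"
  cloud_api_key       = var.confluent_cloud_api_key
  cloud_api_secret    = var.confluent_cloud_api_secret
  kafka_rest_endpoint = var.kafka_rest_endpoint
  kafka_api_key       = var.kafka_api_key
  kafka_api_secret    = var.kafka_api_secret
}

Each resource would then pick the configuration it needs via the standard provider meta-argument, e.g. provider = confluent.cloud or provider = confluent.cluster.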
The cluster's REST http_endpoint is incorrectly expected as a required variable for several resources, thereby deterring usability. Examples: confluent_api_key, confluent_kafka_acl.
From TF Docs:
From API Docs: