Closed bluedog13 closed 2 years ago
@bluedog13 what version of the Provider are you using?
Can you also share the format of this Service Account ID:
principal = "User:${var.mynamespace-myproducerapp-sa_id}"
Is it a series of numbers, or is it SA-1234 format?
Can you also try removing this line and seeing if it will work? host = "*"
Thank you for the quick reply.
Below are the details of the provider I am using
terraform {
  required_providers {
    confluentcloud = {
      source  = "confluentinc/confluentcloud"
      version = "0.5.0"
    }
  }
}
The principal is of the format (this is a change from v0.3.0)
+ principal = "User:sa-yo8***"
I still get the 401 error after making the below changes
Took off the host part
#host = "*"
Can you email me your Org ID at sshumway@confluent.io. Instructions for finding your Org ID here: https://docs.confluent.io/cloud/current/client-apps/cloud-basics.html#view-organization-bill-and-id
Also, can you follow these steps to see if the TF logs have any useful info? Email me what you find there as well.
Enable debug logging in Terraform. Debug logging will give you detailed error messages that you can use to investigate issues. Use these commands to enable debug logging:
export TF_LOG_PATH="./full_log.txt"
export TF_LOG=debug
terraform apply --auto-approve
You should then open the log file at the TF_LOG_PATH path to see the full logs.
I have sent the Org ID to the above email. I'll share the logs once I set them up. Thanks.
PS: I am using a "basic" cluster to create the topics and ACLs. This worked in v0.3.0 previously.
@bluedog13 just to double check, could you share the role / ACLs of the service account that owns the corresponding Kafka API Key (I know that you'd likely see a 403 error instead of 401 if that's the case, but still):
# ACLs:
confluent kafka acl list --service-account sa-1a2b3c
# Roles:
# https://confluent.cloud/settings/org/assignments
The service account was created using "confluentcloud_service_account" resource and has been manually assigned a key/secret that is restricted to the environment:cluster the topic resides in.
# --resource is the topic's cluster; --environment is the topic's environment
$ ccloud api-key create \
    --resource lkc-**** \
    --environment env-**** \
    --service-account sa-yo8***
The ACL details are as follows.
$ ccloud kafka acl list --service-account sa-yo8***j
UserId | ServiceAccountId | Permission | Operation | Resource | Name | Type
---------+------------------+------------+-----------+----------+------+-------
The generated API key details are as follows
$ ccloud api-key list -o json | jq '.[] | select(.key == "7ZXHHAPNE*********")'
{
"created": "2022-03-**T02:03:56Z",
"description": "",
"key": "7ZXHHAPNE*******",
"owner": "561****",
"owner_email": "<service account>",
"owner_resource_id": "sa-yo8***",
"resource_id": "lkc-****",
"resource_type": "kafka"
}
The service account details are as follows
$ ccloud service-account list -o json | jq '.[] | select(.resource_id == "sa-yo8**")'
{
"description": "Service account for `mynamespace` namespace and `myapp` application",
"id": "561***",
"name": "mynamespace-myproducerapp-sa",
"resource_id": "sa-yo8***"
}
Thanks for a quick reply and attaching all the commands you run!
Could you also open https://confluent.cloud/settings/org/assignments and confirm that the sa-yo8** SA has a CloudClusterAdmin, EnvironmentAdmin, or OrganizationAdmin role in the target environment?
The idea is that a Kafka API Key inherits the permissions of its owner ("sa-y****"). It seems the owner has no ACLs attached, and it might also have no role assigned in the target environment, so the 401 here essentially means 403.
For reference, here's our sample configuration file. You can see there's an app-manager service account with the CloudClusterAdmin role that creates topics & ACLs.
Luckily, we'll release an api_key resource with corresponding examples soon, so this problem will disappear.
The account "sa-yo8**" has no role assigned in the target environment.
The question I have now is this: I created the service account using the configuration below. Comparing with the shared configuration file, the part I missed is the "confluentcloud_role_binding" for the service account. Could this be my issue?
resource "confluentcloud_service_account" "mynamespace-myproducerapp-sa" {
  display_name = var.mynamespace-myproducerapp-sa
  description  = "Service account for `mynamespace` namespace and `myapp` application"
}
Does this mean it's not sufficient to create a service account, and one also has to assign it a role on the required cluster before granting ACLs? (Is this a change from v0.3.0?)
Also, if I need to use the service account across all environments (dev/test/uat/prod), do I need to assign multiple role bindings to the service account, one for each environment?
Lastly, I am a bit confused about the purpose of the API key/secret assigned to the service account, because it does not seem to help if the service account does not have the right role binding, even though the key/secret was assigned to a given environment and cluster. Service accounts are not environment specific from what I understand.
Thanks.
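For reference, the role binding being discussed might look roughly like the sketch below. It is only an illustration, not the configuration from the linked sample: it assumes the `confluentcloud_role_binding` resource of provider v0.5.0, and the cluster reference `confluentcloud_kafka_cluster.basic-cluster` (with its `rbac_crn` attribute) is a hypothetical name standing in for whatever cluster resource the workspace defines.

```hcl
# Sketch only. "basic-cluster" is a hypothetical resource name; the service
# account resource is the one defined above.
resource "confluentcloud_role_binding" "myproducerapp-cluster-admin" {
  # Role bindings reference the service account by its resource ID (sa-...).
  principal   = "User:${confluentcloud_service_account.mynamespace-myproducerapp-sa.id}"
  role_name   = "CloudClusterAdmin"
  # The CRN pattern scopes the binding to a single cluster.
  crn_pattern = confluentcloud_kafka_cluster.basic-cluster.rbac_crn
}
```

Because the binding is scoped by `crn_pattern`, a binding like this would cover only the one cluster it names.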
Could this be my issue?
It's a great idea to create that role binding and try to recreate the ACLs that were failing with 401 before. If that does fix the 401 -- let us know and we'll answer your other questions from the comment once we figure out the exact issue.
I was able to resolve this without adding the role-binding part to the service account.
The resolution: the cloud key/secret is required somewhere (not sure where), even though we specify the "cluster_api_key" and "cluster_api_secret" while provisioning the topic and ACLs. In my setup, I have a separate folder to create topics (using workspaces for different environments).
Adding the api_key and api_secret in the below 2 lines fixed it
provider "confluentcloud" {
  api_key    = var.cloud_api_key
  api_secret = var.cloud_api_secret
}
Are the cloud keys required to map to the org or what's their purpose during provisioning topics and ACL's in a given cluster?
Thanks.
That's great to hear!
Are the cloud keys required to map to the org or what's their purpose during provisioning topics and ACL's in a given cluster?
They are required for ACLs. The reason is that the TF Provider does a service account ID conversion behind the scenes (sa-abc123 -> 56789) by using an API that requires a token generated from a Cloud API Key.
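To make the two kinds of credentials concrete, here is a minimal sketch of how they might sit side by side in a topics/ACLs workspace. It assumes the v0.5.0 `confluentcloud_kafka_acl` schema; `var.cluster_id`, `var.cluster_http_endpoint`, and `var.topic_name` are hypothetical variables introduced for illustration, and the `WRITE` operation is just an example.

```hcl
# Sketch only: variable names other than those quoted in this thread are
# illustrative assumptions.
provider "confluentcloud" {
  # Cloud API Key: needed so the provider can resolve sa-... to the integer ID.
  api_key    = var.cloud_api_key
  api_secret = var.cloud_api_secret
}

resource "confluentcloud_kafka_acl" "producer-write" {
  kafka_cluster = var.cluster_id            # hypothetical, e.g. an lkc-... ID
  resource_type = "TOPIC"
  resource_name = var.topic_name            # hypothetical
  pattern_type  = "LITERAL"
  principal     = "User:${var.mynamespace-myproducerapp-sa_id}"
  host          = "*"
  operation     = "WRITE"
  permission    = "ALLOW"
  http_endpoint = var.cluster_http_endpoint # hypothetical

  # Kafka (cluster) API Key: used for the ACL request against the cluster.
  credentials {
    key    = var.cluster_api_key
    secret = var.cluster_api_secret
  }
}
```

With only the `credentials` block set and no Cloud API Key on the provider, the ID lookup described above has no token to use, which would be consistent with the 401 seen in this thread.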
Thanks for the clarification. I believe this is the change in v0.4.0/v0.5.0 where "sa-****" format is being used for ACL and not integer values like in earlier versions.
This also makes sense as to why it worked in v0.3.0 and earlier without cloud key.
Thanks for figuring it out on your own @bluedog13, great job!
Nice work @bluedog13! Thanks for helping @linouk23!
@bluedog13 would you be willing to share more about how you structure your TF workspace? I'm wondering if we will need to call this out somewhere in our docs for people not doing monolithic deployments.
Sure, I can share the structure. I am also using workspaces to map to different environments for clusters and topics. Service accounts are tricky since they are not tied to an environment.
.
├── README.md
└── resources
    ├── 1_kafka-cluster
    │   ├── README.md
    │   ├── clusters.tf
    │   ├── images
    │   │   └── cluster-provisioning.png
    │   ├── main.tf
    │   ├── non-prod
    │   │   └── terraform.tfvars
    │   ├── outputs.tf
    │   ├── prod
    │   │   └── terraform.tfvars
    │   ├── sandbox
    │   │   └── terraform.tfvars
    │   ├── terraform.tfstate
    │   ├── terraform.tfstate.d
    │   │   ├── nonprod
    │   │   ├── prod
    │   │   └── sandbox
    │   │       ├── terraform.tfstate
    │   │       └── terraform.tfstate.backup
    │   ├── tfplan_create_cluster_sandbox
    │   └── variables.tf
    ├── 2_service-accounts
    │   ├── 0.example
    │   │   ├── accounts.tf
    │   │   ├── main.tf
    │   │   ├── outputs.tf
    │   │   ├── terraform.tfvars
    │   │   └── variables.tf
    │   ├── README.md
    │   ├── namespace1
    │   │   ├── main.tf
    │   │   ├── outputs.tf
    │   │   ├── accounts.tf
    │   │   ├── terraform.tfvars
    │   │   └── variables.tf
    │   ├── images
    │   │   └── service-accounts-provisioning.png
    │   ├── namespace2
    │   │   ├── main.tf
    │   │   ├── outputs.tf
    │   │   ├── accounts.tf
    │   │   ├── terraform copy.tfvars
    │   │   ├── terraform.tfvars
    │   │   └── variables.tf
    │   ├── sce
    │   └── vdi
    └── 3_kafka-topics
        ├── README.md
        ├── dev
        │   └── terraform.tfvars
        ├── full_log.txt
        ├── images
        │   └── topic-provisioning.png
        ├── main.tf
        ├── outputs.tf
        ├── prod
        │   └── terraform.tfvars
        ├── terraform.tfstate
        ├── terraform.tfstate.d
        │   ├── dev
        │   │   ├── terraform.tfstate
        │   │   └── terraform.tfstate.backup
        │   ├── prod
        │   ├── test
        │   │   ├── terraform.tfstate
        │   │   └── terraform.tfstate.backup
        │   └── uat
        │       └── terraform.tfstate
        ├── test
        │   └── terraform.tfvars
        ├── tfplan_create_topic_sandbox_dev
        ├── tfplan_create_topic_sandbox_test
        ├── tfplan_create_topic_sandbox_uat
        ├── topic-sample.tf
        ├── uat
        │   └── terraform.tfvars
        ├── variables-sample-topic.tf
        └── variables.tf
However when I try to add ACL to the topic, for the created service account using the same cluster key, I get the below error
Below is the setup
What is causing the "401" error during the creation of ACL when the cluster is the same and the topic was provisioned using the same cluster keys?
Much appreciated