Open thelateperseus opened 1 month ago
What SQS configuration did you apply? Did you use the CloudFormation template provided in our getting started guide? https://github.com/aws/karpenter-provider-aws/blob/main/website/content/en/v1.0/getting-started/getting-started-with-karpenter/cloudformation.yaml
We use Terraform as our IaC tool of choice, so I didn't use the CloudFormation template directly. However, I did translate the CloudFormation template into Terraform HCL. Below is the old configuration which did not work:
# SQS queue used to receive EC2 spot interruption notices
resource "aws_sqs_queue" "karpenter_interruption" {
  name                      = "karpenter-interruption-${aws_eks_cluster.test.name}"
  message_retention_seconds = 300
  sqs_managed_sse_enabled   = true
}

data "aws_iam_policy_document" "karpenter_interruption_queue" {
  statement {
    sid     = "SendEventsToQueue"
    actions = ["sqs:SendMessage"]
    principals {
      type = "Service"
      identifiers = [
        "events.amazonaws.com",
        "sqs.amazonaws.com",
      ]
    }
  }
  statement {
    sid       = "DenyHTTP"
    effect    = "Deny"
    actions   = ["sqs:*"]
    resources = [aws_sqs_queue.karpenter_interruption.arn]
    principals {
      type        = "AWS"
      identifiers = ["*"]
    }
    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = [false]
    }
  }
}

resource "aws_sqs_queue_policy" "karpenter_interruption" {
  queue_url = aws_sqs_queue.karpenter_interruption.id
  policy    = data.aws_iam_policy_document.karpenter_interruption_queue.json
}
Removing sqs_managed_sse_enabled on the aws_sqs_queue and adding kms_master_key_id with a customer-managed key resolved the issue (along with the appropriate IAM permissions as per the docs I linked above).
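For anyone following along, the change described above might look roughly like the sketch below. It is not the exact configuration from this thread. The key, the policy document names, and the hardcoded `aws` partition in the root ARN are assumptions made for illustration. The key policy grants EventBridge `kms:GenerateDataKey` and `kms:Decrypt`, which is what the EventBridge documentation asks for when the target queue uses a customer-managed key. The queue definition here is meant to replace the earlier one, not sit alongside it.

```hcl
# Sketch only: names other than aws_sqs_queue.karpenter_interruption and
# aws_eks_cluster.test are hypothetical and not taken from the thread.
data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "karpenter_interruption_kms" {
  # Keep the owning account able to administer the key.
  # Assumes the standard "aws" partition.
  statement {
    sid       = "EnableAccountAdministration"
    actions   = ["kms:*"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }

  # Allow EventBridge to encrypt the messages it delivers to the queue.
  statement {
    sid       = "AllowEventBridgeToUseKey"
    actions   = ["kms:GenerateDataKey", "kms:Decrypt"]
    resources = ["*"]
    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com"]
    }
  }
}

resource "aws_kms_key" "karpenter_interruption" {
  description = "Customer-managed key for the Karpenter interruption queue"
  policy      = data.aws_iam_policy_document.karpenter_interruption_kms.json
}

# Replaces the earlier queue definition: sqs_managed_sse_enabled is removed
# and kms_master_key_id points at the customer-managed key.
resource "aws_sqs_queue" "karpenter_interruption" {
  name                      = "karpenter-interruption-${aws_eks_cluster.test.name}"
  message_retention_seconds = 300
  kms_master_key_id         = aws_kms_key.karpenter_interruption.id
}

# Receiver side: permission for the consumer to decrypt messages.
# Attach this to whichever role the Karpenter controller uses.
data "aws_iam_policy_document" "karpenter_controller_kms" {
  statement {
    sid       = "AllowInterruptionQueueDecrypt"
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.karpenter_interruption.arn]
  }
}
```

The existing aws_sqs_queue_policy can stay as it is, since it only references the queue's id and ARN.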
Description
How can the docs be improved?
SQS queue permissions
The CloudFormation sample in the docs includes an SQS queue with an Amazon-managed KMS key for server-side encryption. However, I believe that the SQS queue needs to use a customer-managed KMS key instead.
From the "My events are not delivered to the target Amazon SQS queue" topic in the EventBridge Troubleshooting documentation:
The referenced Configuring AWS KMS permissions page also describes additional permissions for the receiver (Karpenter IAM role).
When using an Amazon-managed key, the FailedInvocations graph exactly matched the Invocations graph for my EventBridge rule. Likewise, the Number Of Messages Received graph for the SQS queue was always zero. After I switched to a customer-managed KMS key and updated the IAM permissions as documented, the FailedInvocations for the EventBridge rule disappeared, and the SQS Number Of Messages Received graph showed the expected message counts.
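To catch this kind of silent delivery failure earlier, one option is an alarm on the same FailedInvocations metric. The sketch below assumes the EventBridge rule name is passed in through a hypothetical `interruption_rule_name` variable, since the rule itself is not shown in this issue. The alarm name and thresholds are placeholders as well.

```hcl
# Hypothetical input: the name of the EventBridge rule that targets the queue.
variable "interruption_rule_name" {
  type        = string
  description = "Name of the EventBridge rule forwarding interruption events"
}

# Alarm whenever the rule reports any failed invocation in a 5-minute window.
resource "aws_cloudwatch_metric_alarm" "interruption_rule_failed_invocations" {
  alarm_name          = "karpenter-interruption-failed-invocations"
  namespace           = "AWS/Events"
  metric_name         = "FailedInvocations"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  treat_missing_data  = "notBreaching"

  dimensions = {
    RuleName = var.interruption_rule_name
  }
}
```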
Is this an error with Karpenter's CloudFormation example, or am I doing something wrong?