Open DanaLacoste opened 3 years ago
Added as a feature request because I totally get what's happening here: Terraform is storing the value in the state file (can be in outputs, too, after all, if you add that) and AWS returns a different value every time.
So what I am looking for is an enhancement to, well, silence that output, if possible.
Ran across the same issue last week. It seems AWS returns 3 brokers at random (from each AZ) when calling get-bootstrap-brokers. This is problematic when the cluster has multiple nodes/brokers in each AZ, as the values change between plans/applies. In our case we have a 6-broker cluster across three AZs, so 2 brokers in each AZ.
In addition to ignoring the changes, could we consider providing an attribute for the broker hostnames so we can list all node endpoints?
It looks like the list-nodes CLI call achieves this. The underlying REST endpoint is:
https://docs.aws.amazon.com/msk/1.0/apireference/clusters-clusterarn-nodes.html
In our case we want to fetch all the broker hostnames so we can set up Consul entries for each.
Additional Info:
# Initial check of Brokers
aws kafka get-bootstrap-brokers --cluster-arn <redacted>
{
"BootstrapBrokerString": "b-6.<redacted>.kafka.eu-west-1.amazonaws.com:9092,b-2.<redacted>.eu-west-1.amazonaws.com:9092,b-4<redacted>eu-west-1.amazonaws.com:9092",
"BootstrapBrokerStringTls": "b-6<redacted>.eu-west-1.amazonaws.com:9094,b-2.<redacted>.eu-west-1.amazonaws.com:9094,b-4.<redacted>.eu-west-1.amazonaws.com:9094"
}
# Second check (30 seconds later):
aws kafka get-bootstrap-brokers --cluster-arn <redacted>
{
"BootstrapBrokerString": "b-1.<redacted>.eu-west-1.amazonaws.com:9092,b-5.<redacted>.eu-west-1.amazonaws.com:9092,b-3.<redacted>.eu-west-1.amazonaws.com:9092",
"BootstrapBrokerStringTls": "b-1.<redacted>.eu-west-1.amazonaws.com:9094,b-5.<redacted>.eu-west-1.amazonaws.com:9094,b-3.<redacted>.eu-west-1.amazonaws.com:9094"
}
It appears this is (partially) documented behavior. Quoting AWS Docs:
A list of brokers that a client can use to bootstrap. This list doesn't necessarily include all of the brokers in the cluster.
See: https://docs.aws.amazon.com/msk/1.0/apireference/clusters-clusterarn-bootstrap-brokers.html
I can't speak to the need for the diff to not show up in an apply. However, this nondeterministic behavior does cause problems when one of these attributes (bootstrap_brokers*) is passed to another resource, e.g. aws_ssm_parameter. Applies often fail on our >3 node clusters with the following:
╷
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for aws_ssm_parameter.bootstrap_servers_param[0] to
│ include new values learned so far during apply, provider
│ "registry.terraform.io/hashicorp/aws" produced an invalid new value for
│ .value: inconsistent values for sensitive attribute.
│
│ This is a bug in the provider, which should be reported in the provider's
│ own issue tracker.
╵
Releasing state lock. This may take a few moments...
ERRO[0017] 1 error occurred:
* exit status 1
We have to hope that the API calls made to retrieve this value during the plan and apply process all return the same result.
Is there a known way to fix this situation? This seems to be more of an API issue, can it be flagged somewhere?
Ideally one should be able to get all the broker connection strings. In my setup I have 4 nodes across 2 AZs, and I always get 3 results back that shuffle randomly every time there is an apply. I run multiple connectors on MSK Connect, and the change in bootstrap_brokers causes the connectors to be replaced.
@DanaLacoste According to AWS, any broker carries the metadata required to act as a bootstrap node:
Any broker can be bootstrap server because every broker receives metadata information.
Furthermore, the API call made by the aws_msk_cluster Terraform resource returns an AWS-generated bootstrap string that changes between calls. It contains 3 brokers spread across the AZs in which the MSK cluster is deployed (unless only two brokers are available or fewer AZs are selected).
@lplazas
To avoid the ever-changing bootstrap broker string, you could build your own from all of the brokers (or a subset of them) using the aws_msk_broker_nodes Terraform data source, which wraps the list-nodes call and returns the nodes sorted by node ID, as @james-bjss was pointing out.
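# Lists every broker node in the cluster (not just the 3 bootstrap brokers).
data "aws_msk_broker_nodes" "nodes" {
  cluster_arn = aws_msk_cluster.kafka.arn
}

output "kafka_sasl_scram_connection_string" {
  description = "Connection host:port pairs"
  # Endpoints come back without ports; 9096 is the MSK SASL/SCRAM listener port.
  value = join(",", [for endpoint in flatten(data.aws_msk_broker_nodes.nodes.node_info_list[*].endpoints) : format("%s:9096", endpoint)])
}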
Then use the obtained output.
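For the aws_ssm_parameter case mentioned earlier in the thread, the same idea can be sketched roughly as below. This is only an illustration under assumptions: the local names, the resource name, and the parameter path are hypothetical, and it reuses the data "aws_msk_broker_nodes" "nodes" block from the workaround above. Because list-nodes returns every broker rather than a random subset, the value should be the same on every read, which should avoid the "inconsistent final plan" error.

```hcl
# Assumes the data "aws_msk_broker_nodes" "nodes" block shown above is
# declared in the same module. All other names here are hypothetical.
locals {
  # Every broker hostname, sorted so the joined string is stable between runs.
  broker_hosts = sort(flatten(data.aws_msk_broker_nodes.nodes.node_info_list[*].endpoints))

  # 9096 matches the SASL/SCRAM port used above; change it for other listeners.
  bootstrap_servers = join(",", [for host in local.broker_hosts : "${host}:9096"])
}

resource "aws_ssm_parameter" "bootstrap_servers" {
  name  = "/kafka/bootstrap_servers" # hypothetical parameter path
  type  = "String"
  value = local.bootstrap_servers
}
```

If you only want a subset of brokers in the string, slicing local.broker_hosts before the join keeps the result deterministic as long as the broker set itself does not change.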
Pfrr, still an open issue?
We'd appreciate having a reproducible output for this in our company, but the workaround above works.
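For the per-broker use case raised earlier (one Consul entry per broker), the same data source can also expose the hostnames keyed by broker ID. A minimal sketch, assuming the data "aws_msk_broker_nodes" "nodes" block from the workaround above; the output name is hypothetical:

```hcl
# Assumes the data "aws_msk_broker_nodes" "nodes" block from the workaround
# above. The output name is hypothetical.
output "kafka_broker_hostnames" {
  description = "Broker ID => sorted list of hostnames (no ports)"
  value = {
    for node in data.aws_msk_broker_nodes.nodes.node_info_list :
    tostring(node.broker_id) => sort(node.endpoints)
  }
}
```

Keying by broker ID rather than list position means that adding or removing a broker only changes that broker's entry.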
Description
NOTE: This is more than a cosmetic issue, but less than an "it breaks things" issue. Terraform is doing the right thing on apply, but it is outputting something extraneous which has caused confusion while trying to investigate an unrelated change.
When applying on an MSK cluster which has more nodes than availability zones, the diff will show that
Objects have changed outside of Terraform
and report that bootstrap_brokers_sasl_scram (or _iam, or ...) has changed, with a list of nodes. This is caused by the AWS SDK call returning a random list of nodes, one per AZ (so if you have one node per AZ, you will get the same list every time, but if you have multiple nodes in any AZ, then you cannot predict which nodes will be returned).
This request is to silence that output (somehow?). I have tried adding
ignore_changes = [bootstrap_brokers_sasl_scram]
to no effect: the change is already (correctly) being ignored as far as "should Terraform re-apply this resource to fix the change?" is concerned. After all, it is only identifying the property of the resource (as returned by the AWS API call) as changed.
NOTE 1: This is related to https://github.com/hashicorp/terraform-provider-aws/pull/17579 but is a different issue: this is not about the order of the results, but the content of the (ordered) results.
NOTE 2: This does not occur if you have number of nodes <= number of availability zones (i.e. it only happens if you set up a bigger cluster to handle a larger load.)
Current Output
Desired Output (reduced for clarity)
New or Affected Resource(s)
Potential Terraform Configuration
This is a portion of our config (the relevant part). I was hoping the lifecycle rule would fix the issue, but it did not.
References