Open MattPumphrey opened 4 years ago
Having a bit of trouble seeing what the actual issue is. Given the title, it seems to just sit there doing nothing?
Thats generally correct, on this one cluster we have 95 topics. When doing the plan, it connects and then is unable to close causing any plan to fail with the automation we are using, mainly Atlantis. But even when testing this, I can get through our Dev and Preprod environment, and this might be a broker issue. I am going to be testing this again on Friday to see if it has resolved that issue. But the issue that it couldnt close the connection or at least timeout after a certain time or error out causes an issue.
Looks like I have this same issue with ACLs. We have 455 ACL entries and the provider (v0.2.2 and v0.2.3) is consistently hanging when trying to run terraform plan
with terraform 0.12.16 or 0.12.21 on linux amd64 in a gitlab docker runner container (alpine-3.9 and alpine-3.11) as well as docker on macOS.
Meanwhile, I'm able to run the same plan successfully on native macOS with terraform 0.12.16 and terraform-provider-kafka v0.2.2.
In the alpine container, the output stops like this with trace enabled:
...many repetitions of the same lines for different ACL entries...
2020-02-27T05:33:10.217Z [DEBUG] plugin.terraform-provider-kafka_v0.2.2: 2020/02/27 05:33:10 [INFO] Found ACL &{{2 redacted 4} [0xc000676690 0xc0006766c0 0xc0006766f0 0xc000676720 0xc000676750 0xc000676780 0xc0006767b0 0xc0006767e0]}
module.redacted.module.redacted.kafka_acl.allow_topic_read: Refreshing state... [id=User:redacted|*|Read|Allow|Topic|redacted|Prefixed]
2020-02-27T05:33:10.219Z [DEBUG] plugin.terraform-provider-kafka_v0.2.2: 2020/02/27 05:33:10 [INFO] Reading ACL
2020-02-27T05:33:10.219Z [DEBUG] plugin.terraform-provider-kafka_v0.2.2: 2020/02/27 05:33:10 [INFO] Reading ACL User:redacted|*|Read|Allow|Topic|redacted|Prefixed
2020-02-27T05:33:10.219Z [DEBUG] plugin.terraform-provider-kafka_v0.2.2: 2020/02/27 05:33:10 [INFO] Listing all ACLS
Thanks @mmajis , I'll try get this resolved this weekend.
I wasn't able to reproduce this with a local simple single broker cluster, but it does reproduce with Confluent Cloud.
The initial terraform plan
with new resources introduced was successful as was the subsequent terraform apply
. But running a plan after that stalls.
Here's a trace log from a run of 500 ACLs as well as the ACL entries I used.
Looks like #103 fixes the ACL plan issue. Could not reproduce the stall anymore. Thanks @Mongey !
So I am in the process of Migrating all of our topics from our chef setup to use this provider, however I am having some issues with the closing of connections. We have over 500 topics in one environment, so via 3 environments this is going to get a bit sticky.
I had to turn Trace on to find it but here is an issue I am running into
kafka-log-path.log