Open marceloboeira opened 1 year ago
Hello @marceloboeira, thanks for the detailed write-up. As you mentioned, this situation does not happen in the tests or in a single-node cluster. The tests are also a bit peculiar here, as none of them test an actual running service.
It is possible that the diff occurs after the Consul agent on the node running the service updates the check in Consul for the first time, which would be an async operation happening after the service is registered in the Consul catalog.
The diff is probably benign but we may be able to use a diff suppress function to hide the changes when this happens, if we can detect it reliably (we wouldn't want to hide actual changes by mistake).
I will run additional tests on my end. Can you please post the complete diff if it happens to you again? It would help to understand which attributes are changing.
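To make the diff suppress idea concrete, here is a minimal sketch of the decision logic in Go. It mirrors the shape of the Terraform SDK's `DiffSuppressFunc` (`k, old, new string`) but leaves out the `*schema.ResourceData` argument so that it runs with the standard library only. The function name, the `check.` key prefix, and the choice of `.status` as the agent-updated attribute are all assumptions made for illustration; the real list can only be filled in once the complete diff is known.

```go
package main

import (
	"fmt"
	"strings"
)

// agentManagedSuffixes lists attribute-key suffixes that this sketch ASSUMES
// the Consul agent may rewrite after registration. The real set is unknown
// until the complete diff has been posted.
var agentManagedSuffixes = []string{".status"}

// suppressAgentCheckDiff reports whether a planned change to the attribute at
// key should be hidden. oldValue is the value read back into state, newValue
// is the value from the configuration. The name and rules are hypothetical.
func suppressAgentCheckDiff(key, oldValue, newValue string) bool {
	// Only attributes inside a check block are candidates.
	if !strings.HasPrefix(key, "check.") {
		return false
	}
	// Suppress only when the configuration leaves the attribute unset and
	// the stored value is non-empty, so explicit user changes still show up.
	if newValue != "" || oldValue == "" {
		return false
	}
	for _, suffix := range agentManagedSuffixes {
		if strings.HasSuffix(key, suffix) {
			return true
		}
	}
	return false
}

func main() {
	// Value present in state, not set in the configuration.
	fmt.Println(suppressAgentCheckDiff("check.0.status", "passing", ""))
	// The user explicitly changed the value, so the diff must stay visible.
	fmt.Println(suppressAgentCheckDiff("check.0.status", "passing", "critical"))
}
```

The guard on `newValue` is the part that addresses the concern about hiding actual changes by mistake: anything set explicitly in the configuration is never suppressed.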
I'm not 100% sure if that's the TF provider's fault or simply "the way consul works", but almost every time I create a consul service (with checks), after the terraform apply, the next terraform plan includes a change with the service check information, even though it was already "published" to consul in the first plan/apply.
Terraform Version
Affected Resource(s)
Terraform Configuration Files
Expected Behavior
Nothing should show up after plan/apply, since the service check and the service itself should have been created by the above code.
Actual Behavior
After the first plan/apply (possibly due to some async process on consul's side?) the next terraform plan shows:
Steps to Reproduce
Please list the steps required to reproduce the issue, for example:
terraform plan
terraform apply
wait a few minutes to be sure
terraform plan (without any .TF code change)
See weird "already applied" changes
Important Factoids
What I'm unsure of is if this:
Checking the code for the create part, I don't see any major issues:
https://github.com/hashicorp/terraform-provider-consul/blob/9c5772f607ad26325c6bab96917fb41f875dd621/consul/resource_consul_service.go#L234-L253
Then checking how it is read also, nothing big other than it relies on those values being there in the first place:
https://github.com/hashicorp/terraform-provider-consul/blob/9c5772f607ad26325c6bab96917fb41f875dd621/consul/resource_consul_service.go#L271C1-L344
My money would be on service.Checks being empty in the first "read" during the apply but populated later on further reads:
https://github.com/hashicorp/terraform-provider-consul/blob/9c5772f607ad26325c6bab96917fb41f875dd621/consul/resource_consul_service.go#L302
Finally, what leads me to believe it is a consul "problem" is that the tests do not have this issue. Possibly, a slight replication delay, combined with different nodes receiving the "write" vs "read" requests, could cause it. The weird part is why the service itself would be replicated but not the service check...
If that is the case, is there anything specific that can be done to perhaps reduce the likelihood of that happening?
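One option, if the empty-checks-on-first-read theory holds, would be to retry the read until the expected number of checks is visible before state is written. The sketch below is only an illustration of that idea, not what the provider does today: `waitForChecks`, `errChecksNotVisible`, and the `countChecks` callback (a stand-in for whatever catalog read returns the service's checks) are hypothetical names, and the attempt count and interval are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errChecksNotVisible is returned when the checks never show up in time.
var errChecksNotVisible = errors.New("service checks not visible after all attempts")

// waitForChecks calls countChecks until it reports at least want checks or
// the attempts run out. countChecks is a hypothetical stand-in for a catalog
// read that returns how many checks are attached to the service.
func waitForChecks(countChecks func() (int, error), want, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		got, err := countChecks()
		if err != nil {
			return fmt.Errorf("reading checks: %w", err)
		}
		if got >= want {
			return nil
		}
		// Sleep between attempts, but not after the last one.
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return errChecksNotVisible
}

func main() {
	// Simulated catalog: the check only becomes visible on the third read.
	reads := 0
	fakeCatalog := func() (int, error) {
		reads++
		if reads < 3 {
			return 0, nil
		}
		return 1, nil
	}

	err := waitForChecks(fakeCatalog, 1, 5, 10*time.Millisecond)
	fmt.Println("reads:", reads, "err:", err)
}
```

This would only narrow the window rather than close it, since a later read could still land on a node that has not caught up.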