Open l-with opened 3 years ago
This seems to be the same issue I encountered in #30.
Can you provide the Terraform source files, so I can reproduce on my machine?
I do that very gladly: https://github.com/l-with/Hetzner-DNS
Sorry, I initially pushed with the null_resource deactivated; the error only happens with the "long run". I have just pushed a new version.
Good news :-). I was able to reproduce the error on my local machine. It seems the provider made the two requests to create the DNS records almost at the same time. The first one that hits the API wins, while the second is rejected. This is probably an API input validation issue.
However, after updating to v2.0.0 of this provider it works. I upgraded to Terraform provider SDK version 2 and now the resources are created one after another.
I suggest you try version 2.0.0. In the demo project you shared (thanks a lot for the effort), just edit this line.
Looking forward to hearing from you.
I am very happy that you were able to reproduce the error and thank you very much for your effort and fast response.
I updated the provider version (see my repo) but still get the error.
The log (TF_LOG=trace) filtered by dns.hetzner.com is:
2022-01-22T07:17:54.182+0100 [INFO] provider.terraform-provider-hetznerdns_v2.0.0: 2022/01/22 07:17:54 [DEBUG] HTTP request to API GET https://dns.hetzner.com/api/v1/zones?name=with42.de: timestamp=2022-01-22T07:17:54.181+0100
2022-01-22T07:17:54.188+0100 [INFO] provider.terraform-provider-hetznerdns_v2.0.0: 2022/01/22 07:17:54 [DEBUG] HTTP request to API GET https://dns.hetzner.com/api/v1/zones?name=with42.de: timestamp=2022-01-22T07:17:54.188+0100
2022-01-22T07:17:54.839+0100 [INFO] provider.terraform-provider-hetznerdns_v2.0.0: 2022/01/22 07:17:54 [DEBUG] HTTP request to API POST https://dns.hetzner.com/api/v1/records: timestamp=2022-01-22T07:17:54.839+0100
2022-01-22T07:17:54.848+0100 [INFO] provider.terraform-provider-hetznerdns_v2.0.0: 2022/01/22 07:17:54 [DEBUG] HTTP request to API POST https://dns.hetzner.com/api/v1/records: timestamp=2022-01-22T07:17:54.848+0100
2022-01-22T07:17:55.522+0100 [INFO] provider.terraform-provider-hetznerdns_v2.0.0: 2022/01/22 07:17:55 [DEBUG] HTTP request to API GET https://dns.hetzner.com/api/v1/records/3377e0be7dc639c9ec5459aa587a9a3f: timestamp=2022-01-22T07:17:55.521+0100
It really looks as if the POSTs to the API happen at nearly the same time. This matches an effect I see time and again when I configure Hetzner DNS with Ansible via the API and start the playbook in null_resources with a Terraform lifecycle for all hosts of a cluster. Terraform creates the null_resources simultaneously. This supports your view about the API.
I am sorry that I do not know enough about implementing terraform providers to help with coding.
Is there a chance of implementing retries for POSTs (at worst with ALOHA-like timing) to work around this problem?
BTW, just now the same error (HTTP Error 422: Unprocessable Entity) occurred with the Ansible playbook I mentioned in the comment above (on the second null_resource):
I solved the problem by using https://pkg.go.dev/github.com/hashicorp/go-retryablehttp and opened a pull request.
This is the debug log with the added retries:
2022-02-01T12:36:22.190+0100 [INFO] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:22 [DEBUG] Updating resource record: timestamp=2022-02-01T12:36:22.189+0100
2022-02-01T12:36:22.190+0100 [INFO] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:22 [DEBUG] HTTP request to API POST https://dns.hetzner.com/api/v1/records: timestamp=2022-02-01T12:36:22.190+0100
2022-02-01T12:36:22.190+0100 [DEBUG] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:22 [DEBUG] POST https://dns.hetzner.com/api/v1/records
2022-02-01T12:36:22.191+0100 [INFO] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:22 [DEBUG] Updating resource record: timestamp=2022-02-01T12:36:22.190+0100
2022-02-01T12:36:22.191+0100 [INFO] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:22 [DEBUG] HTTP request to API POST https://dns.hetzner.com/api/v1/records: timestamp=2022-02-01T12:36:22.190+0100
2022-02-01T12:36:22.191+0100 [DEBUG] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:22 [DEBUG] POST https://dns.hetzner.com/api/v1/records
2022-02-01T12:36:22.553+0100 [DEBUG] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:22 [DEBUG] POST https://dns.hetzner.com/api/v1/records (status: 422): retrying in 1s (10 left)
2022-02-01T12:36:23.119+0100 [INFO] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:23 [DEBUG] Reading resource record: timestamp=2022-02-01T12:36:23.119+0100
2022-02-01T12:36:23.119+0100 [INFO] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:23 [DEBUG] HTTP request to API GET https://dns.hetzner.com/api/v1/records/f27f297bacc987fdb680ab757a7450e2: timestamp=2022-02-01T12:36:23.119+0100
2022-02-01T12:36:23.119+0100 [DEBUG] provider.terraform-provider-hetznerdns: 2022/02/01 12:36:23 [DEBUG] GET https://dns.hetzner.com/api/v1/records/f27f297bacc987fdb680ab757a7450e2
The Error 422 returned by the second API call is fixed automatically by a retry from retryablehttp.
@timohirt Can I help and do something to fix the failed checks of the merge request? (I executed the tests locally successfully.)
I had to merge it into another branch in this repository. The API token is stored in a GitHub Secret which wasn't shared with your fork. It's still not working unfortunately and I have no idea why. Need to look into it.
Ok, got it. The Retryable Client was created outside of the factory method that is used to inject a fake client in tests. Fixed that and now the CI build is green.
@l-with please check if https://github.com/timohirt/terraform-provider-hetznerdns/releases/tag/v2.1.0 fixes this for you.
I updated to provider version 2.1.0 in https://github.com/l-with/Hetzner-DNS and ran apply and destroy a few times without any error. I am very happy to start refactoring my deployments to use the provider to create DNS records instead of using the Hetzner API with Ansible. @timohirt Thank you very much!
Thank you for your contribution @l-with :-)
I have a project that creates a lot of DNS entries for a mail server, and I get:
Post "https://dns.hetzner.com/api/v1/records": POST https://dns.hetzner.com/api/v1/records giving up after 11 attempt(s)
So 10 retries are not enough. Perhaps other values for the retryableClient
could help; the defaults are:
// Default retry configuration
defaultRetryWaitMin = 1 * time.Second
defaultRetryWaitMax = 30 * time.Second
defaultRetryMax = 4
I would prefer to simply increase retryableClient.RetryMax
to a much higher value (30), since in normal cases this does not matter.
@timohirt: Shall I create a merge request for this simple change?
Sure, go ahead
I'm sorry: in this case I mixed up timing problems that produce "HTTP Error 422: Unprocessable Entity" with real semantic errors from the Hetzner DNS API, which also respond with "HTTP Error 422: Unprocessable Entity".
For instance, "HTTP Error 422: Unprocessable Entity" is returned when creating a CNAME record with a name that already exists as an A record.
So 10 for retryMax is high enough, perhaps too high, since it produces a waiting time of 1s (10 left) + 2s (9 left) + 4s (8 left) + 8s (7 left) + 16s (6 left) + 30s (5 left) + 30s (4 left) + 30s (3 left) + 30s (2 left) + 30s (1 left), which sums to 3 minutes 1 second, and this also applies to semantic errors from the Hetzner DNS API.
5 retries would be my suggestion. Please comment.
Terraform provider for Hetzner DNS is great, thanks!
I have a bug that seems to be a timing problem. The "long run" in my "real world" setup is an Ansible playbook.
Source-Code Tree is:
terraform version :
./main.tf :
./modules/dns/main.tf :
terraform apply --auto-approve
Sorry for the long snippets, I can also provide the output with TF_LOG=trace if necessary.