aequitas / terraform-provider-transip

Terraform provider to manage Transip resources
https://registry.terraform.io/providers/aequitas/transip/latest/docs
MIT License

Access token labels are still reused concurrently which breaks the planning phase. #71

Open EraYaN opened 1 year ago

EraYaN commented 1 year ago

We have a couple of hundred records, and the provider still gives the same error as reported in #44, especially during plans. Using -parallelism=1 fixes it, of course, but it would be better to make the label generation more unique, for example by adding the current thread ID or something similar.

This seems to always happen across domains. For example:

Error: failed to lookup domain "domain3.nl": could not get token from authenticator: error requesting token: The label 'gotransip-client-1681375048247176700' is already used in another active access token.
Error: failed to lookup domain "domain2.nl": could not get token from authenticator: error requesting token: The label 'gotransip-client-1681375048247176700' is already used in another active access token.
Error: failed to lookup domain "domain1.nl": could not get token from authenticator: error requesting token: The label 'gotransip-client-1681375048247176700' is already used in another active access token.
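
To make the "more unique label" idea concrete, here is a minimal Go sketch. It assumes the label is built from the prefix plus a nanosecond timestamp, which the numeric suffix in the errors above suggests but which is not confirmed here. The helper uniqueLabel is hypothetical and not part of gotransip, and whether the TransIP API accepts labels of this length has not been verified.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// labelPrefix mirrors the prefix seen in the error messages above.
const labelPrefix = "gotransip-client"

// uniqueLabel is a hypothetical helper, not part of gotransip.
// It appends 8 random bytes (16 hex characters) to the timestamp so that
// two calls within the same clock tick still produce different labels.
func uniqueLabel() (string, error) {
	suffix := make([]byte, 8)
	if _, err := rand.Read(suffix); err != nil {
		return "", fmt.Errorf("generating random label suffix: %w", err)
	}
	return fmt.Sprintf("%s-%d-%s", labelPrefix, time.Now().UnixNano(), hex.EncodeToString(suffix)), nil
}

func main() {
	// Print a few labels; each run produces different values.
	for i := 0; i < 3; i++ {
		label, err := uniqueLabel()
		if err != nil {
			panic(err)
		}
		fmt.Println(label)
	}
}
```

A random suffix is used instead of a thread ID because Go does not expose goroutine IDs, and a random value also avoids collisions between separate processes.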

This is with version:

Terraform v1.4.4
on windows_amd64
+ provider registry.terraform.io/aequitas/transip v0.1.19

aequitas commented 1 year ago

Are you running in the same environment as #44? I've not been able to trigger this issue myself. Do you have a way to reliably replicate it so I can test it?

EraYaN commented 1 year ago

Right, so this is running locally on Windows 11, with Azure Blob Storage as the backend for state, while using Terragrunt (this should have no impact; it happened on plain Terraform as well).

It is sadly intermittent: it happens most of the time but not all of the time. It happens more often with a very high number of domains (we are at about 40-50). I have another definition with 2 domains and maybe 4 records, and it has never happened for that one. But when you run it with high parallelism (the default is 10, I believe?) it will hit some race somewhere and then that error gets generated, often many times over.

I don't know how to make a minimal reproducer, since having that many domains and records seems to be a requirement.
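
One thing that can be checked without a TransIP account is whether timestamp-only labels collide on a given machine. The Go sketch below is only an isolated experiment under the assumption that the label is the prefix plus time.Now().UnixNano(). It does not call the TransIP API, and countDuplicateLabels is a hypothetical helper. The result depends on the platform's clock resolution, so a run showing zero duplicates does not rule out the race.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// countDuplicateLabels starts n goroutines that each build a timestamp-only
// label at roughly the same moment and returns how many labels were repeats.
func countDuplicateLabels(n int) int {
	labels := make([]string, n)
	start := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			<-start // block until all goroutines are released together
			labels[i] = fmt.Sprintf("gotransip-client-%d", time.Now().UnixNano())
		}(i)
	}
	close(start)
	wg.Wait()

	// Count every label that was already seen earlier in the slice.
	seen := make(map[string]bool, n)
	duplicates := 0
	for _, l := range labels {
		if seen[l] {
			duplicates++
		}
		seen[l] = true
	}
	return duplicates
}

func main() {
	const n = 50 // roughly the number of domains mentioned above
	fmt.Printf("%d of %d labels were duplicates\n", countDuplicateLabels(n), n)
}
```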

A quick way to get a ton of records (adapted from our own definitions) is like this:

data "transip_domain" "base-domains" {
  for_each = var.base-domains
  name = each.key
}

resource "transip_dns_record" "base-domains-wildcard-a" {
  for_each = data.transip_domain.base-domains
  domain  = each.value.id
  name    = "*.test"
  type    = "A"
  expire  = 3600
  content = ["127.0.0.1"]
}

resource "transip_dns_record" "base-domains-a" {
  for_each = data.transip_domain.base-domains
  domain  = each.value.id
  name    = "test"
  type    = "A"
  expire  = 3600
  content = ["127.0.0.1"]
}

variable "base-domains" {
  description = "All domains that need the base DNS records."
  type        = set(string)
  default = [
    # many domains in an account, 40+ for extra effect.
  ]
}

In addition to this, we also have one domain with about 100 records by itself.