hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

DNS record resources complete creation before the resource is useable thereby breaking any deployment they are used in #24961

Open Neutrino-Sunset opened 7 months ago

Neutrino-Sunset commented 7 months ago

Terraform Version

1.6.5

AzureRM Provider Version

3.91.0

Affected Resource(s)/Data Source(s)

azurerm_dns_txt_record, azurerm_dns_cname_record

Terraform Configuration Files

variable "base_domain" { }
variable "sub_domain" { }
variable "web_app" { }

data "azurerm_dns_zone" "dns-zone" {
  name                = var.base_domain
}

resource "azurerm_dns_txt_record" "domain-verification" {
  name                = "asuid.${var.sub_domain}"
  zone_name           = data.azurerm_dns_zone.dns-zone.name
  resource_group_name = data.azurerm_dns_zone.dns-zone.resource_group_name
  ttl                 = 3600

  record {
    value = var.web_app.custom_domain_verification_id
  }
}

resource "azurerm_dns_cname_record" "cname-record" {
  name                = var.sub_domain
  zone_name           = data.azurerm_dns_zone.dns-zone.name
  resource_group_name = data.azurerm_dns_zone.dns-zone.resource_group_name
  ttl                 = 3600
  record              = var.web_app.default_hostname
}

resource "azurerm_app_service_custom_hostname_binding" "hostname-binding" {
  hostname            = "${var.sub_domain}.${var.base_domain}"
  app_service_name    = var.web_app.name
  resource_group_name = var.web_app.resource_group_name

  depends_on = [azurerm_dns_txt_record.domain-verification, azurerm_dns_cname_record.cname-record]
}

Debug Output/Panic Output

module.custom_domain.azurerm_dns_cname_record.cname-record: Creating...
module.custom_domain.azurerm_dns_txt_record.domain-verification: Creating...
module.custom_domain.azurerm_dns_txt_record.domain-verification: Creation complete after 1s [id=/subscriptions/7c34cc50-2353-4be0-bd25-d43ce1e7856e/resourceGroups/simsemsdevopsresources/providers/Microsoft.Network/dnsZones/mydomain.net/TXT/asuid.tftest4-env16]
module.custom_domain.azurerm_dns_cname_record.cname-record: Creation complete after 1s [id=/subscriptions/7c34cc50-2353-4be0-bd25-d43ce1e7856e/resourceGroups/simsemsdevopsresources/providers/Microsoft.Network/dnsZones/mydomain.net/CNAME/tftest4-env16]
module.custom_domain.azurerm_app_service_custom_hostname_binding.hostname-binding: Creating...
╷
│ Error: creating/updating Custom Hostname Binding "tftest4-env16.mydomain.net" (App Service "tftest4-env16" / Resource Group "tftest4-env16"): web.AppsClient#CreateOrUpdateHostNameBinding: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadRequest" Message="A TXT record pointing from asuid.tftest4-env16.mydomain.net to 05334922dcaf46e4cd40398f25d5c6e96a9c1e7833b716b140eaeb06b9f838aa was not found." Details=[{"Message":"A TXT record pointing from asuid.tftest4-env16.mydomain.net to 05334922dcaf46e4cd40398f25d5c6e96a9c1e7833b716b140eaeb06b9f838aa was not found."},{"Code":"BadRequest"},{"ErrorEntity":{"Code":"BadRequest","ExtendedCode":"04006","Message":"A TXT record pointing from asuid.tftest4-env16.mydomain.net to 05334922dcaf46e4cd40398f25d5c6e96a9c1e7833b716b140eaeb06b9f838aa was not found.","MessageTemplate":"A TXT record pointing from asuid.{0} to {1} was not found.","Parameters":["tftest4-env16.mydomain.net","05334922dcaf46e4cd40398f25d5c6e96a9c1e7833b716b140eaeb06b9f838aa"]}}]
│
│   with module.custom_domain.azurerm_app_service_custom_hostname_binding.hostname-binding,
│   on CustomDomain\customDomain.tf line 39, in resource "azurerm_app_service_custom_hostname_binding" "hostname-binding":
│   39: resource "azurerm_app_service_custom_hostname_binding" "hostname-binding" {

Expected Behaviour

Creation of the DNS records should verify that the records have been created and propagated before reporting that resource creation is complete.

Actual Behaviour

The DNS record resources report creation as complete before the DNS records have propagated, so the deployment fails.

Steps to Reproduce

  1. terraform apply

Important Factoids

No response

References

No response

neil-yechenwei commented 7 months ago

Thanks for raising this issue. Could you double-check that the hostname you specified is correct? Below are two examples. I hope they are helpful.

Example 1:

data "azurerm_dns_zone" "example" {
  name                = "example.com"
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_dns_cname_record" "example" {
  name                = "www"
  zone_name           = data.azurerm_dns_zone.example.name
  resource_group_name = data.azurerm_dns_zone.example.resource_group_name
  ttl                 = 300
  record              = azurerm_app_service.example.default_site_hostname
}

resource "azurerm_dns_txt_record" "example" {
  name                = "asuid.${azurerm_dns_cname_record.example.name}"
  zone_name           = data.azurerm_dns_zone.example.name
  resource_group_name = data.azurerm_dns_zone.example.resource_group_name
  ttl                 = 300
  record {
    value = azurerm_app_service.example.custom_domain_verification_id
  }
}

resource "azurerm_app_service_custom_hostname_binding" "example" {
  hostname            = trim(azurerm_dns_cname_record.example.fqdn, ".")
  app_service_name    = azurerm_app_service.example.name
  resource_group_name = azurerm_resource_group.example.name
  depends_on          = [azurerm_dns_txt_record.example]
}

Example 2:

data "azurerm_dns_zone" "test" {
  name                = "xxxx"
  resource_group_name = "xxxx"
}

resource "azurerm_dns_cname_record" "test" {
  name                = "xxxx"
  zone_name           = data.azurerm_dns_zone.test.name
  resource_group_name = data.azurerm_dns_zone.test.resource_group_name
  ttl                 = 300
  record              = azurerm_app_service.test.default_site_hostname
}

resource "azurerm_dns_txt_record" "test" {
  name                = join(".", ["asuid", "xxxxx"])
  zone_name           = data.azurerm_dns_zone.test.name
  resource_group_name = data.azurerm_dns_zone.test.resource_group_name
  ttl                 = 300

  record {
    value = azurerm_app_service.test.custom_domain_verification_id
  }
}

resource "azurerm_app_service_custom_hostname_binding" "test" {
  hostname            = join(".", [azurerm_dns_cname_record.test.name, azurerm_dns_cname_record.test.zone_name])
  app_service_name    = azurerm_app_service.test.name
  resource_group_name = azurerm_resource_group.test.name
}

Neutrino-Sunset commented 7 months ago

I'm not sure whether I understand what you are asking.

In my particular use case I'm adding the CNAME and TXT records necessary to create a new subdomain on an existing base domain, so my configuration is correct for that use case.

ElvenSpellmaker commented 4 months ago

This is happening to us as well. Re-running the pipeline that contains the code works, but it always fails the first time. It seems to be a timing bug in either the DNS resource or the custom domain resource(s) (in our case it's an azurerm_static_web_app_custom_domain).

Seeing as two separate resources hit this error, it seems to be a problem with the DNS resource itself, which returns before the DNS record is usable.

Neutrino-Sunset commented 4 months ago

I'm working around the issue using a timer like this, which has so far been working reliably.

variable "base_domain" { }
variable "sub_domain" { }
variable "web_app" { }

data "azurerm_dns_zone" "dns-zone" {
  name                = var.base_domain
}

resource "azurerm_dns_txt_record" "domain-verification" {
  name                = "asuid.${var.sub_domain}"
  zone_name           = data.azurerm_dns_zone.dns-zone.name
  resource_group_name = data.azurerm_dns_zone.dns-zone.resource_group_name
  ttl                 = 3600

  record {
    value = var.web_app.custom_domain_verification_id
  }
}

resource "azurerm_dns_cname_record" "cname-record" {
  name                = var.sub_domain
  zone_name           = data.azurerm_dns_zone.dns-zone.name
  resource_group_name = data.azurerm_dns_zone.dns-zone.resource_group_name
  ttl                 = 3600
  record              = var.web_app.default_hostname
}

resource "time_sleep" "wait_for_dns_records" {
  depends_on = [azurerm_dns_txt_record.domain-verification, azurerm_dns_cname_record.cname-record]

  create_duration = "20s"
}

resource "azurerm_app_service_custom_hostname_binding" "hostname-binding" {
  hostname            = "${var.sub_domain}.${var.base_domain}"
  app_service_name    = var.web_app.name
  resource_group_name = var.web_app.resource_group_name

  depends_on = [time_sleep.wait_for_dns_records]
}

It's not ideal though. A key aspect of Terraform's design is that dependencies between resources are detected automatically, and each resource is created once the things it depends on are available. Terraform analyzes those dependencies so that as much of the deployment as possible can run in parallel. Having to add timers with manually specified dependencies breaks that model and also adds greatly to the script's complexity.

It would be great if this could be fixed.
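
One way to soften the manual `depends_on` wiring in the workaround above is to pass values through the `triggers` map of `time_sleep`, so the hostname binding depends on the timer (and the timer on the records) through ordinary references. This is only a sketch of that pattern, reusing the resource names from the configuration above. It assumes the `fqdn` attribute exported by `azurerm_dns_cname_record` (with its trailing dot removed) is the hostname to bind, and it has not been tested against this issue.

```hcl
resource "time_sleep" "wait_for_dns_records" {
  create_duration = "20s"

  # Referencing the records here creates the dependency implicitly,
  # and re-runs the delay if either value changes.
  triggers = {
    hostname      = trimsuffix(azurerm_dns_cname_record.cname-record.fqdn, ".")
    txt_record_id = azurerm_dns_txt_record.domain-verification.id
  }
}

resource "azurerm_app_service_custom_hostname_binding" "hostname-binding" {
  # Reading the value back from the timer makes the binding wait for it
  # without an explicit depends_on.
  hostname            = time_sleep.wait_for_dns_records.triggers["hostname"]
  app_service_name    = var.web_app.name
  resource_group_name = var.web_app.resource_group_name
}
```

The fixed 20 second delay is still a guess. This only changes how the ordering is expressed, not whether the record is actually visible when the binding is created.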

owattley-rotageek commented 2 months ago

This problem happens to me frequently as well. It's definitely a timing issue. Azure reports that the DNS resource is created, so Terraform continues to the next resource that depends on the DNS record. However, in reality, Azure has yet to properly provision the DNS record (or maybe it has, but there's a propagation delay?). The result is that the dependent resource fails, because a DNS check is part of its provisioning.

Adding a delay fixes the issue, but it would be more correct if the provider could properly check that provisioning/DNS propagation had finished first.
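
Until the provider performs such a check, a polling step in the configuration can stand in for the fixed delay. The following is a rough sketch, not a confirmed fix. It assumes Terraform 1.4 or later (for `terraform_data`), that `bash` and `dig` are available on the machine running Terraform, and that querying one of the zone's own name servers reflects what the App Service validation sees, which has not been verified here. The resource name `wait_for_txt_record` and the retry numbers are illustrative.

```hcl
resource "terraform_data" "wait_for_txt_record" {
  # Re-run the check whenever the TXT record is replaced.
  triggers_replace = [azurerm_dns_txt_record.domain-verification.id]

  provisioner "local-exec" {
    interpreter = ["bash", "-c"]

    # Poll for up to 30 attempts, 5 seconds apart, then fail the apply.
    command = <<-EOT
      for attempt in $(seq 1 30); do
        if dig +short TXT "$RECORD_FQDN" "@$NAME_SERVER" | grep -qiF "$EXPECTED_VALUE"; then
          exit 0
        fi
        sleep 5
      done
      echo "TXT record $RECORD_FQDN was not visible after 30 attempts" >&2
      exit 1
    EOT

    environment = {
      RECORD_FQDN    = azurerm_dns_txt_record.domain-verification.fqdn
      NAME_SERVER    = tolist(data.azurerm_dns_zone.dns-zone.name_servers)[0]
      EXPECTED_VALUE = var.web_app.custom_domain_verification_id
    }
  }
}
```

The hostname binding would then list `terraform_data.wait_for_txt_record` in its `depends_on` in place of the `time_sleep` resource. If the validation turns out to go through recursive resolvers rather than the authoritative servers, this check could pass while the binding still fails, so keeping a short delay as well may be prudent.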