Open arbitmcdonald opened 2 years ago
@arbitmcdonald Thank you for submitting this!
We have a nightly test case for the aadds resource, whose configuration is defined here: https://github.com/hashicorp/terraform-provider-azurerm/blob/05362bb7236ab7ff91dbe07dda7bec8bb154ff65/internal/services/domainservices/active_directory_domain_service_test.go#L213 The test is successful in recent runs.
Judging by the error message, something went wrong inside Azure while it was checking connectivity internally during the (long-running) creation operation. That is most likely a service-side issue, so I would suggest raising an Azure support ticket and providing the X-Ms-Correlation-Request-Id: 30b7687d-538e-4564-2be5-6acfd84f0498.
Comparing the tested configuration with yours, one possible cause is that the sku of the aadds is different: you were using Standard, whereas the acctest was using Enterprise.
Thanks @magodo I'll reach out to their support.
Interestingly, the creation does succeed (within Azure) about an hour later, which is normal for AADDS. It's Terraform that bombs out/fails; the resource creation still succeeds.
I'll try with Enterprise next and see what happens.
I really appreciate your detailed and helpful response!
Just an update on this. I changed my SKU to see if it made a difference and the same error happened.
Error: creating/updating Domain Service (Name: "redacted.onmicrosoft.com", Resource Group: "RG-UKS-AADDS"): polling after CreateOrUpdate:
Code="InternalError"
Message="Error testing domain controller connectivity through PowerShell. A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 20.26.161.131:5986"
I just had another swing at this, and rather than destroying all successfully created resources after the AADDS failure I thought it best to have a proper look around. Even though I told Terraform that the AADDS depends on the vNet, AADDS subnet, AADDS NSG, and AADDS NSG/subnet association, the association was not there in Azure.
Terraform created the vNet, Subnet, and NSG, but it did not associate the NSG with the Subnet before creating AADDS.
Root cause identified... issue still remains. Why is AADDS being created before Terraform associated the NSG with the subnet, when I specifically said AADDS depends on the NSG association?
// 1. Create the network
resource "azurerm_virtual_network" "primary" {
  name                = "VNet-${upper(var.client_code)}-${upper(var.location_primary_code)}-01"
  location            = azurerm_resource_group.management_primary.location
  resource_group_name = azurerm_resource_group.management_primary.name
  address_space       = [var.vnet_address_space_primary]

  depends_on = [
    azurerm_resource_group.management_primary
  ]
}
// 2. Create the subnet
resource "azurerm_subnet" "aadds_primary" {
  name                 = "SUBNET-${upper(var.client_code)}-${upper(var.location_primary_code)}-AADDS"
  resource_group_name  = azurerm_resource_group.management_primary.name
  virtual_network_name = azurerm_virtual_network.primary.name
  address_prefixes     = ["10.0.1.0/27"]

  depends_on = [
    azurerm_virtual_network.primary,
    azurerm_resource_group.management_primary
  ]
}
// 3. Create the NSG
resource "azurerm_network_security_group" "aadds_primary" {
  name                = "NSG-${upper(var.client_code)}-${upper(var.location_primary_code)}-ACCESS"
  location            = azurerm_resource_group.access_primary.location
  resource_group_name = azurerm_resource_group.access_primary.name

  security_rule {
    name                       = "AllowSyncWithAzureAD"
    priority                   = 101
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "443"
    source_address_prefix      = "AzureActiveDirectoryDomainServices"
    destination_address_prefix = "*"
  }

  security_rule {
    name                       = "AllowRD"
    priority                   = 201
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "3389"
    source_address_prefix      = "CorpNetSaw"
    destination_address_prefix = "*"
  }

  security_rule {
    name                       = "AllowPSRemoting"
    priority                   = 301
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "5986"
    source_address_prefix      = "AzureActiveDirectoryDomainServices"
    destination_address_prefix = "*"
  }

  security_rule {
    name                       = "AllowLDAPS"
    priority                   = 401
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "636"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  depends_on = [
    azurerm_resource_group.access_primary
  ]
}
// 4. Associate the NSG
resource "azurerm_subnet_network_security_group_association" "aadds_primary" {
  subnet_id                 = azurerm_subnet.aadds_primary.id
  network_security_group_id = azurerm_network_security_group.aadds_primary.id

  depends_on = [
    azurerm_virtual_network.primary,
    azurerm_subnet.aadds_primary,
    azurerm_network_security_group.aadds_primary
  ]
}
// 5. Create AADDS
resource "azurerm_active_directory_domain_service" "primary" {
  name                  = var.onmicrosoft_domain
  location              = azurerm_resource_group.aadds.location
  resource_group_name   = azurerm_resource_group.aadds.name
  domain_name           = var.onmicrosoft_domain
  sku                   = "Enterprise"
  filtered_sync_enabled = false

  initial_replica_set {
    subnet_id = azurerm_subnet.aadds_primary.id
  }

  notifications {
    additional_recipients = ["${join("@", [var.admin_username, var.onmicrosoft_domain])}"]
    notify_dc_admins      = true
    notify_global_admins  = true
  }

  security {
    kerberos_armoring_enabled       = true
    kerberos_rc4_encryption_enabled = true
    ntlm_v1_enabled                 = true
    sync_kerberos_passwords         = true
    sync_ntlm_passwords             = true
    sync_on_prem_passwords          = true
    tls_v1_enabled                  = true
  }

  depends_on = [
    azurerm_virtual_network.primary,
    azurerm_subnet.aadds_primary,
    azurerm_network_security_group.aadds_primary,
    azurerm_subnet_network_security_group_association.aadds_primary,
    azuread_group_member.admin,
    azurerm_resource_group.aadds,
    azuread_service_principal.aadds_primary,
    azurerm_virtual_network_dns_servers.aadds_dns_primary,
  ]
}
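One way to check from within Terraform whether the association is actually visible in Azure once the provider reports it complete is to read the subnet back through a data source. The following is a minimal sketch, not part of the original configuration: the data source label `aadds_primary_check` and the output name `aadds_subnet_nsg_id` are made up for illustration, and it assumes the subnet data source reports the attached NSG through its `network_security_group_id` attribute.

```hcl
# Hypothetical diagnostic, not from the original config:
# re-read the subnet after the association resource reports completion.
data "azurerm_subnet" "aadds_primary_check" {
  name                 = azurerm_subnet.aadds_primary.name
  virtual_network_name = azurerm_virtual_network.primary.name
  resource_group_name  = azurerm_resource_group.management_primary.name

  # Defer the read until the association has been (reportedly) created.
  depends_on = [
    azurerm_subnet_network_security_group_association.aadds_primary
  ]
}

# Expected to show the NSG's resource ID; an empty value would match
# the missing association observed in the portal.
output "aadds_subnet_nsg_id" {
  value = data.azurerm_subnet.aadds_primary_check.network_security_group_id
}
```

If the output is empty while the association resource is in state, that would support the observation that Terraform considers the association done even though Azure does not show it.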
Update on this: I believe there's an issue with the provider, not with Azure, as Terraform reports the creation complete for my NSG association.
Here's what happens:
If I manually update the NSG association (to apply the NSG to the subnet) while Terraform is applying the plan at step 3 (after the supposed creation of the NSG association, before AADDS creation), the Terraform apply succeeds and AADDS is created.
Notable console messages:
azurerm_subnet_network_security_group_association.aadds_primary: Creating...
azurerm_subnet_network_security_group_association.aadds_primary: Creation complete after 3s
Console output:
azuread_service_principal.aadds_primary: Creating...
azuread_group.aadds_administrators: Creating...
azuread_service_principal.aadds_primary: Creation complete after 2s [id=ed4ce269-69c0-4c4f-a705-redacted]
azuread_group.aadds_administrators: Still creating... [10s elapsed]
azurerm_resource_group.management_primary: Creating...
azurerm_resource_group.access_primary: Creating...
azurerm_resource_group.aadds: Creating...
azurerm_resource_group.management_primary: Creation complete after 0s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-MANAGEMENT]
azurerm_virtual_network.primary: Creating...
azurerm_resource_group.aadds: Creation complete after 0s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-AADDS]
azurerm_resource_group.access_primary: Creation complete after 0s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-ACCESS]
azurerm_network_security_group.aadds_primary: Creating...
azurerm_network_security_group.aadds_primary: Creation complete after 4s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-ACCESS/providers/Microsoft.Network/networkSecurityGroups/NSG-LWL-UKS-ACCESS]
azurerm_virtual_network.primary: Creation complete after 4s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-MANAGEMENT/providers/Microsoft.Network/virtualNetworks/VNet-LWL-UKS-01]
azurerm_subnet.aadds_primary: Creating...
azuread_group.aadds_administrators: Still creating... [20s elapsed]
azurerm_subnet.aadds_primary: Creation complete after 4s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-MANAGEMENT/providers/Microsoft.Network/virtualNetworks/VNet-LWL-UKS-01/subnets/SUBNET-LWL-UKS-AADDS]
azurerm_virtual_network_dns_servers.aadds_dns_primary: Creating...
azurerm_subnet_network_security_group_association.aadds_primary: Creating...
azuread_group.aadds_administrators: Creation complete after 22s [id=adb56c4e-43a8-4869-ab0f-redacted]
azuread_user.admin: Creating...
azuread_user.admin: Creation complete after 0s [id=4b93e93b-62ac-4b14-a3ac-redacted]
azuread_group_member.admin: Creating...
azuread_group_member.admin: Creation complete after 1s [id=adb56c4e-43a8-4869-ab0f-redacted/member/4b93e93b-62ac-4b14-a3ac-redacted]
azurerm_subnet_network_security_group_association.aadds_primary: Creation complete after 3s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-MANAGEMENT/providers/Microsoft.Network/virtualNetworks/VNet-LWL-UKS-01/subnets/SUBNET-LWL-UKS-AADDS]
azurerm_virtual_network_dns_servers.aadds_dns_primary: Creation complete after 7s [id=/subscriptions/...redacted.../resourceGroups/RG-LWL-UKS-MANAGEMENT/providers/Microsoft.Network/virtualNetworks/VNet-LWL-UKS-01/dnsServers/default]
azurerm_active_directory_domain_service.primary: Creating...
azurerm_active_directory_domain_service.primary: Still creating... [10s elapsed]
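In the console output above, `azurerm_virtual_network_dns_servers.aadds_dns_primary` and the NSG association start creating at the same time, and the DNS servers resource finishes after the association does. One hypothesis, which nothing in this thread confirms, is that the two operations interfere with each other because both touch the same virtual network. A low-cost experiment would be to serialize them with an explicit dependency. The sketch below is untested. The body of the DNS servers resource is not shown in this issue, so `var.aadds_dns_servers` is a hypothetical placeholder for whatever the real configuration passes in.

```hcl
# Untested experiment: force the DNS servers update to run only after
# the NSG association has completed, so the two never run concurrently.
resource "azurerm_virtual_network_dns_servers" "aadds_dns_primary" {
  virtual_network_id = azurerm_virtual_network.primary.id

  # Hypothetical variable; the actual value is not shown in this issue.
  dns_servers = var.aadds_dns_servers

  depends_on = [
    azurerm_subnet_network_security_group_association.aadds_primary
  ]
}
```

If the association survives with this ordering in place, that would point at the concurrent updates rather than at AADDS itself. If it still disappears, the hypothesis can be ruled out.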
Terraform Version
1.3.2
AzureRM Provider Version
3.26.0
Affected Resource(s)/Data Source(s)
azurerm_active_directory_domain_service
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
The creation should have continued for another hour or so, at which point Azure Active Directory Domain Services would have been created. This used to work perfectly, but I updated AzureRM and had to change a ton of my config because of breaking changes in the more recent version(s). I'm not sure whether my config or the provider is at fault.
Actual Behaviour
The creation runs for 15-16 minutes before throwing the following error: Error testing domain controller connectivity through PowerShell. A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 20.26.9.56:5986
Steps to Reproduce
terraform plan -target="azurerm_active_directory_domain_service.primary" -out="aadds.tfplan"
terraform.exe apply "aadds.tfplan"
This also happens if I just run terraform apply, but this config is a snippet of a much larger file. I usually create AADDS first, as it takes so long, and then spin up the rest of the plan. This also fails now.
Important Factoids
No response
References
No response