Open marrobi opened 1 year ago
@guybartal seen this before?
no, I haven't. looks like it fails on creating the public (host) subnet, maybe a transient error? did you try to redeploy?
Got this again here:
1f350b8a-736f-4ff8-9a5c-ca3bbc8c459a: Error message:
╷
│ Error: waiting for creation of Subnet: (Name "adb-host-subnet-mrtredemo28-ws-8044-svc-459a" / Virtual Network Name "vnet-mrtredemo28-ws-8044" / Resource Group "rg-mrtredemo28-ws-8044"): Code="Canceled" Message="Operation was canceled." Details=[{"code":"CanceledAndSupersededDueToAnotherOperation","message":"Operation PutSubnetOperation (f0dd77c7-05fd-4208-aa55-f62650568667) was canceled and superseded by operation PutVirtualNetworkOperation (b5f36438-7876-4e51-8e3a-36fc10f79daf)."}]
│
│   with azurerm_subnet.host,
│   on network.tf line 90, in resource "azurerm_subnet" "host":
│   90: resource "azurerm_subnet" "host" {
│
╵
╷
│ Error: Subnet: (Name "adb-container-subnet-mrtredemo28-ws-8044-svc-459a" / Virtual Network Name "vnet-mrtredemo28-ws-8044" / Resource Group "rg-mrtredemo28-ws-8044") was not found
│
│   with azurerm_subnet_network_security_group_association.container,
│   on network.tf line 147, in resource "azurerm_subnet_network_security_group_association" "container":
│   147: resource "azurerm_subnet_network_security_group_association" "container" {
│
╵
╷
│ Error: Subnet "adb-container-subnet-mrtredemo28-ws-8044-svc-459a" (Virtual Network "vnet-mrtredemo28-ws-8044" / Resource Group "rg-mrtredemo28-ws-8044") was not found!
│
│   with azurerm_subnet_route_table_association.rt_container,
│   on network.tf line 157, in resource "azurerm_subnet_route_table_association" "rt_container":
│   157: resource "azurerm_subnet_route_table_association" "rt_container" {
│
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var address_space=10.1.8.0/24 -var arm_environment=public -var is_exposed_externally=false -var tre_id=mrtredemo28 -var tre_resource_id=1f350b8a-736f-4ff8-9a5c-ca3bbc8c459a -var workspace_id=14d01527-62d1-4bad-99ad-37d602c08044: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var address_space=10.1.8.0/24 -var arm_environment=public -var is_exposed_externally=false -var tre_id=mrtredemo28 -var tre_resource_id=1f350b8a-736f-4ff8-9a5c-ca3bbc8c459a -var workspace_id=14d01527-62d1-4bad-99ad-37d602c08044: exit status 1
1 error occurred:
* mixin execution failed: package command failed
Issue seems to be related to multiple workspace services being deployed/updated in parallel and/or multiple private endpoints/network operations happening in parallel in a single bundle.
Another
Error: waiting for creation of Private Endpoint "pe-mlflow-mrtredemo28-ws-8044-svc-89f1" (Resource Group "rg-mrtredemo28-ws-8044"): Code="RetryableError" Message="A retryable error occurred." Details=[{"code":"ReferencedResourceNotProvisioned","message":"Cannot proceed with operation because resource /subscriptions/7f1036b4-4d01-43a0-9f4d-602f5151dc0f/resourceGroups/rg-mrtredemo28-ws-8044/providers/Microsoft.Network/virtualNetworks/vnet-mrtredemo28-ws-8044/subnets/ServicesSubnet used by resource /subscriptions/7f1036b4-4d01-43a0-9f4d-602f5151dc0f/resourceGroups/rg-mrtredemo28-ws-8044/providers/Microsoft.Network/networkInterfaces/pe-mlflow-mrtredemo28-ws-8044-svc-89f1.nic.b228d946-de36-46c2-81ee-1e6b06155123 is not in Succeeded state. Resource is in Updating state and the last operation that updated/is updating the resource is PutSubnetOperation."}]
OK, this is down to having two operations in progress on the virtual network at the same time. This can happen if one operation is adding an address space to a workspace while another is adding a subnet to the same virtual network.
We need to limit workspace and workspace service operations to one at a time for each workspace.
As user resources do not typically modify the network, I do not believe they are an issue.
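To make the "one at a time for each workspace" idea concrete, here is a minimal sketch of a per-workspace gate. It is only an illustration: `WorkspaceOperationGate` and `run` are hypothetical names, not existing TRE code, and it assumes all operations pass through a single process. A real deployment runner would likely need a distributed lock or a per-workspace queue instead.

```python
import threading
from collections import defaultdict


class WorkspaceOperationGate:
    """Hypothetical helper: allows one network-changing operation per workspace at a time."""

    def __init__(self):
        # Protects the lock table itself so two threads never create
        # different locks for the same workspace.
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    def run(self, workspace_id, operation):
        """Run operation() while holding the lock for workspace_id."""
        with self._guard:
            lock = self._locks[workspace_id]
        # Operations on the same workspace queue up here;
        # operations on different workspaces still run in parallel.
        with lock:
            return operation()
```

Workspace and workspace service deployments would both call `gate.run(workspace_id, ...)`, while user resource deployments could bypass the gate.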
Or should the TF provider wait if an operation is in progress?
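If the provider itself does not wait, one alternative is to retry at the wrapper level when the output contains one of the two error codes seen above. The sketch below is an assumption-laden illustration, not provider behaviour: `apply_with_retry` and the `apply` callable (returning an exit code and the captured output) are hypothetical, and the attempt count and delays are arbitrary.

```python
import time

# Error codes taken from the logs in this thread.
RETRYABLE_CODES = (
    "CanceledAndSupersededDueToAnotherOperation",
    "ReferencedResourceNotProvisioned",
)


def is_retryable(output: str) -> bool:
    """True if the captured output mentions a known transient network error."""
    return any(code in output for code in RETRYABLE_CODES)


def apply_with_retry(apply, attempts=3, base_delay=30.0, sleep=time.sleep):
    """Call apply() until it succeeds, retrying only on known transient errors.

    apply is a hypothetical callable returning (exit_code, output).
    """
    for attempt in range(1, attempts + 1):
        exit_code, output = apply()
        if exit_code == 0:
            return output
        if attempt == attempts or not is_retryable(output):
            raise RuntimeError(output)
        # Exponential backoff: base_delay, 2 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

Retrying only hides the race; serialising operations per workspace removes it.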
When deploying the Databricks Workspace service get: