microsoft / AzureTRE

An accelerator to help organizations build Trusted Research Environments on Azure.
https://microsoft.github.io/AzureTRE
MIT License

Firewall deployment issue - internal server error #4091

Open jonnyry opened 1 week ago

jonnyry commented 1 week ago

The last three attempts to run a full deployment of Azure TRE version 0.19.0 have resulted in the Firewall Shared Service component failing after approximately 10 minutes with the error below.

I have FIREWALL_SKU=Basic. I am about to see if FIREWALL_SKU=Standard makes any difference.

eb54d928-9c43-4652-8740-38f366e4809f deployment_failed install /shared-services/14703802-2678-4b1b-8d85-13e102572fb6 14703802-2678-4b1b-8d85-13e102572fb6: Error message: Unable to find image '.azurecr.io/tre-shared-service-firewall@sha256:79e68d2866ef7a40f121c4c029b6561d2718551911b15c40d0831206811a6a12' locally
╷
│ Error: waiting for creation/update of Firewall: (Azure Firewall Name "fw-" / Resource Group "rg-"): Code="InternalServerError" Message="An error occurred." Details=[]
│
│ with azurerm_firewall.fw,
│ on firewall.tf line 30, in resource "azurerm_firewall" "fw":
│ 30: resource "azurerm_firewall" "fw"
│
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var api_driven_network_rule_collections_b64=W10= -var api_driven_rule_collections_b64=W10= -var firewall_sku=Basic -var microsoft_graph_fqdn=graph.microsoft.com -var tre_id= -var tre_resource_id=14703802-2678-4b1b-8d85-13e102572fb6: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var api_driven_network_rule_collections_b64=W10= -var api_driven_rule_collections_b64=W10= -var firewall_sku=Basic -var microsoft_graph_fqdn=graph.microsoft.com -var tre_id= -var tre_resource_id=14703802-2678-4b1b-8d85-13e102572fb6: exit status 1
1 error occurred:
* mixin execution failed: package command failed /cnab/app/cnab/app/mixins/terraform/runtimes/terraform-runtime install
╷
│ Error: waiting for creation/update of Firewall: (Azure Firewall Name "fw-" / Resource Group "rg-"): Code="InternalServerError" Message="An error occurred." Details=[]
│
│ with azurerm_firewall.fw,
│ on firewall.tf line 30, in resource "azurerm_firewall" "fw":
│ 30: resource "azurerm_firewall" "fw"
│
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var api_driven_network_rule_collections_b64=W10= -var api_driven_rule_collections_b64=W10= -var firewall_sku=Basic -var microsoft_graph_fqdn=graph.microsoft.com -var tre_id= -var tre_resource_id=14703802-2678-4b1b-8d85-13e102572fb6: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var api_driven_network_rule_collections_b64=W10= -var api_driven_rule_collections_b64=W10= -var firewall_sku=Basic -var microsoft_graph_fqdn=graph.microsoft.com -var tre_id= -var tre_resource_id=14703802-2678-4b1b-8d85-13e102572fb6: exit status 1
1 error occurred:
mixin execution failed: package command failed /cnab/app/cnab/app/mixins/terraform/runtimes/terraform-runtime install
╷
│ Error: waiting for creation/update of Firewall: (Azure Firewall Name "fw-**" / Resource Group "rg-"): Code="InternalServerError" Message="An error occurred." Details=[]
│
│ with azurerm_firewall.fw,
│ on firewall.tf line 30, in resource "azurerm_firewall" "fw":
│ 30: resource "azurerm_firewall" "fw"
│
╵
error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var api_driven_network_rule_collections_b64=W10= -var api_driven_rule_collections_b64=W10= -var firewall_sku=Basic -var microsoft_graph_fqdn=graph.microsoft.com -var tre_id= -var tre_resource_id=14703802-2678-4b1b-8d85-13e102572fb6: exit status 1
Error: error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var api_driven_network_rule_collections_b64=W10= -var api_driven_rule_collections_b64=W10= -var firewall_sku=Basic -var microsoft_graph_fqdn=graph.microsoft.com -var tre_id= -var tre_resource_id=14703802-2678-4b1b-8d85-13e102572fb6: exit status 1
1 error occurred:
container exit code: 1, message: ;
Command executed: porter install "14703802-2678-4b1b-8d85-13e102572fb6" --reference **.azurecr.io/tre-shared-service-firewall:v1.2.0 --param arm_environment="public" --param arm_use_msi="true" --param firewall_sku="Basic" --param id="14703802-2678-4b1b-8d85-13e102572fb6" --param microsoft_graph_fqdn="graph.microsoft.com" --param tfstate_container_name="tfstate" --param tfstate_resource_group_name="" --param tfstate_storage_account_name="" --param tre_id="" --force --credential-set arm_auth --credential-set aad_auth
make: *** [Makefile:305: deploy-shared-service] Error 1
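
Since the same Azure error is repeated several times in the output above, it can help to pull out just the distinct `Code`/`Message` pairs. The following is a minimal sketch, not part of Azure TRE or Porter: the function name `azure_errors` and the regular expression are assumptions based only on the `Code="..." Message="..."` shape visible in this log.

```python
import re

# Assumed pattern: matches the Code="..." Message="..." pair as it appears
# in the Terraform error text quoted in this issue.
AZURE_ERROR = re.compile(r'Code="(?P<code>[^"]+)"\s+Message="(?P<message>[^"]*)"')


def azure_errors(log_text: str) -> list[tuple[str, str]]:
    """Return distinct (code, message) pairs in order of first appearance."""
    seen: list[tuple[str, str]] = []
    for match in AZURE_ERROR.finditer(log_text):
        pair = (match.group("code"), match.group("message"))
        if pair not in seen:
            seen.append(pair)
    return seen


# One error line copied from the log in this issue.
sample = (
    'Error: waiting for creation/update of Firewall: (Azure Firewall Name "fw-" / '
    'Resource Group "rg-"): Code="InternalServerError" Message="An error occurred." '
    "Details=[]"
)

print(azure_errors(sample))  # [('InternalServerError', 'An error occurred.')]
```

Run against the full log, this would collapse the three repeated error boxes into a single `InternalServerError` entry, which makes it easier to compare failures across deployments.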

jonnyry commented 1 week ago

OK, so after changing to FIREWALL_SKU=Standard, the deployment worked. I have several deployments to make, so I will report back on whether this continues to succeed with FIREWALL_SKU=Standard.

jonnyry commented 1 week ago

Ah no, the second deployment with FIREWALL_SKU=Standard yielded the "Internal Server Error" issue above:

image

tim-allen-ck commented 1 week ago

is it the terraform provider timing out?

tim-allen-ck commented 1 week ago

I think this is related to #4088

jonnyry commented 1 week ago

> is it the terraform provider timing out?

No, I don't think so. The firewall component usually takes 25-35 minutes to deploy, whereas the errors above are returned in under 10 minutes.
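
The reasoning here can be written down as a small check. This is only an illustrative sketch: `looks_like_timeout` and `TYPICAL_MIN` are hypothetical names, and the 25-minute threshold is simply the lower end of the usual deployment time mentioned above, not a value taken from the Terraform provider.

```python
from datetime import timedelta

# Assumption from this comment: a healthy firewall deployment takes roughly
# 25-35 minutes, so the shortest normal run is about 25 minutes.
TYPICAL_MIN = timedelta(minutes=25)


def looks_like_timeout(elapsed: timedelta, typical_min: timedelta = TYPICAL_MIN) -> bool:
    """A failure that arrives before the shortest normal run cannot be a
    timeout from waiting on a slow but otherwise healthy deployment."""
    return elapsed >= typical_min


print(looks_like_timeout(timedelta(minutes=10)))  # False
```

A failure at around 10 minutes falls well short of the normal deployment window, which is why a provider timeout seems unlikely here.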

jonnyry commented 1 week ago

This looks like the same issue: https://learn.microsoft.com/en-us/answers/questions/1666583/firewall-creation-is-failing-while-creating-throug

tim-allen-ck commented 1 week ago

> This looks like the same issue: https://learn.microsoft.com/en-us/answers/questions/1666583/firewall-creation-is-failing-while-creating-throug

Definitely similar, but it looks like their error was in part due to the VPN gateway.