500 error when creating a mssql database

jwshive commented 4 years ago

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

terraform 0.12.28 x64

Affected Resource(s)

azurerm_mssql_database

Terraform Configuration Files

resource "azurerm_mssql_database" "sql_database" {
  name                = "#{agency}#-#{department}#-#{environment}#-#{application}#-db"
  server_id           = module.application-sql_server.id
  collation           = "SQL_Latin1_General_CP1_CI_AS"
  license_type        = "BasePrice"
  read_scale          = false
  sku_name            = "ElasticPool"
  zone_redundant      = false
  tags = {
    environment = "#{environment}#"
    application = "#{application}#"
    product_owner = "#{product-owner}#"
    service_owner = "#{service-owner}#"
    cost_center = "#{cost-center}#"
    type        = "mssql-database"
  }
  elastic_pool_id     = azurerm_mssql_elasticpool.sql_elastic_pool.id
  extended_auditing_policy {
    storage_endpoint                        = module.storage-working-storage_account.primary_blob_endpoint
    storage_account_access_key              = module.storage-working-storage_account.primary_access_key
    storage_account_access_key_is_secondary = true
    retention_in_days                       = 6
  }
  timeouts {
    create = "2h"
    delete = "2h"
  }
}

Debug Output

https://gist.github.com/klainn/6318e4dcbf117f8e4e1741d05694cdbe ### Panic Output

Expected Behavior

Expect terraform (through azure devops) to wait at least 2 hours before failing.

Actual Behavior

Terraform (through azure devops) waits 11 minutes and fails.

Steps to Reproduce

Rerun release pipeline with Terraform validate and apply agent.

terraform apply

Important Factoids

References

0000

yupwei68 commented 4 years ago

Hi @klainn , thanks for opening this issue. timeouts is the time limit for each operation lasting. It will not sleep for such time. I assume Terraform waits 11 minutes and fails for some errors. Could you explicit what do you expect?

jwshive commented 4 years ago

My expectation is that terraform waits to get a all good response back from azure or the pipeline agent. I know the database gets created as I can hit it from the portal or a third party database tool. It just seems to take longer than 11 minutes to get a positive return code.

yupwei68 commented 4 years ago

Hi @klainn The database fails to be created because the sql auditing fails to connect to the storage account,(we might need a more specific configuration of storage account to get the reason) and it returns error message, which is not relevant to timeout. Do you expect the error message to be returned after 2h? That's not the feature timeouts is designed.

tombuildsstuff commented 4 years ago

@klainn as @yupwei68 has mentioned, within the Azure Provider we use the value of the timeouts block as the maximum creation time during the provisioning of the resource, which happens in two parts: creating/updating and the subsequent polling.

In this instance it appears this 500 error is coming back from the initial create/update request - and as such unfortunately this is a bug in the Azure API, rather than something specific to Terraform. Whilst Terraform could look to continually poll for 500 responses (and we do when long-polling) - there's no guarantee that'd work (since by definition it's an internal server error / a bug in the API) - so unfortunately this is a bug which would need to be fixed in the Azure API, to correctly show this is still being created.

@yupwei68 would you mind filing an issue on the Go SDK/Rest API Specs repository to track this 500 error being returned from the SQL API?

jwshive commented 4 years ago

I am assuming the workaround for this would be:

let the pipeline fail to create the instance as it is currently doing
terraform import into the remote back end since the resource is actually being created
rerun my pipeline and get a success

however for some reason, by only redeploying my failed run and not creating a new release, it's trying to modify the database that I imported and still failing with another error 500

Blob Auditing Policies(SQL Server "agency-dept-dev-app-sqlsvr"/ Resource Group "agency-dept-dev-app-rg"): sql.ExtendedDatabaseBlobAuditingPoliciesClient#CreateOrUpdate: Failure responding to request: StatusCode=500 -- Original Error: autorest/azure: Service returned an error. Status=500 Code="InternalServerError" Message="An unexpected error occured while processing the request. Tracking ID: 'a063eb77-e541-4c95-9edf-a12efc38d938'"

What is an actual workaround to get a functioning system and a clean release pipeline for a sql database?

yupwei68 commented 4 years ago

An upstream issue has been opened to track this internal error: https://github.com/Azure/azure-rest-api-specs/issues/10411

yupwei68 commented 4 years ago

Hi @jwshive Sorry that the service team has just replied me that the track id is too old to track. If this internal error has occurred recently again, would you mind providing a new track id for the service team?

jwshive commented 4 years ago

Here are some new tracking ids for the same db create on the same run. It is fresh as of a few minutes ago. Hopefully it's not another 20 days before they circle back around to this issue.

Tracking ID: '07c2b535-cc50-47d0-9d66-578b4f14b04f'

Tracking ID: 'afa5f963-392a-4b54-8e86-353bddc53268'

Tracking ID: '4f8e23ba-a6a3-4341-a6fe-c36daecfbaf4'

This is the code I test with. This was run with a local state on my local machine.

provider "azurerm" {
  features {}
}

terraform {
  backend "azurerm" {
    resource_group_name = "agency-dept-tst-tfrg"
    storage_account_name = "agencydeptapptsttfsa"
    container_name = "app-terraform"
    key = "terraform.tfstate"
    }
}

data "azurerm_client_config" "current_client_config" {}

resource "azurerm_resource_group" "resource_group" {
  location = "eastus"
  name = "agency-dept-tst-app-rg"
}

resource "azurerm_storage_account" "working-storage_account" {
  name = "agencydeptapptstwksa"
  resource_group_name       = azurerm_resource_group.resource_group.name
  location = "eastus"
  account_tier = "Standard"
  account_replication_type = "RAGRS"
  account_kind              = "StorageV2"
  is_hns_enabled            = "true"
  access_tier               = "Hot"
  allow_blob_public_access  = false
  depends_on = [azurerm_resource_group.resource_group]
}
variable "sql-server-username" {}
variable "sql-server-password" {}
resource "azurerm_mssql_server" "sql_server" {
  name                         = "agency-dept-tst-app-sqlsvr"
  resource_group_name          = azurerm_resource_group.resource_group.name
  location                     = "eastus"
  version                      = "12.0"
  administrator_login          = var.sql-server-username
  administrator_login_password = var.sql-server-password
  depends_on = [azurerm_resource_group.resource_group]
}

resource "azurerm_mssql_elasticpool" "sql_elastic_pool" {
  name = "agency-dept-tst-app-epool"
  location = "eastus"
  resource_group_name = azurerm_resource_group.resource_group.name
  server_name = azurerm_mssql_server.sql_server.name
  max_size_gb = 4.8828125

  per_database_settings {
    min_capacity = 0
    max_capacity = 5
  }

  sku {
    capacity = 50
    name = "BasicPool"
    tier = "Basic"
  }
  depends_on = [azurerm_mssql_server.sql_server]
}

resource "azurerm_mssql_database" "sql_database" {
  name                = "agency-dept-tst-app-db"
  server_id           = azurerm_mssql_server.sql_server.id
  collation           = "SQL_Latin1_General_CP1_CI_AS"
  license_type        = "BasePrice"
  read_scale          = false
  sku_name            = "ElasticPool"
  zone_redundant      = false
  min_capacity = 1
  elastic_pool_id     = azurerm_mssql_elasticpool.sql_elastic_pool.id
  extended_auditing_policy {
    storage_endpoint                        = azurerm_storage_account.working-storage_account.primary_blob_endpoint
    storage_account_access_key              = azurerm_storage_account.working-storage_account.primary_access_key
    storage_account_access_key_is_secondary = true
    retention_in_days                       = 6
  }
  depends_on = [azurerm_mssql_elasticpool.sql_elastic_pool]
}

jwshive commented 4 years ago

Has the service team managed to look at the above transaction ids?

yupwei68 commented 4 years ago

Hi @jwshive I could reproduce this error. I have opened an ICM ticket for it. Please expect a response soon.

jwshive commented 4 years ago

Fantastic. Do you all know of any work arounds or is everyone stuck until this gets resolved? Seems odd that nobody using terraform on azure is able to create a database.

yupwei68 commented 4 years ago

Hi @jwshive This error is not raised from mssql database. It's raised from the sql auditing. If you could change the config of the storage account, the following could work :

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "resource_group" {
  location = "eastus"
  name     = "agency-dept-tst-app-rg"
}

resource "azurerm_storage_account" "working-storage_account" {
  name                     = "agencydeptapptstwksa"
  resource_group_name      = azurerm_resource_group.resource_group.name
  location                 = "eastus"
  account_tier             = "Standard"
  account_replication_type = "RAGRS"
  account_kind             = "StorageV2"
}

resource "azurerm_mssql_server" "sql_server" {
  name                         = "agency-dept-tst-app-sqlsvr"
  resource_group_name          = azurerm_resource_group.resource_group.name
  location                     = "eastus"
  version                      = "12.0"
  administrator_login          = "..."
  administrator_login_password = "..."
}

resource "azurerm_mssql_elasticpool" "sql_elastic_pool" {
  name                = "agency-dept-tst-app-epool"
  location            = "eastus"
  resource_group_name = azurerm_resource_group.resource_group.name
  server_name         = azurerm_mssql_server.sql_server.name
  max_size_gb         = 4.8828125

  per_database_settings {
    min_capacity = 0
    max_capacity = 5
  }

  sku {
    capacity = 50
    name     = "BasicPool"
    tier     = "Basic"
  }
}

resource "azurerm_mssql_database" "sql_database" {
  name            = "agency-dept-tst-app-db"
  server_id       = azurerm_mssql_server.sql_server.id
  collation       = "SQL_Latin1_General_CP1_CI_AS"
  license_type    = "BasePrice"
  read_scale      = false
  sku_name        = "ElasticPool"
  zone_redundant  = false
  min_capacity    = 1
  elastic_pool_id = azurerm_mssql_elasticpool.sql_elastic_pool.id
  extended_auditing_policy {
    storage_endpoint                        = azurerm_storage_account.working-storage_account.primary_blob_endpoint
    storage_account_access_key              = azurerm_storage_account.working-storage_account.primary_access_key
    storage_account_access_key_is_secondary = true
    retention_in_days                       = 6
  }
}

jwshive commented 4 years ago

So the problem is my storage account being hns enabled or no public access. I can test with that. Thanks.

Edit: confirmed, it's the is_hns_enabled = true that stops things. That's unfortunate. Thanks for helping identify the issue.

yupwei68 commented 4 years ago

@jwshive The service team has replied that according to https://docs.microsoft.com/en-us/azure/azure-sql/database/auditing-overview#auditing-limitations, hns is not enabled for sql auditing. And they will enrich the error message soon.

katbyte commented 3 years ago

I'm going to close this as its been nearly a year, please do let us know if this is still an issue for you!

github-actions[bot] commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

hashicorp / terraform-provider-azurerm