hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Terraform -replace times out fast for azurerm_mssql_database resource and custom timeouts not honoured on all resources #25621

Open Aeropher opened 5 months ago

Aeropher commented 5 months ago

Terraform Version

1.8.0

AzureRM Provider Version

3.99.0

Affected Resource(s)/Data Source(s)

azurerm_mssql_database

Terraform Configuration Files

# Assuming you already have a Resource Group and an azurerm_mssql_server resource created.
# This terraform is run using the Terraform CLI:
# CD to the directory with this file and run "terraform apply --auto-approve"

terraform {
  cloud {
    organization = "MyOrganization"
    hostname     = "app.terraform.io"

    workspaces {
      name = "terraform-bug-report"
    }
  }

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=3.99.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# Instantiate SQL Server data source
data "azurerm_mssql_server" "mssql-server" {
  name                = "myServerName"
  resource_group_name = "myRGName"
}

# Create a new database
resource "azurerm_mssql_database" "mynew-db" {
  server_id                   = data.azurerm_mssql_server.mssql-server.id
  name                        = "mynew_testdb"
  collation                   = "SQL_Latin1_General_CP1_CI_AS"
  auto_pause_delay_in_minutes = 60
  min_capacity                = 1
  max_size_gb                 = 30
  sku_name                    = "GP_S_Gen5_8"
  read_scale                  = false
  zone_redundant              = false
  geo_backup_enabled          = true
  storage_account_type        = "Geo"

  lifecycle {
    ignore_changes = [auto_pause_delay_in_minutes, min_capacity]
  }

  # same result with and without this block as it does not seem to be implemented despite documentation
  timeouts {
    create = "60m"
    update = "60m"
    read   = "10m"
    delete = "60m"
  }
}

Debug Output/Panic Output

- After running "terraform apply -replace='azurerm_mssql_database.mynew-db' --auto-approve":

azurerm_mssql_database.mynew-db: Destroying... [id=/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Sql/servers/myServerName/databases/mynew_testdb] 
azurerm_mssql_database.mynew-db: Still destroying... [10s elapsed]
azurerm_mssql_database.mynew-db: Destruction complete after 11s
azurerm_mssql_database.mynew-db: Creating...
azurerm_mssql_database.mynew-db: Still creating... [10s elapsed]
azurerm_mssql_database.mynew-db: Still creating... [20s elapsed]
azurerm_mssql_database.mynew-db: Still creating... [30s elapsed]
azurerm_mssql_database.mynew-db: Still creating... [40s elapsed]
azurerm_mssql_database.mynew-db: Still creating... [50s elapsed]
azurerm_mssql_database.mynew-db: Still creating... [1m0s elapsed]
╷
│ Error: waiting for Database (Subscription: "redacted"
│ Resource Group Name: "redacted"
│ Server Name: "myServerName"
│ Database Name: "mynew_testdb") to become ready: polling for the status of Database (Subscription: "redacted"
│ Resource Group Name: "redacted"
│ Server Name: "myServerName"
│ Database Name: "mynew_testdb"): unexpected status 404 (404 Not Found) with error: ResourceNotFound: The Resource 'Microsoft.Sql/servers/myServerName/databases/mynew_testdb' under resource group 'redacted' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix
│
│   with azurerm_mssql_database.mynew-db,
│   on trimmedCode.tf line 32, in resource "azurerm_mssql_database" "mynew-db":
│   32: resource "azurerm_mssql_database" "mynew-db" {
│
╵
Operation failed: failed running terraform apply (exit 1)

- After running "terraform apply --auto-approve" once the above error has occurred:

azurerm_mssql_database.mynew-db: Creating...
╷
│ Error: A resource with the ID "/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Sql/servers/myServerName/databases/mynew_testdb" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_mssql_database" for more information.
│
│   with azurerm_mssql_database.mynew-db,
│   on trimmedCode.tf line 32, in resource "azurerm_mssql_database" "mynew-db":
│   32: resource "azurerm_mssql_database" "mynew-db" {
│
╵
Operation failed: failed running terraform apply (exit 1)

Expected Behaviour

Terraform will delete the resource and then create it again.

Terraform custom timeouts will be honoured.

Actual Behaviour

Steps to Reproduce

Using the terraform CLI in PowerShell or CMD, create the resources so that they already exist:

terraform apply --auto-approve

Once the database resource exists, run terraform apply with the -replace option and pass the Terraform address of the resource to replace. If the resource address contains an index, you must include the quotes around the index key and escape them with a \ character. With the configuration provided above, this command works:

"terraform apply -replace='azurerm_mssql_database.mynew-db' --auto-approve"

Important Factoids

No response

References

This issue looks like it has existed for some time. The issue linked below describes the same behaviour when using terraform taint, which has since been deprecated in favour of -replace.

https://github.com/hashicorp/terraform-provider-azurerm/issues/9101

sinbai commented 5 months ago

Hi @Aeropher, thanks for opening this issue. I would like to clarify that Terraform implements CRUD management of Azure resources through the Azure REST API. The error above is returned by the Azure REST API, not by Terraform.

For more details, please refer to the logs below extracted from TF Log after running "terraform apply -replace='azurerm_mssql_database.mynew-db' --auto-approve".

AzureRM Request: 
DELETE /subscriptions/"redacted"/resourceGroups/exampleRG25621-0416/providers/Microsoft.Sql/servers/sqlserver-25621-0416/databases/mynew_testdb?api-version=2023-02-01-preview HTTP/1.1
Host: management.azure.com
User-Agent: HashiCorp/go-azure-sdk (Go-http-Client/1.1 databases/2023-02-01-preview) HashiCorp Terraform/1.6.5 (+https://www.terraform.io) Terraform Plugin SDK/2.10.1 terraform-provider-azurerm/3.99.0 pid-222c6c49-1b0a-5959-a213-6608f9eb8820
Content-Type: application/json; charset=utf-8
X-Ms-Correlation-Request-Id: 1a1ad13e-5dca-2dc2-091d-6cb86bfa0110
Accept-Encoding: gzip: timestamp="2024-04-16T15:45:57.660+0800"

AzureRM Response: 
HTTP/2.0 202 Accepted
...

{"operation":"DropLogicalDatabase","startTime":"2024-04-16T07:46:00.197Z"}: timestamp="2024-04-16T15:46:00.489+0800"
AzureRM Request: 
GET /subscriptions/"redacted"/resourceGroups/exampleRG25621-0416/providers/Microsoft.Sql/servers/sqlserver-25621-0416/databases/mynew_testdb?api-version=2023-02-01-preview HTTP/1.1

AzureRM Response:
HTTP/2.0 404 Not Found
...

{"error":{"code":"ResourceNotFound","message":"The requested resource of type 'Microsoft.Sql/servers/databases' with name 'mynew_testdb' was not found."}}: timestamp="2024-04-16T15:46:12.429+0800"
AzureRM Request: 
GET /subscriptions/"redacted"/resourceGroups/exampleRG25621-0416/providers/Microsoft.Sql/servers/sqlserver-25621-0416/databases/mynew_testdb/replicationLinks?api-version=2021-02-01-preview HTTP/1.1

AzureRM Response:
HTTP/2.0 200 OK
...

{"value":[]}: timestamp="2024-04-16T15:46:17.131+0800"
AzureRM Request: 
PUT /subscriptions/"redacted"/resourceGroups/exampleRG25621-0416/providers/Microsoft.Sql/servers/sqlserver-25621-0416/databases/mynew_testdb?api-version=2023-02-01-preview HTTP/1.1
Host: management.azure.com
User-Agent: HashiCorp/go-azure-sdk (Go-http-Client/1.1 databases/2023-02-01-preview) HashiCorp Terraform/1.6.5 (+https://www.terraform.io) Terraform Plugin SDK/2.10.1 terraform-provider-azurerm/3.99.0 pid-222c6c49-1b0a-5959-a213-6608f9eb8820
Content-Length: 613
Content-Type: application/json; charset=utf-8
X-Ms-Correlation-Request-Id: 1a1ad13e-5dca-2dc2-091d-6cb86bfa0110
Accept-Encoding: gzip

{"location":"eastus2","properties":{"autoPauseDelay":60,"collation":"SQL_Latin1_General_CP1_CI_AS","createMode":"Default","elasticPoolId":"","encryptionProtectorAutoRotation":false,"highAvailabilityReplicaCount":0,"isLedgerOn":false,"licenseType":"","maintenanceConfigurationId":"/subscriptions/"redacted"/providers/Microsoft.Maintenance/publicMaintenanceConfigurations/SQL_Default","maxSizeBytes":32212254720,"minCapacity":1,"readScale":"Disabled","requestedBackupStorageRedundancy":"Geo","sampleName":"","secondaryType":"","zoneRedundant":false},"sku":{"name":"GP_S_Gen5_8"},"tags":{}}: timestamp="2024-04-16T15:46:17.132+0800"

AzureRM Response:
HTTP/2.0 202 Accepted
...

{"operation":"CreateLogicalDatabase","startTime":"2024-04-16T07:46:19.22Z"}: timestamp="2024-04-16T15:46:19.351+0800"
AzureRM Request: 
GET /subscriptions/"redacted"/resourceGroups/exampleRG25621-0416/providers/Microsoft.Sql/locations/eastus2/databaseAzureAsyncOperation/355517da-b363-4963-b66d-14e61bf7aa7b?api-version=2023-02-01-preview&t=... HTTP/1.1
AzureRM Response:
HTTP/2.0 200 OK

...

{"name":"355517da-b363-4963-b66d-14e61bf7aa7b","status":"Succeeded","startTime":"2024-04-16T07:46:19.22Z"}: timestamp="2024-04-16T15:47:26.376+0800"
AzureRM Request: 
GET /subscriptions/"redacted"/resourceGroups/exampleRG25621-0416/providers/Microsoft.Sql/servers/sqlserver-25621-0416/databases/mynew_testdb?api-version=2023-02-01-preview HTTP/1.1

AzureRM Response:
HTTP/2.0 404 Not Found
...

{"error":{"code":"ResourceNotFound","message":"The Resource 'Microsoft.Sql/servers/sqlserver-25621-0416/databases/mynew_testdb' under resource group 'exampleRG25621-0416' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix"}}: timestamp="2024-04-16T15:47:27.413+0800"

In step 5 (the GET on the databaseAzureAsyncOperation endpoint), Azure reports that the CreateLogicalDatabase operation started by the PUT in step 4 has succeeded. However, the GET on the database itself in step 6, about one second later, returns 404 Not Found for the database that was just re-created. I assume that this is an Azure REST API issue, so I recommend opening an issue in this repository to report it.
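
The sequence in steps 5 and 6 may reflect a short window in which the database is not yet readable even though the create operation has reported success. As an illustration only, and not the provider's actual code (which is written in Go), here is a minimal Python sketch of a poller that treats a 404 as transient and retries until a deadline. The names wait_until_visible and get_status are hypothetical, and the simulated responses stand in for real API calls.

```python
import time


def wait_until_visible(get_status, timeout_s, interval_s=1.0, sleep=time.sleep):
    """Poll until the resource stops returning 404 or the timeout expires.

    get_status is a hypothetical stand-in for the GET on the database
    resource; it must return an HTTP status code as an int.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status != 404:
            # Any non-404 status (success or a different error) is handed
            # back to the caller to interpret.
            return status
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"resource still returned 404 after {timeout_s}s")
        sleep(interval_s)


# Simulated responses: two transient 404s, then the resource becomes visible.
responses = iter([404, 404, 200])
result = wait_until_visible(lambda: next(responses), timeout_s=5, interval_s=0.01)
print(result)  # prints 200
```

With this shape, a single 404 immediately after a successful create does not fail the run; only a 404 that persists past the configured timeout does.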

Aeropher commented 5 months ago

Hi, thanks so much for looking into this. So I have a few questions to make sure I am understanding this correctly.

I notice that after the initial DELETE there are two GET requests (steps 2 and 3). I assume these are polling requests that check whether the delete succeeded. There are also two GET requests after the PUT (steps 5 and 6), which I assume check whether the database was created successfully.

So if the first GET after the PUT (step 5) reports success, one would expect the second GET (step 6) to succeed as well. Instead it fails, which points to an issue with the Azure response.

1) How come we do 2 GET requests and what do they do? I think I would need to spell this out in a bug report in the other repo.

2) Should we not keep making the second request until it is either successful or until we hit the timeout?

3) This might be a separate bug, but I think the timeouts are not being respected in the logic around these API calls. Is that the case?

4) How did you get those logs? They look so useful!
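
On question 4: Terraform's debug logging is controlled by the TF_LOG and TF_LOG_PATH environment variables, and request/response logs like the ones above are typically captured that way. Below is a minimal Python sketch that runs the replace with logging enabled. The helper names (terraform_debug_env, replace_args, run_replace) and the log file name are hypothetical. It also assumes the terraform binary is on PATH and that the run executes locally; with a remote-execution workspace, TF_LOG would likely need to be set as an environment variable on the workspace instead.

```python
import os
import subprocess


def terraform_debug_env(log_path, level="DEBUG"):
    """Copy the current environment and enable Terraform logging in it."""
    env = dict(os.environ)
    env["TF_LOG"] = level          # one of TRACE, DEBUG, INFO, WARN, ERROR
    env["TF_LOG_PATH"] = log_path  # Terraform appends its log output to this file
    return env


def replace_args(address):
    """Build the argument list for replacing a single resource."""
    return ["terraform", "apply", f"-replace={address}", "-auto-approve"]


def run_replace(address, log_path="terraform-debug.log"):
    """Run the replace with logging enabled and return the exit code.

    Assumes the terraform binary is on PATH and the run executes locally.
    """
    completed = subprocess.run(replace_args(address), env=terraform_debug_env(log_path))
    return completed.returncode


# Example (not executed here):
# run_replace("azurerm_mssql_database.mynew-db")
```

Passing the arguments as a list rather than a single shell string also sidesteps the quoting and escaping that indexed resource addresses need on the command line.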