hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

auto_pause_delay_in_minutes property has value of 0 when disabled, not -1 #27241

Open rit-bit opened 2 weeks ago

rit-bit commented 2 weeks ago

Terraform Version

1.7.5

AzureRM Provider Version

3.108.0

Affected Resource(s)/Data Source(s)

azurerm_mssql_database

Terraform Configuration Files

resource "azurerm_resource_group" "this" {
  name     = "example-rg"
  location = "UK South"
}

resource "azurerm_mssql_server" "example" {
  name                          = "example-server"
  resource_group_name           = azurerm_resource_group.this.name
  location                      = "UK South"
  version                       = "12.0"
  administrator_login           = var.administrator_login
  administrator_login_password  = var.administrator_login_password
  public_network_access_enabled = true
  minimum_tls_version           = "1.2"

  lifecycle {
    prevent_destroy = true
  }
}

resource "azurerm_mssql_database" "example" {
  name                 = "example-db"
  server_id            = azurerm_mssql_server.example.id
  collation            = "SQL_Latin1_General_CP1_CI_AS"
  license_type         = "LicenseIncluded"
  sku_name             = "HS_S_Gen5_2"
  storage_account_type = "Local"

  auto_pause_delay_in_minutes = -1

  short_term_retention_policy {
    retention_days = 7
  }

  long_term_retention_policy {
    weekly_retention  = "P2M"
    monthly_retention = "P6M"
    yearly_retention  = "P2Y"
    week_of_year      = 1
  }

  lifecycle {
    prevent_destroy = true
  }
}

Debug Output/Panic Output

n/a
(I'm happy to provide this if genuinely helpful but it seems that the problem is fairly apparent without the need for this)

Expected Behaviour

Re-running terraform plan should result in no changes being detected:

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

Actual Behaviour

Re-running terraform plan results in 1 planned change:

Terraform will perform the following actions:

  # module.database.azurerm_mssql_database.database will be updated in-place
  ~ resource "azurerm_mssql_database" "database" {
      ~ auto_pause_delay_in_minutes                                = 0 -> -1
      ...
    }

Plan: 0 to add, 1 to change, 0 to destroy.

If I run terraform apply, terraform does seem to apply this change (taking 60s+) and it doesn't error. But running terraform plan again immediately afterwards still shows the above change.


If I instead specify auto_pause_delay_in_minutes as 0, then terraform correctly sees that there are no changes to make, but if I change the database SKU and run terraform apply, I get this error:

Enabling auto-pause for a serverless database is not supported if long-term backup retention is enabled.

Steps to Reproduce

  1. Run terraform apply to create the example resource group, server, and database
  2. Run terraform apply again

sinbai commented 2 weeks ago

Hi @rit-bit, thanks for raising this issue. First, I would like to clarify that the auto_pause_delay_in_minutes property can only be set for Serverless databases (see the documentation for auto_pause_delay_in_minutes). Could you please confirm that it is being set on a Serverless database?

Also, I confirmed that when I disable auto-pause, the value of autoPauseDelay shown in the Azure Portal is -1. Could you please double-check? Otherwise, could you explain why you think it should be 0 instead of -1?

rit-bit commented 2 weeks ago

Hi @sinbai,

Yes, our dev database is serverless: SKU HS_S_Gen5_2 (serverless Hyperscale, Gen5 hardware, max 2 vCores). I now realise that Hyperscale is relevant here, since the Compute & storage tab in the Azure portal says "Auto-pause is not currently supported in the Hyperscale tier", and indeed the property is not present in the Resource JSON:

[image: screenshot of the database's Resource JSON]

In that case, the issue is actually that the azurerm provider requires the auto_pause_delay_in_minutes property to be set even when the SKU is Hyperscale, which doesn't support auto-pausing. If I try to change the SKU of my database using terraform without specifying a value for auto_pause_delay_in_minutes, I get the same error mentioned above:

Enabling auto-pause for a serverless database is not supported if long-term backup retention is enabled.

> could you please provide explanation why you think it should be 0 instead of -1?

It's not that I think it should be 0 instead of -1; I just tried that because that's the value terraform seemed to be reading (though I now realise this is because the value doesn't exist; see the resource JSON screenshot above).

Essentially, I don't want terraform to perform any actions changing the value of auto_pause_delay_in_minutes on my database every time I run terraform plan or terraform apply, and I don't want terraform to error. There is no value of auto_pause_delay_in_minutes (including omitting it entirely) that achieves this:

  1. not setting auto_pause_delay_in_minutes - this results in the above terraform error
  2. set auto_pause_delay_in_minutes = 0 - this results in the same terraform error as (1)
  3. set auto_pause_delay_in_minutes = -1 - this results in spurious changes, as terraform seems to think the existing auto_pause_delay_in_minutes value is 0

I think this is a bug in the terraform provider - for a Hyperscale database, it should not error when auto_pause_delay_in_minutes is not set.

sinbai commented 1 week ago

Hi @rit-bit, thanks for your explanation. The value of auto_pause_delay_in_minutes is 0 in the state file because it is an int variable, and the default value of an int is 0 when nothing is assigned to it, even though the property is not specified in the config. Also, could you provide the complete tf configs (including variable values) and repro steps, to help reproduce and troubleshoot the following error?
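
To illustrate the point about the int default, here is a minimal Go sketch. The type and function names are hypothetical and are not taken from the provider's source; the only assumption is that the value is held as a plain int, as described above. A pointer field is shown for contrast, since it can tell "unset" apart from 0.

```go
package main

import "fmt"

// stateWithInt is a hypothetical model that stores the delay as a plain int.
// An int that is never assigned holds Go's zero value, which is 0.
type stateWithInt struct {
	AutoPauseDelayInMinutes int
}

// stateWithPointer is a hypothetical alternative whose field can be nil,
// so "never set" and "set to 0" remain distinguishable.
type stateWithPointer struct {
	AutoPauseDelayInMinutes *int
}

// describe renders a pointer value, reporting "unset" for nil.
func describe(p *int) string {
	if p == nil {
		return "unset"
	}
	return fmt.Sprintf("%d", *p)
}

func main() {
	var a stateWithInt     // nothing assigned
	var b stateWithPointer // nothing assigned

	fmt.Println(a.AutoPauseDelayInMinutes)           // zero value of int
	fmt.Println(describe(b.AutoPauseDelayInMinutes)) // nil pointer
}
```

With the plain int, a property that the API never returned is indistinguishable from one that is genuinely 0, which matches the 0 that terraform reports for the existing value.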

Enabling auto-pause for a serverless database is not supported if long-term backup retention is enabled.

rit-bit commented 1 week ago

Hi @sinbai,

In the process of creating config to reproduce the issue, I realised that the issue has been occurring on an already-provisioned database, so I thought I'd check whether the issue is also present when spinning up a new database.

It turns out that the issue is not present if you spin up a Hyperscale serverless database, but the issue does occur if you spin up a Hyperscale provisioned database and then transition it to the serverless tier.

The steps to reproduce are:

  1. Run terraform apply
  2. Change the sku_name from "HS_Gen5_2" to "HS_S_Gen5_2"
  3. Run terraform apply again. You should encounter the following error:
    Enabling auto-pause for a serverless database is not supported if long-term backup retention is enabled.
  4. Uncomment the auto_pause_delay_in_minutes property and run terraform apply again. You should find that the apply is successful.
  5. Run terraform apply again and you should see terraform wants to update the auto_pause_delay_in_minutes property from 0 -> -1. This is the spurious change that shouldn't need to happen every time terraform apply is run.

Using this config:

resource "azurerm_resource_group" "this" {
  name     = "example-rg"
  location = "UK South"
}

resource "azurerm_mssql_server" "example" {
  name                          = "example-server"
  resource_group_name           = azurerm_resource_group.this.name
  location                      = "UK South"
  version                       = "12.0"
  administrator_login           = var.administrator_login
  administrator_login_password  = var.administrator_login_password
  public_network_access_enabled = true
  minimum_tls_version           = "1.2"

  lifecycle {
    prevent_destroy = true
  }
}

resource "azurerm_mssql_database" "example" {
  name         = "${var.prefix}-example"
  server_id    = azurerm_mssql_server.example.id
  collation    = "SQL_Latin1_General_CP1_CI_AS"
  license_type = "LicenseIncluded"

  sku_name                    = "HS_Gen5_2"  # Change this to "HS_S_Gen5_2" in step 2
  storage_account_type        = "Local"
# auto_pause_delay_in_minutes = -1  # Uncomment this line in step 4

  short_term_retention_policy {
    retention_days = 7
  }

  long_term_retention_policy {
    weekly_retention  = "P2M"
    monthly_retention = "P6M"
    yearly_retention  = "P2Y"
    week_of_year      = 1
  }
}

sinbai commented 1 week ago

Hi @rit-bit thanks for your reply. I reproduced the mentioned error.

First, I would like to explain that Terraform manages Azure resources through the Azure REST API. Per the Terraform log below, the error is returned by the Azure REST API, not by Terraform, even though Terraform didn't send any value for auto_pause_delay_in_minutes in step 3.

PATCH /subscriptions/.../resourceGroups/example-rg-2741/providers/Microsoft.Sql/servers/example-server-27241/databases/example-example27241?api-version=2023-02-01-preview HTTP/1.1

{"properties":{},"sku":{"name":"HS_S_Gen5_2"}}: timestamp="2024-09-03T14:15:05.415+0800"

{"name":"618d6869-cfa5-4126-bca0-e0ed016dca4f","status":"Failed","startTime":"2024-09-03T06:15:08.257Z","error":{"code":"UpdateToServerlessIfLtrIsNotDisabled","message":"Enabling auto-pause for a serverless database is not supported if long-term backup retention is enabled."}}: timestamp="2024-09-03T14:15:23.803+0800"
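
As a rough illustration of reading that failure, the Go sketch below decodes the status body from the log above and checks its error code. The struct and function names are hypothetical; only the JSON body is taken from the log.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sampleBody is the operation status body copied from the log above.
const sampleBody = `{"name":"618d6869-cfa5-4126-bca0-e0ed016dca4f","status":"Failed","startTime":"2024-09-03T06:15:08.257Z","error":{"code":"UpdateToServerlessIfLtrIsNotDisabled","message":"Enabling auto-pause for a serverless database is not supported if long-term backup retention is enabled."}}`

// operationStatus is a hypothetical, trimmed-down shape of the status body.
type operationStatus struct {
	Name   string `json:"name"`
	Status string `json:"status"`
	Error  *struct {
		Code    string `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// parseOperationStatus decodes a status body into operationStatus.
func parseOperationStatus(body []byte) (operationStatus, error) {
	var s operationStatus
	err := json.Unmarshal(body, &s)
	return s, err
}

// failedWith reports whether the operation failed with the given service error code.
func failedWith(s operationStatus, code string) bool {
	return s.Status == "Failed" && s.Error != nil && s.Error.Code == code
}

func main() {
	s, err := parseOperationStatus([]byte(sampleBody))
	if err != nil {
		panic(err)
	}
	fmt.Println(failedWith(s, "UpdateToServerlessIfLtrIsNotDisabled"))
	if s.Error != nil {
		fmt.Println(s.Error.Message)
	}
}
```

The error code and message both come from the service response, which is consistent with the request body containing only the new SKU and an empty properties object.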

The error in step 3 can be resolved by disabling auto-pause, i.e. setting auto_pause_delay_in_minutes = -1, as done in step 4 above.

The symptom in step 5 occurs because the Azure REST API doesn't return the value of auto_pause_delay_in_minutes that was set in step 4. In this case, on the Terraform side, we recommend setting ignore_changes as shown below to avoid this situation.

  lifecycle {
    ignore_changes = [auto_pause_delay_in_minutes]
  }
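
The step 5 symptom can be sketched in Go as follows. This is a simplified model under stated assumptions, not the provider's actual code: the response shape, stateValue, and planDiff are hypothetical names, and the response body is reduced to an empty properties object to stand in for an API response that omits autoPauseDelay.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// databaseResponse is a hypothetical, trimmed-down shape of a database GET response.
type databaseResponse struct {
	Properties struct {
		AutoPauseDelay *int `json:"autoPauseDelay"`
	} `json:"properties"`
}

// stateValue returns the value that would land in state:
// the API value when present, otherwise the int zero value 0.
func stateValue(body []byte) (int, error) {
	var r databaseResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, err
	}
	if r.Properties.AutoPauseDelay == nil {
		return 0, nil
	}
	return *r.Properties.AutoPauseDelay, nil
}

// planDiff renders a plan-style diff line, or "" when state matches config.
func planDiff(state, config int) string {
	if state == config {
		return ""
	}
	return fmt.Sprintf("auto_pause_delay_in_minutes = %d -> %d", state, config)
}

func main() {
	const config = -1 // value set in the Terraform configuration

	// Each refresh reads a response without autoPauseDelay, so the diff never clears.
	for run := 1; run <= 2; run++ {
		state, err := stateValue([]byte(`{"properties":{}}`))
		if err != nil {
			panic(err)
		}
		fmt.Printf("plan %d: %s\n", run, planDiff(state, config))
	}
}
```

Because the refreshed value is always 0 while the configuration says -1, every plan in this model reports the same in-place update, which is the behaviour ignore_changes suppresses.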

Thank you again for raising this issue.

rit-bit commented 1 week ago

Hi @sinbai

I'm pleased you were able to reproduce the error.

Regarding your suggestion to use the ignore_changes lifecycle property to prevent the symptom from step 5: this is something I have tried and it does work for cases where you are not trying to change anything on the database - terraform correctly identifies that it shouldn't change any properties on the database resource.

If however I am using the ignore_changes lifecycle property as above and I want to change the SKU for example from "HS_S_Gen5_2" to "HS_S_Gen5_4", then terraform encounters the same error as before:

Enabling auto-pause for a serverless database is not supported if long-term backup retention is enabled.

If I remove the ignore_changes lifecycle property (and keep the line auto_pause_delay_in_minutes = -1) then the apply succeeds, but I have to re-add the ignore_changes lifecycle property afterwards to prevent the symptom from step 5 surfacing again.

It's undesirable and very impractical to have to add or remove the ignore_changes lifecycle property whenever a change is to be made to the database resource.

sinbai commented 1 week ago

Yes, that is right. Another option is to manually change auto_pause_delay_in_minutes in the state file to -1 after step 4, or to open an issue in the Azure REST API repo requesting that the API return auto_pause_delay_in_minutes in the response when it is specified in the request. Apart from these, as far as I know, there is no better solution.