hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Azurerm | CosmosDB | Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded #19455

Open andigwandi opened 1 year ago

andigwandi commented 1 year ago

Terraform Version

1.2.9

AzureRM Provider Version

3.22.0

Affected Resource(s)/Data Source(s)

azurerm_cosmosdb_sql_container

Terraform Configuration Files

module "cosmos_db_container_master_data" {
  source     = "../shared/cosmos/container"
  depends_on = [module.cosmos_db]

  env_config = local.env_config
  container_config = {
    container_name         = "MasterData"
    account_name           = module.cosmos_db.db_config.account_name
    db_id                  = module.cosmos_db.db_config.db_id
    db_name                = module.cosmos_db.db_config.db_name
    connection_string      = module.cosmos_db.db_config.connection_strings[0]
    primary_key            = module.cosmos_db.db_config.primary_key
    read_endpoint          = module.cosmos_db.db_config.read_endpoints[0]
    write_endpoint         = module.cosmos_db.db_config.write_endpoints[0]
    partition_key_version  = local.cosmos_container_master_data_config.partition_key_version
    throughput             = local.cosmos_container_master_data_config.throughput
    default_ttl            = local.cosmos_container_master_data_config.default_ttl
    analytical_storage_ttl = local.cosmos_container_master_data_config.analytical_storage_ttl
    partition_key_path     = local.cosmos_container_master_data_config.partition_key_path
    autoscale_settings     = local.cosmos_container_master_data_config.autoscale_settings
  }

  indexing_policy = [{
    excluded_path = [{
      path = "/*"
    }]
    included_path = [{
      path = "/type/?"
    }]
    indexing_mode = "consistent"
  }]
}

Debug Output/Panic Output

I am seeing different errors related to 'context deadline' while re-running the same pipeline

╷
│ Error: reading CosmosDB Account "azr-ps2-cdb-dev10-r1" (Resource Group "azr-ps2-rg-01-r1"): documentdb.DatabaseAccountsClient#Get: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded
│ 
│   with module.ps2.module.cosmos_db_container_master_data.azurerm_cosmosdb_sql_container.container,
│   on ../modules/shared/cosmos/container/main.tf line 7, in resource "azurerm_cosmosdb_sql_container" "container":
│    7: resource "azurerm_cosmosdb_sql_container" "container" {

###################################

╷
│ Error: reading Throughput on Cosmos SQL Container PendingSalesTransactions (Account: "azr-ps2-cdb-dev10-r1", Database: "ps2") ID: documentdb.SQLResourcesClient#GetSQLContainerThroughput: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded
│ 
│   with module.ps2.module.cosmos_db_container_pending_sales_transactions.azurerm_cosmosdb_sql_container.container,
│   on ../modules/shared/cosmos/container/main.tf line 7, in resource "azurerm_cosmosdb_sql_container" "container":
│    7: resource "azurerm_cosmosdb_sql_container" "container" {

Expected Behaviour

Terraform Plan should generate the changes in the resources without any issue

Actual Behaviour

Stage: Terraform Plan

I am seeing different 'context deadline' errors when re-running the same pipeline. Both errors involve Cosmos DB, and after 2-3 retries the pipeline proceeds.

│ Error: reading CosmosDB Account "azr-ps2-cdb-dev10-r1" (Resource Group "azr-ps2-rg-01-r1"): documentdb.DatabaseAccountsClient#Get: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded
│ 
│   with module.ps2.module.cosmos_db_container_master_data.azurerm_cosmosdb_sql_container.container,
│   on ../modules/shared/cosmos/container/main.tf line 7, in resource "azurerm_cosmosdb_sql_container" "container":
│    7: resource "azurerm_cosmosdb_sql_container" "container" {
│ Error: reading Throughput on Cosmos SQL Container PendingSalesTransactions (Account: "azr-ps2-cdb-dev10-r1", Database: "ps2") ID: documentdb.SQLResourcesClient#GetSQLContainerThroughput: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded
│ 
│   with module.ps2.module.cosmos_db_container_pending_sales_transactions.azurerm_cosmosdb_sql_container.container,
│   on ../modules/shared/cosmos/container/main.tf line 7, in resource "azurerm_cosmosdb_sql_container" "container":
│    7: resource "azurerm_cosmosdb_sql_container" "container" {

Steps to Reproduce

No response

Important Factoids

No response

References

No response

sinbai commented 1 year ago

@andigwandi thanks for opening this issue. Could you provide the raw Terraform config and repro steps? The module call alone is not enough for reproduction and troubleshooting.

Besides, since timeouts can be defined in the Terraform config as follows, could you extend the read timeout (e.g. to 20m) to see if that fixes the issue?

resource "azurerm_cosmosdb_sql_container" "container" {
  ...
  ...
  ...

  timeouts {
    read = "20m"
  }
}
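
For illustration, a fuller sketch of where that block would sit. Because the failing resource is declared inside the module (../modules/shared/cosmos/container/main.tf in the error output), the timeouts block would need to go into the module's resource, not into the module call. The names below are taken from the error messages in this issue; the partition key path is an assumed placeholder, and the argument names follow the 3.x provider schema. As far as I know the read timeout on this resource defaults to 5 minutes, so a longer value can only help if the reads are slow rather than failing outright.

```hcl
# Sketch only, not the reporter's actual module code.
resource "azurerm_cosmosdb_sql_container" "container" {
  name                = "MasterData"
  resource_group_name = "azr-ps2-rg-01-r1"
  account_name        = "azr-ps2-cdb-dev10-r1"
  database_name       = "ps2"
  partition_key_path  = "/id" # assumed placeholder; use the real partition key path
  throughput          = 400

  timeouts {
    # Covers the read that Terraform performs when refreshing state during plan.
    read = "20m"
  }
}
```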

sam-h-bean commented 1 year ago

This has been happening to me as well. There definitely seems to have been some regression with refreshing the state of Cosmos infrastructure via Terraform.

I've been seeing errors like

Error: Error: [ERROR] Unable to List connection strings for CosmosDB Account my-account: documentdb.DatabaseAccountsClient#ListConnectionStrings: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded

andigwandi commented 1 year ago

Here is the configuration for the example given in the issue:

cosmos_container_config_master_data = {
  analytical_storage_ttl = -1
  autoscale_enabled      = false
  autoscale_settings     = [{ max_throughput = 1000 }]
  throughput             = 400
}

Other settings, such as name and db_name, can be hardcoded.

The error occurs when I run terraform plan to generate the changes.

andigwandi commented 1 year ago

I also received a similar error in one of my pipelines:

Error: [ERROR] Unable to List read-only keys for CosmosDB Account my-account: documentdb.DatabaseAccountsClient#ListReadOnlyKeys: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded

philspencer-owd commented 4 months ago

I am oddly getting this with azurerm_security_center_contact and the error: retrieving Contact: (Security Contact Name "Platform Team"): security.ContactsClient#Get: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded

prietolu commented 4 months ago

On Monday, June 24th, I started getting the same error with azurerm_security_center_contact when running the same Terraform script against different Azure subscriptions that host different environments (PROD, PREPROD, etc.):

Error: Reading Security Center Contact: security.ContactsClient#Get: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded

However, as of today, Wednesday the 26th, it seems to be working fine again, and we no longer receive that error in any environment.

It looks like there was a temporary problem retrieving information for azurerm_security_center_contact resources. @philspencer-owd, can you confirm whether you're still seeing this error today?

philspencer-owd commented 4 months ago

@prietolu Confirmed this is now working again for all our environments as well!

usr122 commented 2 months ago

We are now frequently getting a similar error:

"Error: making Read request on AzureRM Application Insights Billing Feature '': insights.ComponentCurrentBillingFeaturesClient#Get: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded"

We've seen it happen on Azure SQL database and app service resources.

@andigwandi - did increasing the read timeout get you past the error you were seeing?

subria11 commented 2 months ago

@usr122 We had the same problem from 22/08/2024 until 26/08/2024; after that it resolved on its own. So it was a temporary issue on the Azure side.

pierluca commented 2 months ago

We're still seeing this issue in Azure. Has anybody found a workaround?

jaredbrogan commented 2 months ago

Seeing this as a growing trend as well. Someone needs to look into the larger issue going on here.

subria11 commented 2 months ago

Issue created here: https://github.com/hashicorp/terraform-provider-azurerm/issues/27248

gctrevino commented 2 months ago

This started happening for us on Sept 5th, 2024...

Error: making Read request on AzureRM Application Insights Billing Feature 'appi-projectname-env': insights.ComponentCurrentBillingFeaturesClient#Get: Failure sending request: StatusCode=504 -- Original Error: context deadline exceeded