hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Azure KeyVault Error: Provider produced inconsistent result after apply #11059

Open GregBillings opened 3 years ago

GregBillings commented 3 years ago

Community Note

Terraform (and AzureRM Provider) Version

Terraform v0.14.8 AzureRM: terraform-provider-azurerm_v2.52.0_x5

Affected Resource(s)

Terraform Configuration Files

resource "azurerm_key_vault_secret" "mysecretvalue" {
  name         = "secretvaluename"
  value        = var.some_value_from_var
  key_vault_id = data.terraform_remote_state.remote_terraform_cloud_state.outputs.key_vault_id
}

Debug Output

Error: Provider produced inconsistent result after apply

When applying changes to azurerm_key_vault_secret.mysecretvalue, provider "registry.terraform.io/hashicorp/azurerm" produced an unexpected new value: Root resource was present, but now absent.

This is a bug in the provider, which should be reported in the provider's own issue tracker.

Expected Behaviour

secret added to the keyvault

Actual Behaviour

The keyvault secret was added to the keyvault with the correct value, but the terraform apply failed with the error above. On re-running, a new error reports that the secret already exists but isn't tracked in the Terraform state.

Steps to Reproduce

  1. Create a terraform script that creates an azure keyvault and then outputs the ID as an output variable
  2. Create another terraform script with a different remote state that pulls in the first remote state via:

    data "terraform_remote_state" "remotestate" {
      backend = "remote"

      config = {
        organization = "my-org"
        workspaces = {
          name = "first-remote-state"
        }
      }
    }
  3. Attempt to create a new keyvault secret using the id from the output of the first remote state:

    resource "azurerm_key_vault_secret" "mysecretvalue" {
      name         = "MySecretValue"
      value        = var.some_value_from_var
      key_vault_id = data.terraform_remote_state.remotestate.outputs.key_vault_id
    }

References

KillianW commented 3 years ago

This Issue is plaguing my pipelines at the moment. Is it possibly related to a similar underlying caching issue as #10602 ?

dkirrane commented 3 years ago

I had raised this previously here https://github.com/terraform-providers/terraform-provider-azurerm/issues/10227. Still seeing it with Terraform v0.14.9 & hashicorp/azurerm 2.56.0

If I re-run Terraform plan/apply I hit:

Error: A resource with the ID "https://my-kv.vault.azure.net/secrets/my-secret/12345b12345cd4cd18c12345edb3c3cd" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_key_vault_secret" for more information.
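
For readers who hit this follow-up error, one possible way to bring the already-created secret under management is an import. The following is a minimal sketch, assuming Terraform 1.5 or later (newer than the versions discussed in this thread), which supports `import` blocks. The resource name `example` and both variables are hypothetical; the ID is the one from the error message above. A later comment in this thread reports import failing in a similar situation, so this may not help while the underlying bug is present.

```hcl
# Assumption: Terraform >= 1.5, which added import blocks.
# The resource name "example" and both variables are hypothetical.
variable "key_vault_id" {
  type = string
}

variable "secret_value" {
  type      = string
  sensitive = true
}

# Bind the existing secret (ID copied from the error message) to the resource below.
import {
  to = azurerm_key_vault_secret.example
  id = "https://my-kv.vault.azure.net/secrets/my-secret/12345b12345cd4cd18c12345edb3c3cd"
}

resource "azurerm_key_vault_secret" "example" {
  name         = "my-secret"
  value        = var.secret_value
  key_vault_id = var.key_vault_id
}
```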

kousourakis commented 3 years ago

Have the same issue with azurerm v2.71 terraform 1.0.4

deleted the remote state and recreated from scratch and it worked fine. I previously upgraded from 2.70 -> 2.71 and terraform 1.0.1 -> 1.0.4

sinbai commented 2 years ago

@GregBillings I am investigating this issue, but I can't repro it by following the steps below. Could you confirm whether these steps repro the issue on your side? I would also like to know whether you performed any other operations beforehand. Any detailed information is greatly appreciated.

Steps to Reproduce

  1. Create two new terraform cloud workspaces(ReproBug and ReproBug1).

  2. Creates an azure keyvault on ReproBug workspace and then outputs the ID as an output variable via:

    terraform {
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "2.52.0"
        }
      }

      backend "remote" {
        organization = "my-org-elena"
        workspaces {
          name = "ReproBug"
        }
      }
    }
    
    provider "azurerm" {
      features {}
    }
    
    resource "azurerm_resource_group" "test" {
      name     = "myTFResourceGroup"
      location = "westus2"
    }
    
    resource "azurerm_key_vault" "test" {
      name                       = "acctestkv-elena05"
      location                   = azurerm_resource_group.test.location
      resource_group_name        = azurerm_resource_group.test.name
      tenant_id                  = "XXXXXXXXX"
      sku_name                   = "standard"
      soft_delete_retention_days = 7
    
      access_policy {
        tenant_id = "XXXXXXX"
        object_id = "XXXXXXX"
    
        key_permissions = [
          "Get",
          "Delete",
        ]
    
        secret_permissions = [
          "Get",
          "Delete",
          "List",
          "Purge",
          "Recover",
          "Set",
        ]
      }
    
      tags = {
        environment = "Production"
      }
    }
    
    output "key_vault_id" {
      value = azurerm_key_vault.test.id
    }

  3. Create a new keyvault secret on the ReproBug1 workspace using the id from the output of the first remote state via:

    terraform {
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "2.52.0"
        }
      }

      backend "remote" {
        organization = "my-org-elena"
        workspaces {
          name = "ReproBug1"
        }
      }
    }
    
    provider "azurerm" {
      features {}
    }
    
    data "terraform_remote_state" "remotestate" {
      backend = "remote"
    
      config = {
        organization = "my-org-elena"
        workspaces = {
          name = "ReproBug"
        }
      }
    }
    
    resource "azurerm_key_vault_secret" "mysecretvalue" {
      name         = "MySecretValue2"
      value        = "test"
      key_vault_id = data.terraform_remote_state.remotestate.outputs.key_vault_id
    }

    Note: Remote state is managed by Terraform Cloud; the issue can't be reproduced whether Execution Mode is remote or local.

My Terraform (and AzureRM Provider) Version

sinbai commented 2 years ago

Have the same issue with azurerm v2.71 terraform 1.0.4

deleted the remote state and recreated from scratch and it worked fine. I previously upgraded from 2.70 -> 2.71 and terraform 1.0.1 -> 1.0.4

@elthanor I can't repro this issue; my repro steps are below. Could you provide more details to help me repro it?

  1. Create two new Terraform Cloud workspaces (ReproBug1 and ReproBug2).
  2. Create a keyvault with azurerm v2.70 and terraform 1.0.1 on the ReproBug1 workspace.
  3. Update the azurerm provider to v2.71 and terraform to 1.0.4, then add a new secret to the existing keyvault on the ReproBug2 workspace.

Note: Remote state is managed by Terraform Cloud; the issue can't be reproduced whether Execution Mode is remote or local.

mybayern1974 commented 2 years ago

I would like to echo sinbai's finding: I cannot repro this issue either. I used TF v1.0.5 and AzureRM provider v2.74 (the latest Azure provider version as I type this). I created a KV + KVSec first, then used that KV as a data source and created another KVSec. All of this was done by TF and all of it worked without any command line error msg. I admit I only tried with local state rather than remote state.

Noticing the 20+ thumbs up on this issue, I do believe people ran into the problem described here. To help my and sinbai's troubleshooting, can anyone here provide more context in the form of step-by-step repro instructions?

In addition, it would be helpful if below info can be provided:

  1. Does this issue only repro when using remote state rather than local state?
  2. Does this issue repro consistently or only intermittently?
  3. For people who ran into this issue, do you use TF to manage all things (KV, KVSec, etc.) rather than using any other client tools (portal, CLI, etc) to co-manage resources?
  4. For people who ran into this issue, is it possible there were someone-else/some-other-client-tooling manipulating the same resource (KV) at the same time when you used TF to manage that? If so, might this symptom be a result of conflict-manipulation-against-the-same-KV?

MaksymChornyi commented 2 years ago

@mybayern1974 Terraform 1.0.0 azurerm 2.71.0

  1. I use backend "azurerm"
  2. The problem occurs in ~70% of attempts
  3. KeyVault has been created separately with the local state and I don't have this issue with the local state
  4. We don't use KeyVault at the same moment

futureviperowner commented 2 years ago

When I ran into this issue, I was using a data resource to look up information about another key vault that was created outside of my terraform scripts. I was using the resource ID returned by the data source to create additional azurerm_key_vault_secret resources in that vault.
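
The setup described here can be sketched roughly as follows. This is an illustration only, not the commenter's actual configuration: the vault name, resource group, and secret values are hypothetical placeholders, and the vault is assumed to exist already, created outside this configuration.

```hcl
provider "azurerm" {
  features {}
}

# Look up a key vault that was created outside this configuration.
# "existing-kv" and "existing-rg" are hypothetical names.
data "azurerm_key_vault" "existing" {
  name                = "existing-kv"
  resource_group_name = "existing-rg"
}

# Create an additional secret in that vault, using the ID returned by the data source.
resource "azurerm_key_vault_secret" "extra" {
  name         = "extra-secret"
  value        = "example-value"
  key_vault_id = data.azurerm_key_vault.existing.id
}
```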

mybayern1974 commented 2 years ago

@viper4u , I still could not repro after using remote state (backend "azurerm") and the TF/AzureRM versions you specified 😕. Seeing that you mentioned a 70% repro rate, I ran things 10 times and all of them succeeded.

Below are my repro steps with .tf config

  1. Create a resource group on portal
  2. Create a KV by TF with using backend=azurerm and TF ver=1.0.0 and AzureRM ver=2.71.0. Below is my config

    ===kv.tf===

    terraform {
      backend "azurerm" {
        resource_group_name  = "..."
        storage_account_name = "..."
        container_name       = "..."
        key                  = "terraform1.tfstate"
      }
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "2.71"
        }
      }
    }

    provider "azurerm" {
      features {}
    }

    data "azurerm_client_config" "current" {}

    data "azurerm_resource_group" "test" {
      name = "..."
    }

    resource "azurerm_key_vault" "test" {
      name                = "..."
      location            = data.azurerm_resource_group.test.location
      resource_group_name = data.azurerm_resource_group.test.name
      sku_name            = "standard"
      tenant_id           = data.azurerm_client_config.current.tenant_id

      access_policy {
        tenant_id = data.azurerm_client_config.current.tenant_id
        object_id = data.azurerm_client_config.current.object_id

        key_permissions = [
          "create",
          "get",
          "list",
          "purge"
        ]

        secret_permissions = [
          "set",
          "get",
          "delete",
          "purge",
          "recover",
          "list"
        ]
      }
    }

    ===outputs.tf===

    output "key_vault_id" {
      value = azurerm_key_vault.test.id
    }

  4. Execute `terraform apply` => A KV got provisioned
  5. Create a KVSec by TF. Below is my .tf config

    ===kvsec.tf===

    data "terraform_remote_state" "test" {
      backend = "azurerm"
      config = {
        storage_account_name = "..."
        container_name       = "..."
        resource_group_name  = "..."
        key                  = "terraform1.tfstate"
      }
    }

    resource "azurerm_key_vault_secret" "test" {
      name         = "..."
      value        = "..."
      key_vault_id = data.terraform_remote_state.test.outputs.key_vault_id
    }

  6. Execute `terraform apply` => A KVSec got provisioned underneath the KV
  7. Append below section to the `kvsec.tf`

    resource "azurerm_key_vault_secret" "test2" {
      name         = "..."
      value        = "..."
      key_vault_id = data.terraform_remote_state.test.outputs.key_vault_id
    }

  8. Execute `terraform apply` => Another KVSec got provisioned underneath the KV
  10. Repeat step 7 - Step 8 for say 10 times => A bunch of KVSec got provisioned underneath the KV w/o seeing tf command line errors.

@doug-papenthien-by , I also did similar things using local state, following your "_I was using the resource ID returned by the data source to create additional azurerm_key_vault_secret_", but I still could not repro.

@ all here, do you see any usage/config difference between yours and mine above?

sinbai commented 2 years ago

@mybayern1974 Terraform 1.0.0 azurerm 2.71.0

  1. I use backend "azurerm"
  2. The problem occurs in ~70% of attempts
  3. KeyVault has been created separately with the local state and I don't have this issue with the local state
  4. We don't use KeyVault at the same moment

@viper4u Thank you for your reply. Below are my repro steps. The issue can't be reproduced with them. Could you follow the steps below to reproduce it? If not, could you provide step-by-step instructions to help me reproduce this issue?

My Terraform (and AzureRM Provider) Version Terraform 1.0.0 azurerm 2.71.0

Steps

  1. Create a resource group named myTestResourceGroup on Azure portal.
  2. Create a blob storage account named teststorageaccount on the Azure portal. (Stores the state as a Blob with the given Key within the Blob Container within the Blob Storage Account.)
  3. Create a blob container named blobcontainer on the Azure portal.
  4. Create a folder named ReproBug on the local machine and add a file named "step1.tf" in this folder.
  5. Add the following tfconfig in step1.tf to create a azurerm_key_vault(Authenticate using a SAS Token associated with the Storage Account) :

    
    terraform {
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "2.71.0"
        }
      }

      backend "azurerm" {
        storage_account_name = "teststorageaccount"
        container_name       = "blobcontainer"
        key                  = "prod.terraform.tfstate1"

        sas_token = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      }
    }

    provider "azurerm" {
      features {}
    }

    resource "azurerm_key_vault" "test" {
      name                       = "acctestkv-elena0901"
      location                   = "westus2"
      resource_group_name        = "myTestResourceGroup"
      tenant_id                  = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      sku_name                   = "standard"
      soft_delete_retention_days = 7

      access_policy {
        tenant_id = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
        object_id = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

        key_permissions = [
          "Get",
          "Delete",
        ]

        secret_permissions = [
          "Get",
          "Delete",
          "List",
          "Purge",
          "Recover",
          "Set",
        ]
      }

      tags = {
        environment = "Production"
      }
    }

    output "key_vault_id" {
      value = azurerm_key_vault.test.id
    }

  6. Run `terraform init`, `terraform plan`, `terraform apply` to create azurerm_key_vault.
  7. Add a file named "step2.tf" in `ReproBug` folder.
  8. Add the following tfconfig in `step2.tf` to create a azurerm_key_vault_secret (Authenticate using a SAS Token associated with the Storage Account):

    terraform {
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "2.71.0"
        }
      }

      backend "azurerm" {
        storage_account_name = "teststorageaccount"
        container_name       = "blobcontainer"
        key                  = "prod.terraform.tfstate2"

        sas_token = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      }
    }

    provider "azurerm" {
      features {}
    }

    data "terraform_remote_state" "remotestate" {
      backend = "azurerm"

      config = {
        storage_account_name = "teststorageaccount"
        container_name       = "blobcontainer"
        key                  = "prod.terraform.tfstate1"

        sas_token = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      }
    }

    resource "azurerm_key_vault_secret" "mysecretvalue" {
      name         = "MySecretValue-0901"
      value        = "test"
      key_vault_id = data.terraform_remote_state.remotestate.outputs.key_vault_id
    }

  9. After successfully creating azurerm_key_vault, run `terraform init`, `terraform plan`, `terraform apply` to create azurerm_key_vault_secret => azurerm_key_vault_secret was created, no command line error msg
  10. Continue to add ten azurerm_key_vault_secrets with different `key` and different azurerm_key_vault_secret `name` to the azurerm_key_vault which was created in step 6 => All of them worked well, there is no command line error msg in my test

MaksymChornyi commented 2 years ago

@sinbai @mybayern1974 Sorry for the late answer I tried to reproduce it with a simple script but I haven't faced the problem. I have a failed attempt with a trace log level. Where can I send it?

sinbai commented 2 years ago

@sinbai @mybayern1974 Sorry for the late answer I tried to reproduce it with a simple script but I haven't faced the problem. I have a failed attempt with a trace log level. Where can I send it?

@viper4u Mail log to sinbai@hotmail.com, thank you. I will try to see if there are any findings in the log, but I think step-by-step-repro-this-issue is the key to investigating this issue. Thank you very much.

MaksymChornyi commented 2 years ago

@sinbai I've reproduced it with a simple script. I have an Azure KeyVault and the issue is reproduced each time with that one. How can I help you with it?

sinbai commented 2 years ago

@sinbai I've reproduced it with a simple script. I have an Azure KeyVault and the issue is reproduced each time with that one. How can I help you with it?

Can I access that Azure KeyVault? And could you share your script (contains the access token) with me to reproduce it on my side?

MaksymChornyi commented 2 years ago

What did you mean by "access token"? I use Service Principal auth

MaksymChornyi commented 2 years ago

I've sent all details to your mail

sinbai commented 2 years ago

I've sent all details to your mail

Thank you. Per details provided, I cannot access the Azure KeyVault due to the access permissions.

MaksymChornyi commented 2 years ago

I've reproduced the issue with a new keyvault. Context:

Steps to reproduce:

  1. I am from East Europe
  2. Create a new KeyVault in East US
  3. Run simplest terraform script

    terraform {
      backend "local" {
      }

      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "~> 2.75.0"
        }
      }
    }

    provider "azurerm" {
      features {}
    }

    resource "azurerm_key_vault_secret" "test_key" {
      name         = "test"
      value        = "test"
      key_vault_id = "/subscriptions/<subscription_id>/resourceGroups/<rg-name>/providers/Microsoft.KeyVault/vaults/<vault_name>"
    }

  4. Secret has been created
  5. Terraform output is "Error: Provider produced inconsistent result after apply"

sinbai commented 2 years ago

@viper4u Per the log provided, I found that the Azure KeyVault cannot be found after setting the secret. Could you help confirm whether the keyvault exists after reproducing this issue?

It is best to confirm through the following link: https://docs.microsoft.com/en-us/rest/api/resources/resources/list#code-try-0

Notes:

  1. Update the API version with 2020-06-01.
  2. Add the optional parameter $filter with value resourceType eq 'Microsoft.KeyVault/vaults' and name eq '<kv name>'.
  3. Try it several times to see if you can find it every time.

In addition, could you also check the total resources count of your subscription?

MaksymChornyi commented 2 years ago

Results

  1. When I use the filter with "and name eq ''", then I get { "value": [], "nextLink": "<some_url>" }
  2. When I use the filter without "and name eq ''", then I get a list of resources with my KeyVault
  3. When I try another KeyVault in another subscription, then I get a result without nextLink: { "value": [ { "id": "/subscriptions/<my_subscription>/resourceGroups/<rg_name>/providers/Microsoft.KeyVault/vaults/<kv_name>", "name": "makctestkv3", "type": "Microsoft.KeyVault/vaults", "location": "eastus2" } ] }

sinbai commented 2 years ago

Results

  1. when I use the filter with "and name eq ''", then I get { "value": [], "nextLink": "<some_url>" }
  2. when I use the filter without "and name eq ''", then I get list of resources with my KeyVault
  3. When I try another Keyvault in another subscription then I get result without nextlink { "value": [ { "id": "/subscriptions/<my_subscription>/resourceGroups/<rg_name>/providers/Microsoft.KeyVault/vaults/<kv_name>", "name": "makctestkv3", "type": "Microsoft.KeyVault/vaults", "location": "eastus2" } ] }

Could you please share a screenshot of the above results (including the parameter settings and the response body), with sensitive information obscured?

sinbai commented 2 years ago

I submitted a PR that might solve this issue. Since I haven't been able to reproduce the problem so far, I can't verify it. Is anyone willing to build it and verify that this issue is fixed? The info to build a TF provider (on Windows) is as follows:

Requirements:

  1. Terraform version 0.12.x + (but 1.x is recommended)
  2. Go version 1.16.x (to build the provider plugin)
  3. Git Bash for Windows
  4. Make for Windows

Using the locally compiled Azure Provider binary: add a provider development override to terraform.rc that points at the directory containing the provider binary, for example D:\GoCode\bin. The file must be named terraform.rc and placed in the relevant user's %APPDATA% directory.
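
The screenshot that originally accompanied this comment is not available. As a stand-in, here is a minimal sketch of such a terraform.rc, based on Terraform's documented `dev_overrides` setting in the CLI configuration rather than on the original image. The directory is the example path from the comment, written with forward slashes because the CLI configuration parser treats a backslash as the start of an escape sequence.

```hcl
# terraform.rc (Terraform CLI configuration, placed in %APPDATA%)
provider_installation {
  # Use the locally built azurerm provider binary found in this directory
  # instead of a released version from the registry.
  dev_overrides {
    "hashicorp/azurerm" = "D:/GoCode/bin"
  }

  # Install all other providers directly from their origin registries as usual.
  direct {}
}
```

With an override like this in place, Terraform prints a warning that provider development overrides are in effect on each run.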

Steps:

  1. Pull this PR
  2. Run make build in git bash.

After make build is completed, the locally built provider will be used automatically when running terraform apply.

References: https://github.com/hashicorp/terraform-provider-azurerm/blob/main/README.md

MaksymChornyi commented 2 years ago

@sinbai We have tried your PR #13409 and it solves our problem

tplive commented 2 years ago

@sinbai I'm trying to confirm if this solved the problem, so I've created a mock tf setup, and I'm deploying to the same subscription using the same SP as where I had the problem, but running only the creation of the vault secret that fails in production. I'm getting the same error:

azurerm_key_vault_secret.domain-join-password: Creating...

│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to azurerm_key_vault_secret.domain-join-password, provider "provider[\"registry.terraform.io/hashicorp/azurerm\"]" produced
│ an unexpected new value: Root resource was present, but now absent.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.

However, I'm new to running developer versions of the provider, so I'm unsure if I did it correctly. It's still downloading the azurerm 2.78 version, even though I'm warned that I'm using developer override. Is it supposed to still dl the prod version? 🤷🏽

sinbai commented 2 years ago

@sinbai We have tried your PR #13409 and it solves our problem

Thank you for your kind cooperation.

sinbai commented 2 years ago

However, I'm new to running developer versions of the provider, so I'm unsure if I did it correctly. It's still downloading the azurerm 2.78 version, even though I'm warned that I'm using developer override. Is it supposed to still dl the prod version? 🤷🏽

@tplive It will be downloaded if you run terraform init when using provider development overrides. If so, please skip terraform init; it is not necessary and may error unexpectedly.

BTW, PR #13409 is based on the trace log and confirmation information provided by viper4u. The same error may be caused by different reasons, so if the tf config and trace log (after applying PR #13409) can be provided, I will try to find the reason for the failure.

tplive commented 2 years ago

@sinbai thank you for the advice. I've stripped it down further, so the azurerm provider is the only thing remaining, it deploys only a single kv secret now, and still fails.. :) I can send you the config and trace log if you like to see what that looks like?

sinbai commented 2 years ago

@tplive Please send the config and trace log to sinbai@hotmail.com, thank you.

dma-sitecore commented 2 years ago

@tombuildsstuff @sinbai The response from Azure Support

  1. The primary recommendation to resolve this issue is to move to Azure Resource Graph instead of the List Resources API. You can ignore all of the API implementation details and nextLink challenges and just make a single call to Search-AzGraph. When the portal displays a list of resources, it is done using Azure Resource Graph rather than the List Resources API. You can replace all of your code with just this one line:

Search-AzGraph -Query "resources | where type =~ 'Microsoft.KeyVault/vaults' and name =~ '$AzureKeyVaultName'" -Subscription $SubscriptionId

tplive commented 2 years ago

@tombuildsstuff I have been working with @sinbai on reproducing the issue and testing the #13409 patch. I was able to reliably repro the bug in our production environment with the latest 2.79.0 version, and confirmed that the patched version still has the bug in our case. However, we are not allowed to send trace logs from our production environment to external parties, and our test environment does not repro the bug, so unfortunately we are unable to participate in further testing. I will still closely monitor this issue, as it is important to us to have it resolved. Thanks for all your great help and patience @sinbai !!

sinbai commented 2 years ago

@dma-sitecore Thank you for the information. The PR of moving to Resource Graph API instead of Resource API when retrieving Keyvault is ready.

Is anyone willing to build it and verify that this issue can be fixed with this PR because I can't repro it?

tplive commented 2 years ago

Is anyone willing to build it and verify that this issue can be fixed with this PR because I can't repro it?

I'll ask my management to find the time for me to do the test. I still have the setup since the last test, so it shouldn't be too much work to get going.

alwalkerTH commented 2 years ago

Using the provider built from that PR solved the issue for me. I had a manually created Key Vault in another subscription entirely and every time I added a secret to it it would fail with the "root resource was present, but now absent" error and then subsequent applies would complain that the secret already existed.

I don't know if it's related but I also couldn't import the existing secret either. It was created but any time I tried to do terraform import it would complain that it didn't, yet when I ran apply it complained that it did.

edit: I should have tested further. While it does not produce the "root resource was present, but now absent" error; subsequent runs of apply see the secret as having been deleted from the vault and try to recreate it. Only to get an error that it already exists.

edit 2: The same behavior also now applies to Key Vaults themselves created from terraform. If I run apply and create a Key Vault then run apply again that existing Key Vault will have been deleted according to terraform but still actually exists in Azure.

tplive commented 2 years ago

(sorry I first posted this on the PR, then realized I should have posted it here instead :) )

@sinbai I have now tested this PR with our environment. I add four secrets in deployment, and for testing I made sure two of them already existed. The first apply will report that it should create four secrets. Then will apparently deploy two secrets correctly, and correctly report that the two other secrets already exist and have to be imported into state.

However, when I run the same deployment a second time, it reports that it will again create four secrets, and says all four already exist. So the two secrets were deployed on the first run, without error, but were not added to the state, as far as I can tell.

The difference now is that the error message about the provider being in an inconsistent state is gone.

sinbai commented 2 years ago

(sorry I first posted this on the PR, then realized I should have posted it here instead :) )

@sinbai I have now tested this PR with our environment. I add four secrets in deployment, and for testing I made sure two of them already existed. The first apply will report that it should create four secrets. Then will apparently deploy two secrets correctly, and correctly report that the two other secrets already exist and have to be imported into state.

However, when I run the same deployment a second time, it reports that it will again create four secrets, and says all four already exist. So the two secrets were deployed on the first run, without error, but were not added to the state, as far as I can tell.

The difference now is that the error message about the provider being in an inconsistent state is gone.

@tplive Thank you very much. I want to clarify that this PR only applies to the errors "root resource was present, but now absent", and the fix depends on the trace log provided. So, could you help confirm that this PR can indeed fix that error? Thanks again for all your help.

tplive commented 2 years ago

@tplive Thank you very much. I want to clarify that this PR only applies to the errors "root resource was present, but now absent", and the fix depends on the trace log provided. So, could you help confirm that this PR can indeed fix that error? Thanks again for all your help.

I can confirm that the "root resource was present, but now absent" was not found in the trace log.

sinbai commented 2 years ago

@tplive Thank you very much. I want to clarify that this PR only applies to the errors "root resource was present, but now absent", and the fix depends on the trace log provided. So, could you help confirm that this PR can indeed fix that error? Thanks again for all your help.

I can confirm that the "root resource was present, but now absent" was not found in the trace log.

@tplive As far as I know, this error happens intermittently. So, can I consider that this PR can fix it, right?

tplive commented 2 years ago

@tplive As far as I know, this error happens intermittently. So, can I consider that this PR can fix it, right?

For us, the error happened every time. But if the fix is to remove the error message, then it does the job. :)

sinbai commented 2 years ago

@tplive As far as I know, this error happens intermittently. So, can I consider that this PR can fix it, right?

For us, the error happened every time. But if the fix is to remove the error message, then it does the job. :)

Thank you. Could you repro it in your local environment every time? If it is, could you provide me with step-by-step repro steps or important clues? That will be very helpful to me.

tplive commented 2 years ago

Thank you. Could you repro it in your local environment every time? If it is, could you provide me with step-by-step repro steps or important clues? That will be very helpful to me.

The error has been happening consistently, in our prod environment. When I set up a bare-bones test environment deploying the exact same secrets to the exact same Vaults, using the exact same Service Principal, I was unable to repro there.

The prod pipeline calls a module we have called "onboard", which among many other things creates the KV. After the KV is created, it also stores some secrets in the KV, without problems. This onboard module is called multiple times in the pipeline, and creates several KVs and secrets.

The problem only happens later in the pipeline when we try to add secrets to the same KV created previously.

A lot of resources are being deployed in this prod environment. The trace logs are several hundred megabytes and obviously contain a lot of confidential information, so I am unfortunately not allowed to share them with third parties not under a confidentiality agreement. I'm also unable to go through all of it and anonymize the information.

Thanks again for your patience and help in this matter, it's much appreciated!

Thomas

aniro commented 2 years ago

We are facing the same issue on: hashicorp/azurerm v2.83.0 Terraform v1.0.9

tplive commented 2 years ago

We are facing the same issue on: hashicorp/azurerm v2.83.0 Terraform v1.0.9

Interested in hearing more about your particular situation, as we are also still suffering this bug. 😊

aniro commented 2 years ago

@tplive

Interested in hearing more about your particular situation, as we are also still suffering this bug. 😊

It is similar to the situations already described above. Keyvault is first created and populated in scope of terraform configuration step_1 without issues. Then the same keyvault is referenced in configuration step_2 by its id using data.azurerm_key_vault.keyvault_name.id. step_2 fails most of the time (but not always) on adding new secrets to the keyvault with the following error:

Error: Provider produced inconsistent result after apply

Aldiuser commented 2 years ago

Stack: hashicorp/azurerm v2.83.0 Terraform v1.0.7

Maybe it does not fit 100 percent, because here the error is described with Secrets, but I think the error pattern with Secrets and Certificates is the same here

I don't know why this behavior occurs, but I was able to find a solution for myself. I would like to share this solution with you, maybe it will help one or the other who has the same setup as me.

Basics: I create a certificate and then want to store it in two keyvaults, each located in a different subscription. I also use the data resource to access these keyvaults, because they were created by another Terraform state.

My terraform apply fails with the message:

│ Error: Provider produced inconsistent result after apply
│
│ When applying changes to
│ azurerm_key_vault_certificate.my-cert, provider
│ "provider[\"registry.terraform.io/hashicorp/azurerm\"]" produced an
│ unexpected new value: Root resource was present, but now absent.
│
│ This is a bug in the provider, which should be reported in the provider's

As soon as I run terraform apply again, the message comes up that the certificate already exists.

However, I have now been able to form a solution:

I first created a new provider that points to the appropriate subscription

provider "azurerm" {
  alias           = "my-sub"
  subscription_id = "xxx-xxx-xxx-xxx"
  features {
    key_vault {
      purge_soft_delete_on_destroy = true
    }
  }
}

And I only have to specify this provider in the azurerm_key_vault_certificate resource via provider = azurerm.my-sub

resource "azurerm_key_vault_certificate" "my-cert" {
  provider     = azurerm.my-sub
  name         = "insert-your-certificate-name"
  key_vault_id = data.azurerm_key_vault.kv.id

For completeness I have added my data structure once again:

data "azurerm_key_vault" "kv" {
  provider            = azurerm.my-sub
  name                = "my-kv"
  resource_group_name = data.azurerm_resource_group.my-rg.name
}

After that terraform apply works fine
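
Applied to a secret rather than a certificate, the same pattern might look like the sketch below. This is an illustration only and was not tested in this thread; the subscription ID, resource group, vault name, and secret values are placeholders.

```hcl
# Aliased provider pinned to the subscription that holds the key vault.
provider "azurerm" {
  alias           = "my-sub"
  subscription_id = "xxx-xxx-xxx-xxx" # placeholder
  features {}
}

# Look up the vault through the aliased provider.
data "azurerm_key_vault" "kv" {
  provider            = azurerm.my-sub
  name                = "my-kv"
  resource_group_name = "my-rg" # hypothetical resource group name
}

# Create the secret through the same aliased provider.
resource "azurerm_key_vault_secret" "my_secret" {
  provider     = azurerm.my-sub
  name         = "my-secret"     # hypothetical
  value        = "example-value" # hypothetical
  key_vault_id = data.azurerm_key_vault.kv.id
}
```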

magodo commented 2 years ago

@tplive and @aniro, is there any chance that the module used to create the KeyVault is not in the same subscription as the one used to create the secret (just as @Aldiuser mentioned)?

tplive commented 2 years ago

@tplive and @aniro, is there any chance that the module used to create the KeyVault is not in the same subscription as the one used to create the secret (just as @Aldiuser mentioned)?

@magodo In our case, the Vault has been previously created and exists in the same state, all in the same subscription.

magodo commented 2 years ago

@tplive I see. Then the root cause might be the resource list API issue, though I'm not certain. However, PR #14047 should hopefully still fix this case.

aniro commented 2 years ago

@tplive and @aniro, is there any chance that the module used to create the KeyVault is not in the same subscription as the one used to create the secret (just as @Aldiuser mentioned)?

@magodo both steps are configured in a similar way and use the same subscription in our case.

magodo commented 2 years ago

@tplive, @aniro would you mind testing my PR in your environment where the issue occurs?

tplive commented 2 years ago

@magodo I will have time to test this today, hopefully to completion. I will report back the result asap.