hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Azure KeyVault Error: Provider produced inconsistent result after apply #11059

Open GregBillings opened 3 years ago

GregBillings commented 3 years ago

Terraform (and AzureRM Provider) Version

Terraform v0.14.8 AzureRM: terraform-provider-azurerm_v2.52.0_x5

Affected Resource(s)

azurerm_key_vault_secret

Terraform Configuration Files

resource "azurerm_key_vault_secret" "mysecretvalue" {
  name         = "secretvaluename"
  value        = var.some_value_from_var
  key_vault_id = data.terraform_remote_state.remote_terraform_cloud_state.outputs.key_vault_id
}

Debug Output

Error: Provider produced inconsistent result after apply

When applying changes to azurerm_key_vault_secret.mysecretvalue, provider "registry.terraform.io/hashicorp/azurerm" produced an unexpected new value: Root resource was present, but now absent.

This is a bug in the provider, which should be reported in the provider's own issue tracker.

Expected Behaviour

The secret is added to the Key Vault and tracked in the Terraform state.

Actual Behaviour

The Key Vault secret was added to the Key Vault with the correct value, but the terraform apply failed with the error above. When re-running, the new error is that the secret already exists but is not tracked in the Terraform state.

Steps to Reproduce

  1. Create a Terraform script that creates an Azure Key Vault and then outputs its ID as an output variable.
  2. Create another Terraform script with a different remote state that pulls in the first remote state via:

    data "terraform_remote_state" "remotestate" {
      backend = "remote"

      config = {
        organization = "my-org"
        workspaces = {
          name = "first-remote-state"
        }
      }
    }
  3. Attempt to create a new Key Vault secret using the ID from the output of the first remote state:

    resource "azurerm_key_vault_secret" "mysecretvalue" {
      name         = "MySecretValue"
      value        = var.some_value_from_var
      key_vault_id = data.terraform_remote_state.remotestate.outputs.key_vault_id
    }

tplive commented 2 years ago

@magodo I have run through the test as done previously, and I can confirm that this appears to have solved the issue for us! 👍🏽

cspring86 commented 2 years ago

We're seeing a similar issue - https://github.com/hashicorp/terraform-provider-azurerm/issues/14249

We've pinned the issue down to a call made by the TF provider to the Azure REST API. This one:

GET https://management.azure.com/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resources?$filter=resourceType eq 'Microsoft.KeyVault/vaults' and name eq 'this-is-my-kv'&$top=5&api-version=2020-06-01

Depending on where the Terraform script is run from (i.e. the IP of the machine that the TF script is run on), it either works or doesn't work.

Another thing to mention is that when using the Azure REST API to directly fetch the Key Vault by ID, it works.

It's purely this resource list filter call that has the issue.
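
For anyone who wants to compare the two calls themselves, here is a rough sketch using az rest (subscription ID, resource group and the API version on the second call are placeholders/illustrative, not taken from our environment):

# list-by-filter call used by the provider (the one that intermittently misses the vault for us):
az rest --method get --uri "https://management.azure.com/subscriptions/<sub-id>/resources?\$filter=resourceType eq 'Microsoft.KeyVault/vaults' and name eq 'this-is-my-kv'&\$top=5&api-version=2020-06-01"

# direct GET by resource ID (consistently returns the vault):
az rest --method get --uri "https://management.azure.com/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/this-is-my-kv?api-version=2019-09-01"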

Hope that helps resolve the issue.

cspring86 commented 2 years ago

The only alternative I can think of is that there's some kind of replication issue between regions.

If I remove the name eq 'this-is-my-kv' from the Azure REST API query, I do in fact get Key Vaults from eastus, even though I'm in the UK, but I don't see all of them - I don't see the most recently created Key Vaults.

yube-sitecore commented 2 years ago

Hello everyone, thank you all for your effort on this issue. It has caused us a lot of pain as well.

I just tested the latest image, '1.1.0-beta2', and all my Terraform deployments passed successfully, after over 20 failed attempts with version 1.0.6.

I will test it for a few more days and report back with feedback.

yube-sitecore commented 2 years ago

Unfortunately no - the problem still reproduces consistently.

mybayern1974 commented 2 years ago

@yube-sitecore, thank you for proactively updating here! In the meantime, would you mind providing your repro steps in as much detail as you can, to help those of us who are still tracking this issue, including raising tickets with the Microsoft KeyVault and ARM teams?

Below are the repro steps mentioned so far, but unfortunately we are unable to reproduce with any of them:

  1. Programmatically create a new Azure subscription and right after that create a KV and a KV secret. (I do not have an environment to programmatically create a subscription, so I have not tried this yet.)
  2. Use TF to create a KV in region A (say WestUS) and then use TF to create a KV secret that depends on that KV in region B (say EastUS). This repro step is introduced in issue 14249. However, even following those steps, we were still unable to reproduce.

Lastly, would it be feasible for you to provide correlation IDs for all your TF calls? With those IDs, the Azure ARM team might be able to track what happened in the backend. No worries if you are not sure how to get them - we are happy to share instructions.
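
In case it helps anyone gathering that information, a minimal sketch assuming a bash shell; the x-ms-correlation-request-id response header generally appears in the provider's debug output for ARM calls:

# capture a full debug log for the failing apply
TF_LOG=DEBUG TF_LOG_PATH=apply-debug.log terraform apply

# list the correlation IDs Azure returned for each request
grep -io 'x-ms-correlation-request-id: [0-9a-f-]*' apply-debug.log | sort -u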

cspring86 commented 2 years ago

Hey guys.

Just a heads up that our issue has been resolved by Azure Support.

They informed us that our issue was because our ARM caches were out of synchronisation. Since they manually triggered a synchronisation job, everything has been working ok for us.

This may well be the same for some of you guys. If you haven't already, I'd raise a ticket with Azure Support as my support contact noticed the issue straight away.

The key piece of information for them in my case was that all the key vaults could be seen from the Azure Portal, but when using TF, the CLI or the REST API you could only see a subset.

Hope that helps!

mybayern1974 commented 2 years ago

@cspring86 thank you for updating your story and glad to see you are unblocked!

On the other hand, I suspect there are other people who still feel blocked or inconvenienced by this case. As Terraform users, people expect things to happen automatically and consistently, so for those whose repro steps trigger this issue as part of regular, automated business processes, manually filing an Azure support ticket is not a real solution.

I'm still trying to contact Azure support to get their API usage best practices; with those we may be able to tune the Terraform implementation accordingly and introduce a clean refinement.

yube-sitecore commented 2 years ago

> @mybayern1974: [...] would you mind providing your repro steps in as much detail as you can [...] would it be feasible for you to provide correlation IDs for all your TF calls? [...]

Hello, and thank you for your effort. We do a lot of deployments from scratch every day using the same set of complex Terraform scripts. The issue happens pretty often - things can work for a few days, but then every deployment fails for 2-3 weeks because of this issue.

I see that Maksym has provided a sample previously, https://github.com/hashicorp/terraform-provider-azurerm/issues/11059#issuecomment-915798838 . That's our case.

We did open a ticket with MS, but there are no clear details - the ticket was closed as "can't reproduce".

I tried to find patterns with time of day / locations / etc. - no, it's totally random for us. I will try to get more details from the people who worked on this issue previously and will get back to you.

Also, if any logs can be helpful, I am ready to collect and provide them for troubleshooting.

cspring86 commented 2 years ago

> @mybayern1974: @cspring86 thank you for updating your story and glad to see you are unblocked! [...] manually filing an Azure support ticket is not a real solution [...] I'm still trying to contact Azure support to get their API usage best practices [...]

@mybayern1974 I 100% agree. To your point, and @yube-sitecore's by the looks of it too, we're still seeing this issue as well and it's very frustrating. Like @yube-sitecore, we can go days without hitting the issue and then I've hit it again today - same pipelines, no changes.

Based upon the response I got from Azure Support, my only guess is that there are some persistent issues with ARM caching at the moment, either longer than expected delays or caches just never being updated, possibly due to some high load on Azure (similar to what we saw at the start of the pandemic).

yube-sitecore commented 2 years ago

> @mybayern1974: [...] manually filing an Azure support ticket is not a real solution [...] I'm still trying to contact Azure support to get their API usage best practices [...]
>
> @cspring86: [...] we can go days without hitting the issue and then I've hit it again today - same pipelines, no changes. [...] my only guess is that there are some persistent issues with ARM caching at the moment [...]

We have been facing the same issue for quite a long time. It's a big blocker for us. I plan to start investigating this issue from scratch, as we need to find a workaround one way or another.

mybayern1974 commented 2 years ago

> @mybayern1974: [...] manually filing an Azure support ticket is not a real solution [...]
>
> @cspring86: [...] my only guess is that there are some persistent issues with ARM caching at the moment, either longer than expected delays or caches just never being updated [...]

@cspring86 may I possibly get your contact - say an email address or whatever works, though I'm not sure of the best way to share that while respecting privacy? The purpose is that I want to get details on your statement that "filing an Azure support ticket could unblock things once, though later it still repro'd" - specifically, that ticket number or your Azure support contact. With that, I plan to reach out to the relevant Azure internal service team to help clarify a concern I intuitively have: if a manual effort can flush the cache and make things work, why can't that happen automatically/by default?

techy321 commented 2 years ago

Terraform version v0.13.4 Azurerm provider 2.88.1

Following and hoping for a fix. We are experiencing the same issue with our pipeline sporadically. The keyvault gets created during an earlier stage in the pipeline and these secrets get created during a later stage in the pipeline. The secrets appear in the portal but not in state.

Error: Provider produced inconsistent result after apply

When applying changes to module.SQL.azurerm_key_vault_secret.sql_password, provider "registry.terraform.io/hashicorp/azurerm" produced an unexpected new value: Root resource was present, but now absent.

techy321 commented 2 years ago

> (quoting my comment above) Terraform v0.13.4, AzureRM provider 2.88.1 [...] The secrets appear in the portal but not in state. [...] Root resource was present, but now absent.

I've also noticed that when this occurs, we are not able to complete an az keyvault list either, even though the key vault is shown in the portal. Once the CLI can list the key vault again, we are able to re-run the pipeline successfully.
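
If it's useful, here is a crude guard sketch one could put in front of the apply step in a pipeline (bash plus the az CLI; the vault name is a placeholder):

# wait until the ARM list endpoint actually returns the key vault before running terraform
until az keyvault list --query "[?name=='my-keyvault'] | length(@)" -o tsv | grep -q '^1$'; do
  echo "key vault not visible via list yet, retrying in 30s..."
  sleep 30
done
terraform apply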

techy321 commented 2 years ago

> (quoting my comments above) [...] The secrets appear in the portal but not in state. [...]
>
> [...] we are not able to complete an az keyvault list either, even though the key vault is shown in the portal. [...]

Breaking discovery! If you add a tag to the key vault prior to running TF, it will be able to find the KV successfully.

yube-sitecore commented 2 years ago

> @techy321: [...] Breaking discovery! If you add a tag to the key vault prior to running TF, it will be able to find the KV successfully.

Hello @techy321, could you please share details - what tag did you add, and in what way?

techy321 commented 2 years ago

> @yube-sitecore: Hello @techy321, could you please share details - what tag did you add, and in what way?

I just added a tag with any value to the key vault in question; immediately afterwards the CLI was able to return the key vault when doing a list, and TF could see it as well.
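
For anyone wanting to script the same thing rather than use the portal, a sketch with the az CLI (the resource ID is a placeholder; the tag key/value don't matter):

# merge a throwaway tag onto the key vault via the ARM tags API
az tag update --operation Merge --tags tf_workaround=true \
  --resource-id "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault-name>"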

yube-sitecore commented 2 years ago

Thank you for sharing - I will test this workaround from our end and will write an update for other interested people if it works for us as well.

asebetich commented 2 years ago

Any updates on this :)?

I am creating an "azurerm_synapse_linked_service" and specifically am trying to create a connection to a Key Vault and also an SFTP server. The "azurerm_synapse_linked_service" resource is looped using for_each.

This looped resource creates the Key Vault linked service, then uses that linked service to create the SFTP linked service (which stores password details in the key vault).

When running the plan, I get all green check marks! However, running the apply gave me the following error: "Provider produced inconsistent result after apply ... Root resource was present, but now absent."

This is a bug in the provider, which should be reported in the provider's own issue tracker.

Any help is appreciated! I am going to try and dissect the for_each and create one at a time to see if I get a different result.

yube-sitecore commented 2 years ago

Hello @asebetich, welcome to the club =)

We have the issue when we create secrets and then read them within the same script run. As a solution, in case of failure, we are implementing a retryer which imports the resources into the state. The thing is that importing into state is also not stable in this scenario. But as someone found, adding a tag to the resource has some effect - we don't know what or how it affects things, but the fact is that if, on failure, we add any tag (as described above) to the resource, import the existing resources into state and execute again, the execution will pass.

asebetich commented 2 years ago

@yube-sitecore, hmm. I might not fully understand your suggestion here.

So in my specific case, I use a data call to grab some secrets for my Azure Synapse workspace (SQL username/password). I do not create the Key Vault in my main.tf file; I just reference it and the associated secrets I need with the data resource block.

That being said, when the Synapse resource is already created, I can't seem to create these linked services. I removed my SFTP linked service and just tried to create one using the Key Vault connection, and still nothing.

So what resource are you suggesting I add a tag to?

yube-sitecore commented 2 years ago

I agree with you @asebetich that our cases are a bit different, but the root cause seems to be the same.

The issue is that during the planning phase Azure reports that the secret exists, but when Terraform then requests it, it can't be returned and you see this error message.

There are many ideas as to why it's happening. We spent months with Microsoft trying to get to the root cause and found it might be related to a broken paging implementation in the endpoint used by Terraform. Later another person shared their experience with Microsoft support, and it was different.

Here we are discussing how to at least work around the problem, as the problem is still current and not resolved between Terraform and Azure.

In our case the workaround is (a scripted sketch follows below):

  1. Catch this exact error.
  2. Add a tag with any value (like test:test) to the KeyVault resource.
  3. Run terraform import to explicitly import the existing secrets, which already exist but which Terraform can't obtain during the apply phase.
  4. Execute apply again and it will pass.

Otherwise, it is unpredictable how many times you will have to execute it before it passes. Another finding from our end is that in most cases the issue occurs when the KeyVault is deployed in a different location from the place where you execute your script. For example, with the KeyVault deployed in West Europe but your machine/agent (however you run your Terraform) executing from Australia, the execution almost never passes on the first attempt. At the same time, there is no guarantee that it will pass on the first attempt even if those locations are co-located.
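
To make that concrete, here is a rough sketch of the retry logic (bash-style; the resource address is taken from the original report as a placeholder, the vault/secret names are placeholders too, and the secret's import ID is its versioned data-plane URI):

if ! terraform apply -auto-approve; then
  # 2. add any tag to the key vault (the value doesn't matter)
  az tag update --operation Merge --tags tf_workaround=retry \
    --resource-id "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault-name>"

  # 3. import the secret that was created but never made it into state
  terraform import azurerm_key_vault_secret.mysecretvalue \
    "https://<vault-name>.vault.azure.net/secrets/<secret-name>/<secret-version>"

  # 4. re-run apply; with the secret in state it should pass
  terraform apply -auto-approve
fi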

asebetich commented 2 years ago

@yube-sitecore Thanks for the quick responses! I see what you mean now - the workaround literally being to add a tag? Haha, very odd behaviour for that to work.

I initially packaged all the Synapse workspace related items into one module for my own organization, and thought that separating the linked_service creation out of the module might somehow fix this issue, but alas, nothing changed and I keep getting the same error we are all facing.

I'm going to continue to troubleshoot this and will now try to see if manually adding a tag to the Key Vault helps. All my resources are in the same region, so that part doesn't apply to me. I'll keep this thread updated with my findings (if I have any).

yube-sitecore commented 2 years ago

Yes, @asebetich - add the tag and import the secret data into state explicitly.

michaelelleby commented 2 years ago

Importing into state explicitly is very impractical for production deployments, as the person deploying usually will not have permissions to modify the state for production. This needs a real fix.

yube-sitecore commented 2 years ago

@michaelelleby yes, agreed. But we face this issue mostly during the initial provisioning.

We were not able to resolve this issue with Microsoft. I will consider potentially moving from KeyVault to HashiCorp Vault, as this issue has been open for a year now.

magodo commented 2 years ago

@cspring86 Just want to echo @mybayern1974's ask, is it possible to provide some hint (e.g. the support name, id, etc) about the support ticket that manually resolved your issue as you mentioned above? So that we can internally find the support guy to learn details, and hopefully figure out some solution to resolve this issue.

magodo commented 2 years ago

Another update: I can (intermittently) reproduce this with the following steps:

  1. Check out and build one of my personal branches of the provider: https://github.com/magodo/terraform-provider-azurerm/tree/endpoint (diff against the main branch). The only diff is that this branch adds an environment variable, ARM_RESOURCE_MANAGER_ENDPOINT_PREFIX, which allows users to target a (region-)specific ARM endpoint.
  2. Use the above provider to provision the example template, with the azurerm_key_vault_secret commented out, and ARM_RESOURCE_MANAGER_ENDPOINT_PREFIX set to eastus.
  3. Uncomment the azurerm_key_vault_secret part and run terraform apply again, but this time with ARM_RESOURCE_MANAGER_ENDPOINT_PREFIX set to something different from eastus, e.g. norwayeast.

I ran the above steps 3 times, and the issue occurred twice.

So another way to get this fixed is to ensure the regional ARM endpoint used to create the key vault nested resources is the same as the one for the resource group that contains the key vault (based on this comment). There is one issue, #15632, that asks for adding regional endpoint support to the provider, just FYI.
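
In shell terms, the repro amounts to roughly the following (a sketch assuming the custom provider build from that branch is in use; the region values are just the ones used above):

# step 2: provision the template (azurerm_key_vault_secret commented out) against the eastus ARM endpoint
export ARM_RESOURCE_MANAGER_ENDPOINT_PREFIX=eastus
terraform apply

# step 3: uncomment azurerm_key_vault_secret and apply again against a different regional endpoint
export ARM_RESOURCE_MANAGER_ENDPOINT_PREFIX=norwayeast
terraform apply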

ravulachetan commented 2 years ago

Hi, I am seeing the same error with Key Vault when adding secrets. Do we have a solution for this?

magodo commented 2 years ago

@RavulaChetan Not really, only a workaround: https://github.com/hashicorp/terraform-provider-azurerm/issues/11059#issuecomment-1018661692

ravulachetan commented 2 years ago

Thanks @magodo; is this workaround for the case where the Key Vault already exists and we add secrets? What about when the Key Vault is created in the same code that adds the secrets?

Also, when the workaround says to add a tag to the KeyVault, can you elaborate more please?

Azrrael commented 2 years ago

Hi everyone. We ran into this issue a few weeks ago, while using Terraform 1.0.0 with azurerm 2.96.0. We tried the tagging solution (both on the actual secret and on the vault itself) and we also requested an ARM cache sync from Azure support, but neither worked. When trying to import the values manually into our .tfstate, we got an error pointing out that the said values didn't exist, contradicting the results of an apply. Also, as an FYI, the vault in question has purge protection enabled.

magodo commented 2 years ago

@RavulaChetan The workaround is to add a regular resource tag on the key vault via other means (e.g. the portal or azure-cli), and optionally remove the tag afterwards. This might (no proof) trigger a refresh of the ARM cache in the target regional endpoint, which makes your next GET call on its /resources endpoint return the key vault. As @Azrrael stated, they found that the workaround above doesn't work either, which means this is not a reliable workaround.

For a long-term solution, we might have the following options:

  1. Allow the provider to set the ARM regional endpoint, which is tracked in https://github.com/hashicorp/terraform-provider-azurerm/issues/15632
  2. Change the resource id of the key vault nested resources, which is tracked in #16174

vijaytdh commented 2 years ago

In case this helps anyone else: I received this error too ("Root resource was present, but now absent." plus "Cannot import non-existent remote object" when attempting to import), but in my case it was my own silly mistake, because in my code I had multiple providers defined for different subscriptions.

When adding the secret I was passing the key vault resource ID via a data source that used the right provider; however, in the module that creates the secret I was not specifying the provider, so it was defaulting to the subscription of the default provider - which in my case is not where the key vault for the secret resides. I only noticed it when enabling trace logging to see the actual API request that was made.
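
For anyone wanting to do the same check, a minimal sketch of enabling trace logging and seeing which subscription the requests actually went to:

# log every API request/response the provider makes during the apply
TF_LOG=TRACE TF_LOG_PATH=./terraform-trace.log terraform apply

# then look at which subscription ID the management-plane requests used
grep -n 'management.azure.com/subscriptions' terraform-trace.log | head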

Divya1388 commented 2 years ago

Facing the same error when writing Terratests for SQL FOG, which includes creation of a Key Vault and a CMK key. It is very time-consuming to troubleshoot this generic error, and I'm left without any clue now. I have tried adding a time_sleep resource to wait after the key vault creation and also tried specifying providers where the script creates keys and secrets. None of them worked. Will try adding a tag to the key vault.

meilz381 commented 2 years ago

I also had the error "Root resource was present, but now absent.". The key vault is in another subscription (created by a different setup). Maybe someone else finds this useful: I was able to solve my issue by putting the key vault resources (secrets, keys) in an extra module and setting the provider config there with the other subscription ID.

Azrrael commented 2 years ago

Hi there. We ran into this issue again. We were using a homemade module to create a DB and users, saving two passwords, passwordA and passwordB, in Key Vault. We have multiple environments, and in particular Dev and QA share a common Key Vault, called Lower. We used the exact same code for all environments, changing only parameters such as environment name, DB size, etc. When we ran our code, it worked in all environments except Dev. So we had two calls with the same code, saving passwords to the same key vault, with one working and the other failing. Furthermore, we tried "run apply, then create a tag on the secret, then run import", and this time it worked, but not on the first try - only a couple of days later. Not sure why this is happening, but here's hoping this info helps somebody.

brisitw commented 2 years ago

We are facing the same issue when creating secrets via azurerm_key_vault_secret with terraform azurerm provider 3.5.0.

o1da commented 2 years ago

Hi guys, I had exactly the same issue too, with azurerm provider 2.99.0 and 3.9.0.

I was passing the ID of a key vault directly as a string; then I wondered whether I was passing the wrong value and tried to read the key vault with a data source and pass the ID from there. Then I realized I couldn't read that key vault because I was not on the correct subscription...

So az account set -s ..., or rather setting the subscription in the provider, did the trick for me.

Pretty misleading error message I would say :)

Tbohunek commented 2 years ago

Thanks @o1da! Took me 3 hours to give up and google to find your answer. It turns out Terraform has issues with cross-subscription vaults (on top of issues with cross-tenant vaults).

If you use full provider blocks rather than ARM_xxx environment variables or the az CLI, the solution is to specify subscription_id within the provider and then select that provider on your resource so it uses the same subscription as the vault...

provider "azurerm" {
  alias           = "myprovider"
  subscription_id = "xxx" # <-- this is the important part
  tenant_id       = "yyy"
  use_msi         = true # or client_id + client_secret
  features {}
}

data "azurerm_key_vault" "vault" {
  provider            = azurerm.myprovider
  resource_group_name = "myrg"
  name                = "myvault"
}

resource "azurerm_key_vault_secret" "mysecret" {
  provider     = azurerm.myprovider
  name         = "mysecret"
  value        = "secret"
  key_vault_id = data.azurerm_key_vault.vault.id
}

RobertFloor commented 2 years ago

We have the same problem (slightly different: we pass the azurerm provider to a separate module). The module should then sync keys from a key vault in a different subscription from the one we are using. However, somehow Terraform believes the secrets are not there anymore and proceeds to delete them from the state. It then tries to re-add them, but they are still there and this gives an error. Maybe our problem is a little more complex since we are running the code in an Azure DevOps pipeline.

module "test" {
  resource-group-kv = "test1"
  key-vault-name    = "test2"
  providers = {
    azurerm = azurerm.sub
  }
}

provider "azurerm" {
  features {
    key_vault {
      purge_soft_delete_on_destroy = true
    }
  }
  alias           = "sub"
  subscription_id = "xxxxx"
}

In the module

data "azurerm_key_vault" "azvault" {
  name                = var.key-vault-name
  resource_group_name = var.resource-group-kv
}

resource "azurerm_key_vault_secret" "secret-developer-write" {
  name         = "super-secret"
  value        =  "secret"
  key_vault_id = data.azurerm_key_vault.azvault.id
}

We get the error:


Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

RobertFloor commented 2 years ago

In our case the root cause was this issue: https://github.com/hashicorp/terraform-provider-azurerm/issues/8680 (azurerm providers in a module will use the az account active subscription instead of the configured credentials). We could solve the problem by including the provider both in the calling code and in the module:

We do get this warning, but it still solves the problem and Terraform loads the state correctly, including a key vault from a different subscription.

╷
│ Warning: Redundant empty provider block
│
│   on ../../modules/kafka-topic/backend.tf line 17:
│   17: provider "azurerm" {
│
│ Earlier versions of Terraform used empty provider blocks ("proxy provider configurations") for child modules to declare their need to be passed a provider configuration by their callers. That approach was ambiguous and is now deprecated.
│
│ If you control this module, you can migrate to the new declaration syntax by removing all of the empty provider "azurerm" blocks and then adding or updating an entry like the following to the required_providers block of module.kafka-topics:
│   azurerm = {
│     source = "hashicorp/azurerm"
│     configuration_aliases = [
│       azurerm.subscription-config,
│     ]
│   }

MrBasset commented 2 years ago

Just to document another scenario where this problem appears to occur, similar to others here.

We have a central Key Vault in which we store shared secrets. We were using data sources to pull back account details that were then used to create TLS certs that are then loaded into a subscription specific key vault. In our case, we missed explicitly adding the provider to the data source (which still worked), but that somehow caused the wrong subscription ID to be used when refreshing the state on the Key Vault (or secrets within it).

magodo commented 2 years ago

👍

#17407 has just been merged. We'd appreciate it if anyone suffering from this issue could test whether that solves it, either in the upcoming new release v3.12.0 or via a local build of the main branch!

elongstreet88 commented 2 years ago

Still seeing terrible performance for key vault secrets and the perpetual value:[]; nextLink: "..." responses with the 3.12.0 release :(. I found that https://github.com/hashicorp/terraform-provider-azurerm/pull/13409 was rejected; however, testing it locally, it speeds up our deploys nearly 5x:

Without https://github.com/hashicorp/terraform-provider-azurerm/pull/13409 vs. with it: (deploy timings omitted)

While I think there are a few different issues going on at the same time here, could we reconsider 13409 as a quick fix for the performance aspect? I'm happy to PR it again if needed. I think this uniquely affects people who have hundreds of key vaults in their subscriptions, like us, and the $top logic on that filter wrecks us the more we add.

Thanks.

Tbohunek commented 2 years ago

@magodo I just tested and sadly it doesn't work for me with 3.12.0 and tf 1.2.4, or I'm doing something wrong. ☹️ It's stuck searching for the resource group.

Error: retrieving the Resource ID the Key Vault at URL "https://myvault.vault.azure.net/": retrieving Vault: 
(Name "myvault" / Resource Group "myvaultrg"): keyvault.VaultsClient#Get: Failure responding to request: 
StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="ResourceGroupNotFound"
Message="Resource group 'myvaultrg' could not be found."

When specifying the provider on data azurerm_key_vault_secret it works, which makes me wonder whether https://github.com/hashicorp/terraform-provider-azurerm/pull/17407 could even have solved that - it was on azurerm_key_vault, but the problem is on _secret|_certificate|_key, no?

provider "azurerm" {
  subscription_id = "sss_default"
  ... the usual stuff
}

provider "azurerm" {
  alias           = "myprovider"
  subscription_id = "sss_foreign"
  ... the usual stuff
}

data "azurerm_key_vault" "vault" {
  provider            = azurerm.myprovider
  name                = "myvault"
  resource_group_name = "myvaultrg"
}

data "azurerm_key_vault_secret" "secret" {
  provider     = azurerm.myprovider # doesn't work without this, error above
  name         = "mysecret"
  key_vault_id = data.azurerm_key_vault.vault.id
}

Note that the azurerm_key_vault_secret resource fails as well - it actually creates the secret, but then fails to verify its presence, as reported in https://github.com/hashicorp/terraform-provider-azurerm/issues/11782:

When applying changes to azurerm_key_vault_secret.psk, provider
"registry.terraform.io/hashicorp/azurerm" produced an unexpected new value:
Root resource was present, but now absent.

magodo commented 2 years ago

@Tbohunek Can you provide more details about your use case by providing the config and log?

Tbohunek commented 2 years ago

@magodo what log exactly? I've updated the above config.

Nothing fancy: the vault is in another subscription, so even if I use the correct provider for data azurerm_key_vault, I still need to use that provider also for all the stored values, be it a resource or a data source.

For other resource types I'm used to Terraform being able to work it out from the .id, which includes /subscriptions/sss, and hence not needing a provider to select the subscription.

This "problem" prevents some use cases such as passing an external var.vault_id into the module, because when you switch to a vault in a different subscription than the module uses, it breaks. This is actually how I "found" this issue in the first place.

magodo commented 2 years ago

@Tbohunek Yes, in cross-subscription use, the "other resource types" work because their ID has the subscription ID in it, whereas the IDs of the key vault nested items use the data-plane ID, which doesn't contain the subscription ID. So when reading them, the provider assumes the subscription is the current provider's subscription.

I'd like to confirm that the config in https://github.com/hashicorp/terraform-provider-azurerm/issues/11059#issuecomment-1172423670 works when specifying the provider alias.
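
For illustration (using the vault/secret names from the config above), the two kinds of IDs look like this, which is why the nested-item read has to fall back to the provider's configured subscription:

# ARM (management-plane) ID of the vault - contains the subscription:
/subscriptions/<sub-id>/resourceGroups/myvaultrg/providers/Microsoft.KeyVault/vaults/myvault

# data-plane ID used for azurerm_key_vault_secret - no subscription in it:
https://myvault.vault.azure.net/secrets/mysecret/<version>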

Tbohunek commented 2 years ago

@magodo I see. But why does it then search for it in a resource group instead of using the data-plane ID, which is globally unique? And where does it get the resource group from, if not from the resource ID? It's weird. But yes, specifying the provider on both resources does the trick.