hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Azurerm_frontdoor with v2.24.0 breaks when azure frontdoor is edited in portal. #8208

Closed andrstor closed 3 years ago

andrstor commented 4 years ago

Community Note

Terraform (and AzureRM Provider) Version

Terraform v0.12.21
+ provider.azurerm v2.24.0

Affected Resource(s)

Terraform Configuration Files

provider "azurerm" {
  version = "=2.24.0"
  features {} # https://www.terraform.io/docs/providers/azurerm/index.html#features
}

resource "azurerm_resource_group" "example" {
  name     = "andreastester"
  location = "norway east"
}

resource "azurerm_frontdoor" "example" {
  name                                         = "andreastester"
  resource_group_name                          = azurerm_resource_group.example.name
  enforce_backend_pools_certificate_name_check = false

  routing_rule {
    name               = "exampleRoutingRule1"
    accepted_protocols = ["Http", "Https"]
    patterns_to_match  = ["/*"]
    frontend_endpoints = ["exampleFrontendEndpoint1"]
    forwarding_configuration {
      forwarding_protocol = "MatchRequest"
      backend_pool_name   = "exampleBackendBing"
    }
  }

  backend_pool_load_balancing {
    name = "exampleLoadBalancingSettings1"
  }

  backend_pool_health_probe {
    name = "exampleHealthProbeSetting1"

  }

  backend_pool {
    name = "exampleBackendBing"
    backend {
      host_header = "www.bing.com"
      address     = "www.bing.com"
      http_port   = 80
      https_port  = 443
    }

    load_balancing_name = "exampleLoadBalancingSettings1"
    health_probe_name   = "exampleHealthProbeSetting1"
  }

  frontend_endpoint {
    name                              = "exampleFrontendEndpoint1"
    host_name                         = "andreastester.azurefd.net"
    custom_https_provisioning_enabled = false
  }
}

Debug Output

https://gist.github.com/andrstor/0aa07440e0a01befb23351db3257340f

Panic Output

Expected Behavior

Terraform identifies that no changes are required or tries to recover its state.

Actual Behavior

Error: flattening backend_pool: ID was missing the healthProbeSettings element

Steps to Reproduce

  1. terraform apply
  2. Do anything in the Azure portal that triggers a change. For instance, add a rule engine rule to the routing rule.
  3. terraform plan

Undoing the manual change does not help: the resource is still broken with azurerm v2.24.0. The same steps work with v2.23.0.

Important Factoids

None

References

alec-pinson commented 3 years ago

Hi, I think I'm still getting the same issue, is anyone able to confirm this is fixed for them?
We're in the North Europe region.

When trying to import I get the following:-

Error: Error parsing Resource ID "/subscriptions/0000000-0000-000-0000-000/resourceGroups/NEUR-RG/providers/Microsoft.Network/frontdoors/my-frontdoor": ID was missing the `frontDoors` element

After moving the resource from another state I already had, I get:-

Error: flattening `frontend_endpoint`: ID was missing the `frontDoorWebApplicationFirewallPolicies` element

camallen commented 3 years ago

I'm still seeing the previous errors as well on our TF front door resources

Error: flattening `backend_pool`: ID was missing the `healthProbeSettings` element
Error: flattening `routing_rules`: flattening `frontend_endpoints`: ID was missing the `frontendEndpoints` element

Poil commented 3 years ago

Same here, I still have the problem. Do we need a new version of the provider to match the change?

tombuildsstuff commented 3 years ago

@WodansSon given that was just the portal, is there a timeline for the API fix too?

WodansSon commented 3 years ago

> @WodansSon given that was just the portal, is there a timeline for the API fix too?

The API on the service side was fixed months ago; it was the Portal part that took more time to release. I suspect this may be an issue with legacy resources that were created before either was fixed. I will need to figure out how to create a Frontdoor service with these issues in order to get a repro so I can best understand how to mitigate the legacy services. For net new Frontdoors, this has indeed been fixed from my testing with and without WAF.

WodansSon commented 3 years ago

@camallen @Poil @alec-pinson @tombuildsstuff

I have done a bit of an experiment, and what I have found is that the easiest way to fix a legacy resource that is stuck in this in-between state is to modify the resource in the Portal and save the changes. Once the changes are saved you can then manage the resource via Terraform. When I was investigating this I disabled one of my routing rules, updated, then saved. Once the modification to the service was complete via the portal I was able to successfully manage it again with Terraform. You have to modify the configuration of the Frontdoor directly; adding or updating a Tag will not trigger the code that rewrites the IDs with the correct casing.

NOTE: The provider version should be the latest version as there have been changes made to this resource to account for various other casing issues.

Poil commented 3 years ago

After editing in the portal I still have a problem with frontDoorWebApplicationFirewallPolicies (Error: flattening `frontend_endpoint`: ID was missing the `frontDoorWebApplicationFirewallPolicies` element). The segment is still frontdoorWebApplicationFirewallPolicies in the resource explorer.

alec-pinson commented 3 years ago

Hi, same for me. I disabled a routing rule, tried Terraform and got the error below, then enabled the routing rule and tried again just in case, but got the same thing.

Error: flattening `frontend_endpoint`: ID was missing the `frontDoorWebApplicationFirewallPolicies` element

I get the same error when trying to import (I corrected the frontDoors casing, thank you!) and when I use a state that already has the frontdoor resource.

provider versions

Terraform v0.13.5
+ provider registry.terraform.io/hashicorp/azuread v0.11.0
+ provider registry.terraform.io/hashicorp/azurerm v2.36.0
+ provider registry.terraform.io/hashicorp/template v2.2.0

GarethOates commented 3 years ago

Maybe this should be re-opened?

naikajah commented 3 years ago

I agree this ticket should be re-opened. I tried updating the provider to the latest 2.36.0 and have the same issue.

Error: flattening `frontend_endpoint`: ID was missing the `frontDoorWebApplicationFirewallPolicies` element

timja commented 3 years ago

Is the remaining issue perhaps just in frontDoorWebApplicationFirewallPolicies?

The last 3 reports all mention that element, while the initial report was about healthProbeSettings.

kplantus commented 3 years ago

@WodansSon I'm throwing my hat in as another who still has an issue with the FD WAF policy.

I even tried to create a brand new AFD. After it was created, I modified a value via the portal to see if the rewrite workaround would work for a new resource. I still get:

Error: flattening `frontend_endpoint`: ID was missing the `frontDoorWebApplicationFirewallPolicies` element

WodansSon commented 3 years ago

When reporting issues, please keep in mind to remain respectful and professional. I understand that this is frustrating and I am doing the best that I can to correct the issue; however, some aspects are out of my control. Since WAF is its own separate resource, I am going to assume the workaround will have to be applied to that resource as well (e.g. open WAF in portal, update, save) to correct the casing issues in that resource. Since it is its own resource, I will have to reach out to the service team again to verify that the fix was applied to that resource as well. I will open a new issue for the WAF specifically since this issue for frontdoor appears to be fixed.

WodansSon commented 3 years ago

I have conferred with other team members and we have agreed that we will continue to track the problem on this issue instead of splitting it up across multiple issues. Thank you for your patience.

kplantus commented 3 years ago

Did a little bit more testing and found that on the brand new AFD I could modify a few things via https://resources.azure.com and bring it under TF management using TF v0.13.5 and azurerm provider v2.36.0.

Under "frontendEndpoints" / "webApplicationFirewallPolicyLink" I updated the id of each FE from all lowercase to frontDoorWebApplicationFirewallPolicies.

The next plan failed on:

Error: flattening `routing_rules`: flattening `frontend_endpoints`: ID was missing the `frontendEndpoints` element

So back in resources.azure.com, under "routingRules" / "frontendEndpoints", I updated the id of each RR from all lowercase to frontendEndpoints.

My next tf plan and apply worked.

WodansSon commented 3 years ago

@kplantus Yes, that is exactly what I was suspecting was going on as well. I have done some digging myself and found this appears to be an issue in the API instead of the Portal this time. I am already in contact with the service team to get an ETA for a high pri fix and deployment; however, that is still in negotiations with the team. At the same time I am also in contact with the portal team to ensure that the lower casing of the ID issue isn't also in their UI layer.

NOTE: Please ensure that the provider you are using is at least v2.24.0, as that is the version of the provider where a substantial amount of casing normalization was added to the provider.

naikajah commented 3 years ago

Thanks for the update @WodansSon

I can confirm the steps mentioned by @kplantus do not work with an existing AFD. I also tried deleting the WAF policies linked to the AFD and then running terraform apply, but got the same error. Will wait for the fix.

WodansSon commented 3 years ago

> Thanks for the update @WodansSon
>
> I can confirm the steps mentioned by @kplantus does not work with an existing AFD. I also tried deleting the linked WAF policies with the AFD and then running the terraform apply but got the same error. Will wait for the fix.

Hi @naikajah, what version of the provider are you using? I updated my comment above to state that the provider should be at least v2.24.0.

naikajah commented 3 years ago

> > Thanks for the update @WodansSon I can confirm the steps mentioned by @kplantus does not work with an existing AFD. I also tried deleting the linked WAF policies with the AFD and then running the terraform apply but got the same error. Will wait for the fix.
>
> Hi @naikajah, what version of the provider are you using? I updated my comment above to state that the provider should be at least v2.24.0.

I tested it with v2.36.0

camallen commented 3 years ago

I can confirm the steps mentioned by @kplantus (thank you 👍) do work for us on our existing AFD resources. We do not have any WAF configured.

Tested with v2.36.0 of the resource provider and Terraform v0.13.5

It's not ideal to edit the resources directly in the Azure portal, and I'm not sure what will happen if we edit the AFD resources in the portal again. I assume we might re-break the AFD resource definitions.

Hopefully this is useful for the Azure portal team and helps someone else get TF working again.

WodansSon commented 3 years ago

@camallen Your AFD will be fine now since you corrected the casing that was introduced by the portal lowercase issue. You are safe to edit the Frontdoor via portal as well since they have reverted the changes they made that caused this issue in the first place.

UPDATE: I have just heard back from the portal team and they have confirmed that the lower casing issue also exists in the Frontdoor Web Application Firewall Policies UI layer and they are currently working on a fix for that. I have yet to hear back from the AFD team, but I will keep you posted on the progress of this issue once I receive more information. Thank you. 🚀

windsurfer123 commented 3 years ago

We have a very similar problem. We had to upgrade recently to 2.24 because of APIM issues, and now when we try to manage the existing Front Door resources we get errors.

Setup: tf either 0.13.5 or 0.12.9, azurerm v2.37

Problem 1: When creating resources, Terraform sits on "Still creating ..." for over an hour, at which point it is terminated. All objects are created properly. On subsequent runs it wants to update the FD, which of course never finishes and has to be terminated.

Problem 2: When trying to import the Front Door resource created by Terraform in Problem 1, we get:

Error: Error parsing Resource ID "": ID was missing the frontDoors element

Clarification: We have never edited our instances of FD using the portal. After upgrading to azurerm 2.24 (APIM issues) we started receiving:

Error: flattening frontend_endpoint: ID was missing the frontDoorWebApplicationFirewallPolicies element

The only way we found that worked was to destroy the existing FD and rebuild it using the new version of the provider.

Update: tf v0.12.9 + azurerm v2.37 eventually completes building new FDs, unlike tf v0.13.5 + azurerm v2.37, which never completes.

brunoscota commented 3 years ago

Same happening here. I had to roll back the azurerm version to v2.23.0 as a workaround. Now, I am stuck. ;(

camallen commented 3 years ago

> @camallen Your AFD will be fine now since you corrected the casing that was introduced by the portal lowercase issue. You are safe to edit the Frontdoor via portal as well since they have reverted the changes they made that caused this issue in the first place.

Sadly, after editing our FD resources in the Azure portal we now have the same old error and broken TF:

Error: flattening routing_rules: flattening frontend_endpoints: ID was missing the frontendEndpoints element

It looks like the portal is still changing the resource definition from frontendEndpoints to a casing the Azure provider does not expect.

E.g. in the https://resources.azure.com/ portal this is what I see (with redactions) for the routing rules of one of the broken FD resources. Note the different casing on different routing rules:

{
  "id": "/subscriptions/.../resourcegroups/.../providers/Microsoft.Network/Frontdoors/.../FrontendEndpoints/my-custom-domain-org-azurefd-net"
},
{
  "id": "/subscriptions/.../resourcegroups/.../providers/Microsoft.Network/frontdoors/.../frontendendpoints/my-custom-domain"
}
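
To illustrate why the second ID above trips the provider, here is a minimal Go sketch of a case-sensitive ID lookup. It is not the provider's actual parser: `findSegment` is a hypothetical helper, and the IDs in `main` are made-up stand-ins for the redacted ones. It only assumes that an Azure resource ID is a sequence of key/value path segments.

```go
package main

import (
	"fmt"
	"strings"
)

// findSegment (hypothetical helper) walks a resource ID as key/value pairs
// and returns the value that follows key. The comparison is deliberately
// case-sensitive, to mimic the strict behaviour reported in this thread.
func findSegment(id, key string) (string, error) {
	parts := strings.Split(strings.Trim(id, "/"), "/")
	for i := 0; i+1 < len(parts); i += 2 {
		if parts[i] == key {
			return parts[i+1], nil
		}
	}
	return "", fmt.Errorf("ID was missing the `%s` element", key)
}

func main() {
	// Made-up IDs: the first uses the expected casing, the second is all lowercase.
	ids := []string{
		"/subscriptions/sub/resourceGroups/rg/providers/Microsoft.Network/frontDoors/fd/frontendEndpoints/good",
		"/subscriptions/sub/resourcegroups/rg/providers/Microsoft.Network/frontdoors/fd/frontendendpoints/bad",
	}
	for _, id := range ids {
		name, err := findSegment(id, "frontendEndpoints")
		fmt.Printf("name=%q err=%v\n", name, err)
	}
}
```

With a strict comparison like this, any ID whose key segments come back in a different casing looks as if the segment were absent, even though the resource itself is fine.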

KyMidd commented 3 years ago

Hey @WodansSon and $MSFT internal team, is it possible to provide an ETA for when this collection of resources will be repaired at the API layer? So far we're getting updates after a fix is made but nothing to give us an estimate of how long we'll need to wait for this fix to be implemented. Are we looking at days/weeks/months?

The healthcare teams I support use these resources for security and the time-frame you provide can help inform if we should switch to an alternate IaC solution or other workarounds.

Thanks in advance! kyler

WodansSon commented 3 years ago

@KyMidd Thank you for your question. I am still attempting to get in contact with the Front Door service team about this and have begun to escalate the issue internally to force some action about getting this fixed. Unfortunately I am not able to provide an ETA at this time, but as soon as I have any new news I will promptly update the status here.

subesokun commented 3 years ago

Somehow I'm now in a kind of deadlock. I need to downgrade to 2.23.0 because of this bug, but when I do so I run into another frontdoor related TF bug (https://github.com/terraform-providers/terraform-provider-azurerm/issues/8036) which was solved in 2.24.0. Very frustrating.

WodansSon commented 3 years ago

@subesokun I am sorry to hear that, but please be patient. @tombuildsstuff and I have been working on this issue and I believe we have a solution in the provider which should fix 99.999% of the issues that are currently being hit... We are still testing the solution, but so far it all looks good... again, I am sorry for this pain, but we are doing all we can to correct this issue.

tombuildsstuff commented 3 years ago

Ultimately this'll be fixed via #9750, which ignores the casing returned from the Azure API and rewrites it to be consistent on Terraform's side. Whilst there are downsides to that approach (and the Service Team ultimately needs to fix these bugs in the API), this should work around this series of API bugs for the moment.
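
As a rough illustration of the idea (not the code from #9750), the Go sketch below rewrites the key segments of a resource ID to one fixed casing while leaving resource names untouched. `normalizeID` and `canonicalKeys` are hypothetical names, and the key list is an assumption drawn only from the segment names mentioned in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalKeys is the casing this sketch treats as canonical. The list is
// illustrative only, taken from the segments mentioned in this thread.
var canonicalKeys = []string{
	"subscriptions", "resourceGroups", "providers", "frontDoors",
	"frontendEndpoints", "healthProbeSettings",
	"frontDoorWebApplicationFirewallPolicies",
}

// normalizeID (hypothetical helper) rewrites every key segment that matches
// a canonical key case-insensitively. Value segments, such as resource
// names and the provider namespace, sit at odd positions and are not touched.
func normalizeID(id string) string {
	parts := strings.Split(strings.Trim(id, "/"), "/")
	for i := 0; i < len(parts); i += 2 {
		for _, key := range canonicalKeys {
			if strings.EqualFold(parts[i], key) {
				parts[i] = key
				break
			}
		}
	}
	return "/" + strings.Join(parts, "/")
}

func main() {
	// Made-up ID with the kind of mixed casing reported above.
	fmt.Println(normalizeID("/subscriptions/sub/resourcegroups/rg/providers/Microsoft.Network/Frontdoors/fd/FrontendEndpoints/fe"))
}
```

The trade-off of this kind of approach is that the state no longer mirrors exactly what the API returned, which is presumably one of the downsides alluded to above.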

KyMidd commented 3 years ago

@WodansSon and @tombuildsstuff : Sincere thank you from myself and on behalf of the community here for working on a workaround for this issue. I have several teams affected, and despite us pushing hard on Microsoft's internal enterprise support, we haven't made any progress at all. I think we're all aware here that the real root cause of this issue is standard-breaking API implementation on Azure side, and from your user-base I want to say: Thank you.

ajklotz commented 3 years ago

@WodansSon @tombuildsstuff Is there a ticket where we can add our voice of support? Anything we can do to help MS understand there is attention from all of us regarding it?

WodansSon commented 3 years ago

@ajklotz I don't believe there is a public facing ticket or support request for this issue. That said, I have included a link to this issue in all internal communications for this problem so they are very much aware of the contention the changes in the API have caused. Thank you for asking!

camallen commented 3 years ago

Thanks @tombuildsstuff & @WodansSon for fixing this 🎩 and thanks to all the folks that contributed to the issue 👍.

At 9th Dec 2020 10:25 UTC it appears that this fix is unreleased https://github.com/terraform-providers/terraform-provider-azurerm/blob/2df9c3193a43380e16f0000f3366b221b31d6c74/CHANGELOG.md#2400-unreleased

Is there a timeline for this fix being officially released? I'm very keen (and I assume so are a lot of folks) to test these fixes and get TF integrations working again.

FWIW I can't see how to use this unreleased provider version via https://www.terraform.io/docs/configuration/provider-requirements.html#version-constraints - maybe I'm missing something....

Any ideas or is it wait till the official release?

katbyte commented 3 years ago

@camallen, this will go out with our weekly release this Thursday (tomorrow)

KyMidd commented 3 years ago

I'm sorry to report that I compiled the azurerm provider from master as of this morning and we are seeing the same type of issue on a refresh, which Terraform is unable to handle.

$ terraform refresh
(removed)
Error: expected "custom_rule.0.match_condition.0.selector" to not be an empty string, got 
  on main.tf line 27, in resource "azurerm_frontdoor_firewall_policy" "frontdoor":
  27: resource "azurerm_frontdoor_firewall_policy" "frontdoor" {
Error: expected "custom_rule.1.match_condition.0.selector" to not be an empty string, got 
  on main.tf line 27, in resource "azurerm_frontdoor_firewall_policy" "frontdoor":
  27: resource "azurerm_frontdoor_firewall_policy" "frontdoor" {

The FrontDoor and FrontDoor WAF rules were created from terraform using the new provider. Note the differing capitalization of "frontDoor" vs "frontdoor". I redacted the account, otherwise unchanged.

/subscriptions/./resourceGroups/TestingKyler/providers/Microsoft.Network/frontDoorWebApplicationFirewallPolicies/KylerTestingWafPolicy

And from the portal:

/subscriptions/./resourceGroups/TestingKyler/providers/Microsoft.Network/frontdoorWebApplicationFirewallPolicies/KylerTestingWafPolicy

If possible, I'd like to see this issue reopened. Thanks team.

WodansSon commented 3 years ago

@KyMidd Sorry that you are hitting that issue, however the casing should not matter anymore, I believe you are hitting a validation rule. Do you have a repro for this? If I can get a clear repro for this issue I might be able to get a fix in before the next release. Thank you.

I just looked at the code. What appears to be going on is that you have selector = "" in the config file, which would trigger the validation rule below. Can you confirm whether this is the case in your config file?

"selector": {
    Type:         schema.TypeString,
    Optional:     true,
    ValidateFunc: validation.StringIsNotEmpty,
},
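
For context, a validator with this shape receives the attribute value and its key, and returns warnings and errors. The Go sketch below is a standalone re-implementation under the hypothetical name `stringIsNotEmpty`; it is an assumption about the behaviour, written to match the error text reported above, and is not copied from the SDK source.

```go
package main

import "fmt"

// stringIsNotEmpty (hypothetical stand-in) mirrors the assumed behaviour of
// the validator referenced above: it rejects non-string values and empty
// strings, and accepts everything else.
func stringIsNotEmpty(i interface{}, k string) ([]string, []error) {
	v, ok := i.(string)
	if !ok {
		return nil, []error{fmt.Errorf("expected type of %q to be string", k)}
	}
	if v == "" {
		return nil, []error{fmt.Errorf("expected %q to not be an empty string, got %v", k, i)}
	}
	return nil, nil
}

func main() {
	// An empty selector, as suspected in the config being discussed.
	_, errs := stringIsNotEmpty("", "custom_rule.0.match_condition.0.selector")
	for _, err := range errs {
		fmt.Println("Error:", err)
	}
}
```

Under that assumption, an empty selector value is rejected regardless of ID casing, which would explain why the failure looks different from the flattening errors earlier in the thread.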

KyMidd commented 3 years ago

Hey @WodansSon , Oh, I assumed Terraform was modifying the casing on import.

As for repro, definitely.

  1. Using this config, simple WAF + FrontDoor: https://github.com/KyMidd/azurerm-frontdoor-testing-repro
  2. Compile from master azurerm and store properly
  3. Authenticate to Azure via az login
  4. terraform init (should find local azurerm)
  5. terraform apply <-- Resources built properly
  6. terraform refresh <-- Terraform fails with error message

Let me know if I can help further by testing a PR. Thank you!

KyMidd commented 3 years ago

@WodansSon : Update, I have destroyed and recreated this same config several times now to make sure the same error is generated, and I'm unable to replicate it. I'm not sure what happened there. I am now seeing FrontDoor reliably managed by terraform!

WodansSon commented 3 years ago

@KyMidd, I am unable to repro the behavior you are reporting above. That would explain my results as well! Awesome! 🚀

I followed your steps with the same config file and when I execute step 6 I get:

Refresh

azurerm_resource_group.testing_kyler: Refreshing state... [id=/subscriptions/{subscription}/resourceGroups/XXXXXX-frontDoor-Repro]
azurerm_frontdoor_firewall_policy.frontdoor: Refreshing state... [id=/subscriptions/{subscription}/resourceGroups/XXXXXX-frontDoor-Repro/providers/Microsoft.Network/frontDoorWebApplicationFirewallPolicies/KylerTestingWafPolicy]
azurerm_frontdoor.example: Refreshing state... [id=/subscriptions/{subscription}/resourceGroups/XXXXXX-frontDoor-Repro/providers/Microsoft.Network/frontDoors/XXXXXX-testing-frontdoor]

Plan

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

azurerm_resource_group.testing_kyler: Refreshing state... [id=/subscriptions/{subscription}/resourceGroups/XXXXXX-frontDoor-Repro]
azurerm_frontdoor_firewall_policy.frontdoor: Refreshing state... [id=/subscriptions/{subscription}/resourceGroups/XXXXXX-frontDoor-Repro/providers/Microsoft.Network/frontDoorWebApplicationFirewallPolicies/KylerTestingWafPolicy]
azurerm_frontdoor.example: Refreshing state... [id=/subscriptions/{subscription}/resourceGroups/XXXXXX-frontDoor-Repro/providers/Microsoft.Network/frontDoors/XXXXXX-testing-frontdoor]

------------------------------------------------------------------------

No changes. Infrastructure is up-to-date.

This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, no
actions need to be performed.

ghost commented 3 years ago

This has been released in version 2.40.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.40.0"
}
# ... other configuration ...

jacksondaw commented 3 years ago

This issue was still occurring for me when I added a second frontend. I tried upgrading to 2.40.0 and downgrading to 2.23.0, both were throwing this error in relation to the existing frontend.

Error: updating Custom HTTPS configuration for Frontend Endpoint "*" (Front Door "" / Resource Group "FrontDoor"): unable to enable/update Custom Domain HTTPS for Frontend Endpoint "" (Resource Group "FrontDoor"): enabling Custom Domain HTTPS for Frontend Endpoint: frontdoor.FrontendEndpointsClient#EnableHTTPS: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="That action isn’t allowed in this profile."

I ended up pulling the HTTPS configuration out into the frontdoor_custom_https_configuration resource. The error is still thrown, but the resources are created?

ghost commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!