hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs

Improved behaviour of azurerm_resource_group_template_deployment on error #9713

Open sharebear opened 3 years ago

sharebear commented 3 years ago

Description

When applying a new azurerm_resource_group_template_deployment resource, if the deployment fails (for example due to insufficient permissions, a referenced resource not existing, or a required resource provider not being registered; all three issues I had last week), the id of the deployment is not committed to the Terraform state. As a consequence, a subsequent terraform plan identifies the resource for creation, but terraform apply fails with the following error message:

Error: A resource with the ID "/subscriptions/<redacted>/resourceGroups/<redacted>/providers/Microsoft.Resources/deployments/eventgrid-deployment" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_resource_group_template_deployment" for more information.

Importing the resource doesn't really help either: the template contents haven't changed, so Terraform takes no action to redeploy the template.
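For reference, the import the error message asks for looks roughly like the command below; the resource address azurerm_resource_group_template_deployment.example and the placeholder segments are illustrative, not values from the original report.

    terraform import azurerm_resource_group_template_deployment.example "/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.Resources/deployments/eventgrid-deployment"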

The behaviour I think I would prefer here is:

  1. If the apply fails while creating a deployment, the provider should still check whether the deployment was actually created, and add its id to the state if it is found.
  2. The deployment's provisioning state should be checked during terraform plan; if the state is "Failed", this should trigger a diff so that terraform apply redeploys the deployment.

I'm still fairly fresh in the Azure world, so I'm open to criticism as to whether my expected behaviour is reasonable.

New or Affected Resource(s)

  * azurerm_resource_group_template_deployment

Potential Terraform Configuration

A concrete test case can be produced if desired.
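A minimal sketch of the kind of configuration that reproduces this, assuming an Event Grid topic deployed through an inline ARM template; all names and the template body are illustrative rather than taken from the original report:

    resource "azurerm_resource_group" "example" {
      name     = "example-resources"
      location = "West Europe"
    }

    resource "azurerm_resource_group_template_deployment" "eventgrid" {
      name                = "eventgrid-deployment"
      resource_group_name = azurerm_resource_group.example.name
      deployment_mode     = "Incremental"

      # If this template fails to deploy (insufficient permissions, missing
      # resource provider, etc.), the deployment id is not written to state,
      # and the next apply reports "A resource with the ID ... already exists".
      template_content = jsonencode({
        "$schema"      = "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#"
        contentVersion = "1.0.0.0"
        parameters     = {}
        variables      = {}
        resources = [
          {
            type       = "Microsoft.EventGrid/topics"
            apiVersion = "2020-06-01"
            name       = "example-topic"
            location   = "[resourceGroup().location]"
          }
        ]
      })
    }

The failure modes described above (permissions, unregistered providers) occur while Azure executes the template, after the deployment object itself has been created, which is why the deployment exists on the Azure side even though its id never reaches the Terraform state.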

KillianW commented 3 years ago

I am seeing the same problem (very infrequently). The underlying cause for me seemed to be a failure of a single inner resource, which was left in a 'corrupt' state, leaving the deployment marked as failed. At present, my only option has been to manually remove the deployment and the resources it did manage to provision, and then rerun the TF plan & apply steps.
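For what it's worth, the manual cleanup described above can probably be scripted; something along these lines removes the failed deployment record with the Azure CLI (the resource group and deployment names are placeholders), though any resources the deployment did manage to provision still have to be deleted separately:

    az deployment group delete --resource-group <resource-group-name> --name <deployment-name>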

marrobi commented 3 years ago

@sharebear @tombuildsstuff I think this would be very useful. Having to manually delete the deployments after every failed template deployment is very frustrating.