pamelafox opened 2 months ago
I wonder what sort of semantics we want to apply here. The deployment state caching feature was built around the idea that folks wouldn't go mucking with resources managed by their azd deployment in the portal, so we could trust the deployment history as a way to detect whether we can skip things or not.
We certainly could augment the heuristic to say something like "it is safe to skip the ARM deployment if the most recent deployment succeeded, its template hash matches our current template hash, AND the resource groups impacted by the deployment still exist." That would address this issue. But imagine now that you delete some individual resource in that resource group. Do we expect that azd provision should detect this and force a full deployment? What about changing a property on a resource using the portal? Is the expectation that that would be detected, and provision would be forced to run? In the limit we would be rebuilding what the ARM deployment engine does, and that doesn't seem right.
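The augmented heuristic described above could be sketched as a pure decision function. This is only an illustration with hypothetical names (canSkipDeployment, rgExists, etc.); azd's actual provisioning code is more involved:

```go
package main

import "fmt"

// canSkipDeployment is a hypothetical sketch of the augmented skip heuristic:
// skip only if the last deployment succeeded, its template hash matches the
// current template, and every resource group it touched still exists.
func canSkipDeployment(lastSucceeded bool, lastHash, currentHash string, touchedRGs []string, rgExists func(string) bool) bool {
	if !lastSucceeded || lastHash != currentHash {
		return false
	}
	for _, rg := range touchedRGs {
		if !rgExists(rg) {
			return false // an RG was deleted out of band; re-run the deployment
		}
	}
	return true
}

func main() {
	exists := map[string]bool{"rg-app": true} // pretend "rg-data" was deleted in the portal
	lookup := func(rg string) bool { return exists[rg] }
	fmt.Println(canSkipDeployment(true, "abc123", "abc123", []string{"rg-app"}, lookup))            // true: safe to skip
	fmt.Println(canSkipDeployment(true, "abc123", "abc123", []string{"rg-app", "rg-data"}, lookup)) // false: an RG is gone
}
```

Note that this still says nothing about individual resources deleted or modified inside a surviving resource group, which is exactly the slippery slope described above.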
We added this check because we found in practice that it was faster than submitting the deployment template and waiting for that no-op deployment to complete. This worked as long as you promised not to go behind azd's back and muck around with your infrastructure; deleting resources out of band breaks that promise.
Maybe we should instead focus on improving the error behavior here, and advise the customer to run azd provision --no-state in this case?
Or maybe there's a middle ground where we just ensure that all the resources touched by the deployment still exist (or maybe just the RGs), without looking at individual property values when deciding whether we can skip the deployment. I worry that in practice this will make the end to end much slower, and then maybe we arrive at a place where just submitting the deployment (and working with the ARM team to figure out how we can improve the speed of these no-op deployments) is the right call?
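A minimal sketch of that middle ground, checking only that the resource groups referenced by the deployment's resource IDs still exist, without diffing any individual property values. The helper names and the injected existence check are hypothetical, not azd's real API:

```go
package main

import (
	"fmt"
	"strings"
)

// resourceGroupOf extracts the resource group segment from an ARM resource ID
// of the form /subscriptions/<sub>/resourceGroups/<rg>/providers/...
// (hypothetical helper for illustration).
func resourceGroupOf(resourceID string) string {
	parts := strings.Split(resourceID, "/")
	for i := 0; i < len(parts)-1; i++ {
		if strings.EqualFold(parts[i], "resourceGroups") {
			return parts[i+1]
		}
	}
	return ""
}

// missingGroups returns the distinct resource groups referenced by the
// deployment's resource IDs that no longer exist, per the supplied check.
func missingGroups(resourceIDs []string, rgExists func(string) bool) []string {
	seen := map[string]bool{}
	var missing []string
	for _, id := range resourceIDs {
		rg := resourceGroupOf(id)
		if rg == "" || seen[rg] {
			continue
		}
		seen[rg] = true
		if !rgExists(rg) {
			missing = append(missing, rg)
		}
	}
	return missing
}

func main() {
	ids := []string{
		"/subscriptions/000/resourceGroups/rg-app/providers/Microsoft.Web/sites/api",
		"/subscriptions/000/resourceGroups/rg-data/providers/Microsoft.Storage/storageAccounts/st1",
	}
	exists := func(rg string) bool { return rg == "rg-app" } // pretend rg-data was deleted
	fmt.Println(missingGroups(ids, exists))
}
```

Stopping at the RG level keeps the check to a handful of cheap existence calls per provision, rather than one call per resource.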
As a quick measure, perhaps you could mention azd provision --no-state in the output when it skips the deployment. I think I had to dig around for it.
I did end up using --no-state frequently today, since it felt like it wasn't deploying the Bicep I had actually changed (between azd env select calls), but I may have just been seeing things.
Output from azd version:
fastapi-azure-function-apim % azd version
azd version 1.8.0 (commit 8246323c2472148288be4b3cbc3c424bd046b985)
Describe the bug
I ran azd up against a resource group that I probably deleted at some point. I got this error:

To Reproduce
You probably need to run azd up, then delete the resource group in the Portal, then run azd up again (with no Bicep change).
Expected behavior
It should have provisioned.
Environment
Information on your environment: macOS (M1), Terminal
Additional context
To work around this, I will call azd provision --no-state.