Azure / deployment-stacks

Contains Deployment Stacks CLI scripts and releases
MIT License

Proposal: removing snapshots from stacks #60

Closed bmoore-msft closed 1 year ago

bmoore-msft commented 2 years ago

Over the past few weeks we've had some 1:1 conversations with many of you and that has helped us to refine our thinking on scenarios for the snapshots child resource on a stack. TL;DR we want to remove the snapshot resource from the stack.

Currently a snapshot is created for every PUT on the stack, and it contains details about the operations of the stack. Most of this detail is also part of the stack itself. The original intent was that snapshots would be used for "lazy" debugging (meaning not necessarily immediate, since the stack itself can be used for immediate debugging) and for some scenarios where a previous state of a stack could be applied via a snapshot, since the snapshot retains the template and parameters from a deployment.

Through these conversations (and many internal ones) it became clear that simply applying a snapshot may not be sufficient to restore the state of an environment. Pipelines often have many steps in addition to a template deployment that would also need to be "applied" should an environment need to revert changes. As well, the source of truth for the environment now becomes a snapshot instead of code checked into source control. So while a snapshot apply could be an interesting feature, it may not be as helpful as we originally thought. As well, learning to rely on it may be an anti-pattern when source code is the source of truth.

As for "debugging" we see 2 scenarios - one is the "immediate" debugging from a failed stack where the snapshot contains some detailed information. In many cases, that information is already part of the stack or the response and we would fill those gaps should we remove snapshots.

The other is for "longer term" debugging - in this case the stack has changed state many times, maybe over days/weeks and there is a need to find which iteration of the stack deleted a particular resource, for example. For this we want to leverage the Activity Log - as this is the place for all other/similar operations and we don't want to fragment the approach that customers use for common operations (e.g. DELETE). Removing the snapshot also reduces the conceptual overhead for customers trying to learn how to leverage logging in Azure for auditing these operations.

That's the proposal and our thinking behind it - let us know if you have any feedback.

slavizh commented 2 years ago

I am OK with this, especially if other features of the stack benefit from it by being more refined on release. Is there a tentative date for the next preview? Besides some of the API changes on stacks, can we also expect a better experience when we use stacks with the Az PS module? As I mentioned in earlier feedback, the previous experience was not on par with what we have in regular deployments.

bmoore-msft commented 2 years ago

We do want to make the experience of creating a stack a superset of creating a deployment [resource]. That should be in the next release, and feel free to nit-pick on anything that is not up to par.

As for when that will happen? We're shooting for Q3 - there are a lot of code changes, so I suspect we will end up going through test/debug/redeploy once or twice before we're done. And I've learned not to predict how long a deployment will take.

slavizh commented 2 years ago

@bmoore-msft no worries. I will not hold you to that date. You know me: I prefer that things get released when they are in good shape rather than releasing something and then having it pulled back. It gives me a rough estimate of what to focus on now and what I can focus on in the future.

henrybeen commented 2 years ago

I think there are two strong arguments I hear:

I second those.

What we would lose (being able to diff between two snapshots) is something that sounds like a great feature, but I cannot say that I have ever missed it before now. So far there were always other ways to find the answer anyway (source control mainly).

tvuorenmaa89 commented 2 years ago

This makes sense and the debugging scenarios / solutions look good.

J0F3 commented 2 years ago

I fully agree with what is already written here. I am perfectly fine with the decision. I especially welcome the idea to leverage the Activity Log for logging and "long term debugging".
I also like the point that source code should be the source of truth and that snapshots should not interfere with that. Removing the snapshots feature also makes it less likely that people who are new to IaC and CI/CD will mess things up or get confused by it.