Open fivetran-seanmeisner opened 8 months ago
Thanks @fivetran-seanmeisner this seems mostly doable. We'll put it on the backlog to plan/schedule it 👍
At this point, I think what we can do here is properly detect when a delete fails and stop that failure from propagating to the rest of the Terraform operations. That should make each replacement more atomic. However, the order of operations is up to Terraform itself: it builds up a graph of operations and runs things in parallel. So technically it could (should?) run each destroy before the corresponding create, but what might happen in practice is that all the destroys end up happening first.
Following up on discussions with Buildkite support
We ran a Terraform apply that attempted to destroy and recreate (replace) a group of pipelines.
Because a number of these pipelines were running, the apply errored out and failed partway through.
The behavior was:
1. Destroyed ALL pipelines <-- Failed
2. Skipped Create ALL pipelines
We'd rather this apply did not fail, or, if it does fail, it should at least recreate every pipeline it has already destroyed before stopping.
I'd suggest that the fatal error on running pipelines should be changed to a warning that doesn't stop the apply from recreating.
It would also make sense to issue each destroy/create as an atomic pair, one at a time, rather than destroying everything up front.
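To make the difference between the two orderings concrete, here is a rough toy model in Python. It is only a sketch: it does not use Terraform or the Buildkite provider, and every name in it (`PipelineBusy`, `destroy`, `bulk_replace`, `pairwise_replace`, `run_demo`) is hypothetical. It assumes a pipeline with running builds cannot be deleted, and that in the pairwise case such a pipeline is left in place with a warning.

```python
# Toy model only: not Terraform or the Buildkite provider.
# All names here are hypothetical and exist just for illustration.


class PipelineBusy(Exception):
    """Raised when a pipeline with running builds cannot be deleted."""


def destroy(state, name, running):
    """Remove a pipeline from the state, unless it has running builds."""
    if name in running:
        raise PipelineBusy(name)
    state.discard(name)


def bulk_replace(state, names, running):
    """Destroy everything first, then create everything."""
    for name in names:
        # A failure here propagates, so none of the creates below run.
        destroy(state, name, running)
    for name in names:
        state.add(name)


def pairwise_replace(state, names, running):
    """Replace one pipeline at a time, warning instead of failing."""
    warnings = []
    for name in names:
        try:
            destroy(state, name, running)
        except PipelineBusy:
            # The busy pipeline was never deleted, so it is still present.
            warnings.append(f"{name}: builds running, left unchanged")
            continue
        state.add(name)  # recreate immediately after a successful destroy
    return warnings


def run_demo():
    names = ["a", "b", "c"]
    running = {"c"}  # assume only pipeline "c" has running builds

    bulk_state = set(names)
    try:
        bulk_replace(bulk_state, names, running)
    except PipelineBusy:
        pass  # the apply stops partway through

    pair_state = set(names)
    warnings = pairwise_replace(pair_state, names, running)
    return bulk_state, pair_state, warnings
```

In this model the bulk ordering ends with only the busy pipeline left, because the pipelines destroyed before the failure are never recreated. The pairwise ordering ends with all three pipelines present and a single warning for the one that could not be replaced.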