undiverged not working correctly as a plan requirement

nwsparks commented 11 months ago

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

When running atlantis plan on an out of date branch, it results in inconsistent behavior as seen below. I suspect the undiverged check is not running early enough. It should probably execute before any plans are run at all? My guess is that parallel planning is throwing it off.

Reproduction Steps

Set up a repo that plans multiple paths in parallel. Make an out of date PR Run atlantis plan

Environment details

Atlantis version: 0.27.0
Deployment method: ecs

Atlantis server-side config file:

- id: /.*/
  apply_requirements:  [approved, mergeable, undiverged]
  import_requirements: [approved, mergeable, undiverged]
  plan_requirements:   [undiverged]

Repo atlantis.yaml file:

parallel_plan: true
parallel_apply: true

nitrocode commented 11 months ago

Did it work with 0.26.0?

nwsparks commented 11 months ago

no. undiverged didn't work at all until 0.27.0

https://github.com/runatlantis/atlantis/issues/3254 https://github.com/runatlantis/atlantis/pull/3832

kpocius commented 9 months ago

Tested this on 0.27.1 with both parallel plan and apply turned off -- it still goes ahead an happily plans the changes.

xakraz commented 9 months ago

Hi there 👋🏻

I am also interested in the undiverged command requirement 😇

Unfortunately, I was not able to validate its functionality. 😞 I never end up with an Atlantis comment like on the issue's screenshot for any of my TF project despite GitHub telling me the branch is out-of-date/behind the base branch.

Plan failed: Default branch must be rebased onto pull request before running plan

How does it work at all?

Notes:

I have tested with Atlantis 0.26.0 and 0.27.1
I have tested it with the checkout-strategy=merge since it is written in the doc that undiverged does not work with the branch strategy

Many thanks for your help and explanations 🙏🏻

EDIT

Ok I was able to validate it and indeed, it does not work with Atlantis < 0.27.0 However, I can not tell (yet) for parallel/multi-projects run 😅

albertorm95 commented 8 months ago

I just tested on 0.27.1 and didn't work 😢

lukemassa commented 6 months ago

I'm starting to look into this, and one question I have is, should we even support undiverged as a plan requirement? What's the benefit? Given that it only works in the merge strategy, if you're about to plan a PR that has diverged, atlantis is already doing the "right thing", which is incorporating main via the merge checkout before the plan. Why stop and tell the user to address the divergence if we're about to do it for them?

EDIT: To be clear I do agree that there is an issue here, at the very possibly an inconsistency due to a race condition. I'm also happy to work on a fix if we decide that's the right way to go, I just want to understand the use case better.

nwsparks commented 6 months ago

@lukemassa yah it's pretty important to get this working IMO. Here is a scenario

repository is correctly set up with branch protection and requirement that branches are up to date before merging
engineer makes PR, it sits for a day or two while other pr's are merged before it is reviewed with approvals
PR is now behind main. engineer runs atlantis plan which works.
After or during the plan engineer hits the update button to clear the blocker on merge
They can now freely apply the stale plan

Obviously they should be looking at the plan and noticing problems there, but there are many cases where something could be missed. Tired engineer, incompetent engineer, plan that would already have many changes in it and other small changes it is undoing are missed, etc.

Ideally the scenario above should have never let them run the plan since it was behind main, and ideally would not let them apply it either.

lukemassa commented 6 months ago

PR is now behind main. engineer runs atlantis plan which works.

My understanding of https://www.runatlantis.io/docs/checkout-strategy.html#merge, and I was able to confirm this at least in a toy example, is that when you go to run that plan, it checks out main at origin then does the merge. So the actual plan that's created is not behind. Is this not what you're seeing?

nwsparks commented 6 months ago

It does sound like that would solve it, but the setting defaults to branch...so anyone unaware of the setting could fall victim to problems with undiverged.

I'm not sure in which scenario anyone would even want to use the branch option, but if they did they might want undiverged options as well.

lukemassa commented 6 months ago

True, however undiverged is already documented to only work if checkout strategy is merge.

Applies to merge checkout strategy only which you need to set via --checkout-strategy flag.

From https://www.runatlantis.io/docs/command-requirements.html#undiverged

I'm not sure we have the ability to detect divergence in a branch situation.

So maybe we remove support for undiverged as a plan requirement (i.e., have startup fail if it's set, update docs). As noted it doesn't work currently so we're not losing much, and I don't think in general it's desirable.

Interested to hear from other commenters here @albertorm95 @xakraz @kpocius ?

EDIT: Additionally we maybe should have startup fail if undiverged is set as an apply/import requirement but checkout strategy is not set to checkout, since it's not going to work 🤔

xakraz commented 6 months ago

Thanks @lukemassa for giving a look at the issue 🙏

I like the startup fail idea but what about repos that use atlantis.yaml with override of a command requirement and specify undiverged despite the fact that Atlantis server is configured with branch checkout strategy?

Time does things well :) I should be able to test this with parallel run on multiple projects in the next weeks.

By the way, I was wondering about the usefulness of the undiverged option too given the branch protection rule requirement of the PR branch to be up-to-date while using Atlantis requirement of mergeable 🤔

albertorm95 commented 6 months ago

@xakraz

By the way, I was wondering about the usefulness of the undiverged option too given the branch protection rule requirement of the PR branch to be up-to-date while using Atlantis requirement of mergeable 🤔

There is a problem with the up-to-date branch protection, we have a monorepo of projects so its very hard (basically a race) if we enable that there.

nwsparks commented 6 months ago

@lukemassa thanks for all the details. I think the arguments against undiverged are reasonable and some testing with 7 environments in parallel does seem to indicate no issues with the merge strategy. Although interestingly undiverged seems entirely broken now vs partially broken when I opened this issue.

The last thing I have in favor of undiverged is that operations are much clearer and there is a better sense of security when the plan is blocked with a message indicating what is happening. Personally I prefer this over things being a bit magical behind the scenes. I understand if it is difficult to support it though.

albertorm95 commented 6 months ago

This feature is very useful for us, any way we can help on the investigation of how this got broken? @nwsparks @lukemassa

Thanks!

lukemassa commented 6 months ago

The last thing I have in favor of undiverged is that operations are much clearer and there is a better sense of security when the plan is blocked with a message indicating what is happening. Personally I prefer this over things being a bit magical behind the scenes.

Yeah one thing I'm still bit confused on is, if you've chosen the "merge" checkout strategy, you are explicitly indicating to Atlantis that you want it to handle the divergence behind the scenes. If you don't want that then you shouldn't use the checkout strategy. And if instead you use the "branch" checkout strategy, "undiverged" has no effect as documented (and as I mention above, we should probably fail at startup if you try).

@albertorm95 could you explain your use case? Are you using the merge or branch strategy?

albertorm95 commented 6 months ago

Sure, this is our scenario:

We have a monorepo where multiple teams collab and manage their infrastructure

We currently use branch strategy, we have no problem with switching to merge.
Our repository does not required rebase before merging since there are a lot of changes happening, it will be unmanageable.

Undiverged will help us to trust the plans generated, since in theory it will detect if there are changes in main that are not in the PR's branch.

cc: @lukemassa

lukemassa commented 5 months ago

We currently use branch strategy, we have no problem with switching to merge.

undiverged only applies to the merge checkout strategy, so unless I'm misunderstanding something there's no regression here.

The scenario you describe should be consistent with having an apply requirement for undiverged right?

Again, when using the merge checkout strategy, before actually running a plan, the content from the branch is "merged" with then fetched origin/main, so there's already no way to plan without new updates to main. A plan requirement that prevented that from happening would defeat the purpose of the merge checkout strategy entirely, at which point you might as well use branch.

I'm still leaning towards just an update to documentation/config validation to clarify this is an apply-only requirement.

cc @GenPage @jamengual

jamengual commented 5 months ago

I agree to do a Documentation update to make this clear.

runatlantis / atlantis