Open javajawa opened 10 months ago
Without adding additional complexity in atlantis and building the code for each git hosting service (github, bitbucket, etc), we can defer this to said system.
For GitHub (however this can be used with other cicd systems), have you considered
Then the workflow would be the following for renovate and human users
That indeed works excellently for GitHub.
GitLab does not have the concept of expecting specific external jobs (it treats CI at the pipeline level, not as a set of jobs). This makes the workflow described above less portable, but it is possible by looking for the atlantis/apply job on the current pipeline.
Additionally, the latest version of atlantis does not seem to register the apply check as part of the pipeline until atlantis apply is run directly, even if there are no planned changes. That leaves text parsing as the only currently available option (that I'm aware of) for detecting empty plans.
Additionally, the latest version of atlantis[2] does not seem to register the apply check as part of the pipeline until atlantis apply is directly
Was this an intentional change? I'm seeing this behavior change, running atlantis v0.27.1 and Gitlab 15.11.
Did some more detailed testing of behaviour for Atlantis v0.27.0 against self-hosted premium gitlab at 15.11.
Atlantis is reporting atlantis/apply as a complete job in the pipeline if there are no changes.
Therefore, I can in principle write some tooling which checks for this in conjunction with the author of the merge request being renovate bot, adds an approval, and triggers the actual apply (as "no plan changes" != "no state changes").
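A minimal sketch of the decision such tooling would make, assuming the merge request author's username and the state of the atlantis/apply commit status have already been fetched. The function name, the `renovate-bot` username, and the way the inputs are passed in are all assumptions for illustration; `success` is one of GitLab's commit status values.

```shell
#!/bin/sh
# Hypothetical check: auto-approve only when the MR comes from the renovate
# bot AND the atlantis/apply commit status is already "success", which
# (per the testing above) Atlantis reports when there are no planned changes.
RENOVATE_USER="${RENOVATE_USER:-renovate-bot}"  # assumed bot username

should_auto_approve() {
    author="$1"        # username of the merge request author
    apply_status="$2"  # state of the atlantis/apply commit status
    [ "$author" = "$RENOVATE_USER" ] && [ "$apply_status" = "success" ]
}

# Usage:
#   if should_auto_approve "$MR_AUTHOR" "$APPLY_STATUS"; then
#       echo "approve the MR, then comment 'atlantis apply'"
#   fi
```

The apply is still triggered after approval, since an empty plan can still update the state file.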
I will leave this feature request open because I believe the idea of atlantis applying actual labels or approvals is a capability worth continued discussion, but I will attempt to edit the description to better reflect the reality of the problem.
I think it can be a problem to run apply in case of "No Changes", because all the workflows (at least custom ones) depend on the plan file existing. And you don't always want to exit successfully from atlantis/apply if the plan file doesn't exist, because there can be other reasons why the plan doesn't exist.
Surely Atlantis should be saving a plan file even if there are no changes? Just because there's no API operations to perform doesn't mean there isn't other interaction.
I think we should possibly clarify the meaning of "no changes" here?
A terraform operation has four points of interaction -- the HCL code (configuration), the state file, the binaries being run, and configured infrastructure (reality).
Terraform reports "no changes" when the binaries have no changes to make to reality based on the configuration. I am presuming that Atlantis maintains this contract. However, there are a number of other operations that happen during a terraform apply that may still be relevant:
moved{} blocks are evaluated into state
These are all changes that should be applied to correctly sync the four points, but they are not going to affect reality (and thus, in the workflow I'm describing, do not need prior human authorization).
Terraform doesn't create a plan file if there are no changes, it exits with non-zero error code instead.
That's...trivially provable as "not true"?
mkdir test && cd test || exit 1
echo 'terraform {}' >main.tf
terraform --version
terraform init
terraform plan -out plan.tfplan
printf "plan result: %d\n" "$?"
base64 plan.tfplan | head -n 1
The exact value and behaviour depends on the options -- see the documentation for -detailed-exitcode for example where you only get an exit code of 0 if there are no planned changes.
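As a rough sketch of the exit-code behaviour referred to above: with -detailed-exitcode, terraform plan exits 0 for a successful plan with no changes, 1 on error, and 2 for a successful plan with changes. The helper name `classify_plan_exit` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map the exit status of
# `terraform plan -detailed-exitcode` to a readable label.
classify_plan_exit() {
    case "$1" in
        0) echo "no-changes" ;;  # plan succeeded, empty diff
        1) echo "error" ;;       # plan failed
        2) echo "changes" ;;     # plan succeeded, non-empty diff
        *) echo "unknown" ;;     # anything else is unexpected
    esac
}

# Usage (needs terraform and an initialised directory):
#   terraform plan -detailed-exitcode -out plan.tfplan
#   classify_plan_exit "$?"
```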
But the key point is that "no changes" results in an applyable plan that updates the statefile:
benedict@junco:~/test$ terraform apply plan.tfplan
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
benedict@junco:~/test$ cat terraform.tfstate
{
"version": 4,
"terraform_version": "1.7.1",
"serial": 1,
"lineage": "3110dbfe-bf87-e157-63e5-7840f3dc1440",
"outputs": {},
"resources": [],
"check_results": null
}
Additionally, if I did follow the suggestion there from terraform --version, upgraded the terraform binary, and re-planned, I would get a different plan with still no API changes, and applying that plan would result in a new, different state file.
(Aside: I have however just learnt that terraform only checks for the presence of a .tf file; there's no need for any content. Neat!)
I was pretty confident that it does save a plan file if no changes and it returns a zero status code if it's successful.
If you run atlantis apply, you can also see the terraform output for the no-changes plan. I'm on mobile, but there is a PR that we can look at to see the exact code changes that made this possible.
But maybe I'm mistaken. Happy to be corrected.
Also this request is related to
Here is the PR https://github.com/runatlantis/atlantis/pull/3378
I'm unsure what version introduced it since the milestone is missing. Most likely a recent version.
For the record, it is listed as a change in v0.25.0.
Thanks for the link to the PR (and thus the issue that spawned it). Neither seems to have addressed any concerns around external changes or other things that would cause state file changes but not real-world changes?
Looking at #266, this comment on the nature of the Terraform Core Workflow principles I think underlines the distinction between workflows with plans and those with no plan but corrective state.
We could look at merging this issue into #266, though I feel there is still some difference around the idea of Atlantis taking a more concrete action than adding an external job (especially in non-GitHub environments which don't have the concept of required external jobs).
Update on the gitlab front: as Atlantis uses (well, has to use) commit statuses rather than workflow jobs, there are no associated events to bind a webhook to.
Currently, the best way I can find to do this in a self-hosted environment (without making the changes to Atlantis described above) is to watch for the note being added to the MR when atlantis finishes planning, then look up the commit status of the head of the MR.
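The lookup step might be sketched like this, using GitLab's commit statuses endpoint (`GET /projects/:id/repository/commits/:sha/statuses`). The host name, the project ID, and the `GITLAB_TOKEN` and `HEAD_SHA` variables are placeholders, and the helper name is invented for illustration.

```shell
#!/bin/sh
# Sketch only: GITLAB_URL is a placeholder for the self-hosted instance.
GITLAB_URL="${GITLAB_URL:-https://gitlab.example.com}"

# Build the API URL that lists the commit statuses for a given commit.
statuses_url() {
    project_id="$1"  # numeric GitLab project ID
    sha="$2"         # head commit SHA of the merge request
    printf '%s/api/v4/projects/%s/repository/commits/%s/statuses\n' \
        "$GITLAB_URL" "$project_id" "$sha"
}

# Usage, once the "plan finished" note has been seen on the MR:
#   curl --silent --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
#       "$(statuses_url 42 "$HEAD_SHA")"
# then inspect the entry whose "name" is "atlantis/apply".
```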
Describe the user story
As an operations/engineering/SRE team, we would like to keep change via terraform in a controlled workflow, whilst allowing non-changes to proceed without becoming toil.
Consider a situation with over 50 AWS accounts managed via terraform, stored in a self-hosted gitlab, with renovate and atlantis running. Renovate will create a merge request for each provider upgrade for each account, which in 95+% of cases will generate a set of empty terraform plans.
In this case, an approver will have to come along, approve each request, and apply the change (which makes no write API calls to AWS, but will update the state file with the new provider/module version numbers).
In order to reduce fatigue on the approvers, they should only have to look at the merges where a change to the controlled infrastructure is happening. In the no-op case, the tools should be able to resolve the issue autonomously.
Describe the solution you'd like
Atlantis, upon successfully completing all plans for a merge request, knows whether there are any changes, either between reality and state, or reality and config. If there are no such changes, it can undertake to help move the merge along.
The following are workflows I feel make sense.
atlantis:planned, atlantis:no-changes. A separate process can then consume these and other metadata ("merge is from renovate") and perform the actions as part of a different control loop. This information is technically exposed via job statuses.
The minimum viable outcome is any mechanism by which an approval bot can verify that Atlantis has completed up-to-date plans for all modules in a merge request, and that they contain no changes. This does already exist (see conversation below), but some of these options may result in a better user experience.
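To illustrate the label-based option, here is a sketch of how a separate process might consume such labels. The labels are the ones proposed in this request and are not something Atlantis sets today. The function names, the comma-separated label format, and the `renovate-bot` username are likewise assumptions.

```shell
#!/bin/sh
# Hypothetical consumer of the proposed atlantis:* labels.

# Succeeds if the comma-separated label list contains the wanted label.
has_label() {
    labels="$1"
    wanted="$2"
    case ",$labels," in
        *",$wanted,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Decide what the control loop should do with a merge request.
next_action() {
    labels="$1"
    author="$2"
    if has_label "$labels" "atlantis:planned" &&
        has_label "$labels" "atlantis:no-changes" &&
        [ "$author" = "renovate-bot" ]; then
        echo "auto-approve"
    else
        echo "wait-for-human"
    fi
}

# Usage:
#   next_action "dependencies,atlantis:planned,atlantis:no-changes" "renovate-bot"
```

Requiring both labels keeps a merge request that has not been fully planned from being treated as a no-op.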
Describe the drawbacks of your solution
Automatically starting Atlantis actions other than plan is not the current convention (at least in the Gitlab ecosystem, I am less familiar with how Atlantis integrates with other VCS providers).
Label usage is simple in some systems, but may not be consistent across all officially supported providers.
Describe alternatives you've considered
As highlighted by the label-based workflow, most of the actual behaviour here can be achieved by use of a different component, keeping Atlantis itself more focused. However, determining that a given merge is fully planned and has no changes in a consistent manner is the true problem here.