Closed amitrana93 closed 2 years ago
So far Atlantis does not have functionality for that, but I have been working on a branch for multi-env. It is not complete, but basically it will allow you to have one repo and multiple atlantis servers.
if you are curious the branch is : https://github.com/runatlantis/atlantis/tree/multiserver
@jamengual we'd love to see something like that implemented. That model would truly support a multi-account environment within a monorepo approach. Perfectly suitable for people like me who use Terragrunt in production.
@jamengual Is it in your plans to keep working on it any time soon?
Many thanks!
you can do it now, but the issue I have is that multiple comments show up in git (one for each atlantis server). We have a wrapper set up to only act if files have changed in certain paths (matching the env the atlantis server is in), so other accounts just leave a NOOP comment. I think the only way to prevent that right now is to have another wrapper for the webhook that eats it if it's not applicable.
yes, my plan is to start working on this again in March, I'm too busy to do it right now.
@kitos9112 @amitrana93 you guys could build atlantis from that branch and use it and give me feedback, it will be pretty useful.
@tinder-tder Could you elaborate a bit on the wrapper setup? Quite curious on how it works
@patrickjahns sure, sorry for the delayed response. It's a simple script that atlantis calls instead of terraform/terragrunt directly. We check the change path against a regexp and if it doesn't match it exits with NOOP (this is a puppet template)
#!/bin/bash
# this script wraps terragrunt to limit what gets applied based on the allowed path value
VALID="<%= @allowed_path -%>"
# count how many times the project dir matches the allowed path regexp (0 or 1)
COUNT=$(echo "$REPO_REL_DIR" | grep -Ec "${VALID}")
#env
hostname
if [ "$COUNT" -eq 0 ]; then
  # not our environment: report a NOOP and exit successfully
  echo "NOOP $1: $REPO_REL_DIR not a match for $VALID"
  exit 0
else
  echo "$1: $REPO_REL_DIR matched for $VALID"
fi
case $1 in
  plan)
    terragrunt plan -no-color -out="$PLANFILE"
    ;;
  apply)
    terragrunt apply -no-color "$PLANFILE"
    ;;
  *)
    echo "unknown command $1"
    exit 1
    ;;
esac
the repo yaml looks like
# atlantis server side repo config
repos:
- id: "<%= @repo_whitelist -%>"
  workflow: terragrunt
workflows:
  terragrunt:
    plan:
      steps:
      - run: tgwrapper.sh plan
    apply:
      steps:
      - run: tgwrapper.sh apply
@tinder-tder is $REPO_REL_DIR an internal Atlantis variable?
@ipeacocks - yes
REPO_REL_DIR - The relative path of the project in the repository. For example if your project is in dir1/dir2/ then this will be set to "dir1/dir2". If your project is at the root this will be ".".
https://www.runatlantis.io/docs/custom-workflows.html#reference
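One thing worth noting about the wrapper's `grep -Ec` check: the pattern matches anywhere in REPO_REL_DIR unless it is anchored. Below is a minimal sketch of the same test in JavaScript, just to show the behaviour. `isAllowedDir` is a hypothetical helper (not part of Atlantis or the wrapper), and the directory names assume a layout like `consul/production`.

```javascript
// Hypothetical helper mirroring the wrapper's `grep -Ec "$VALID"` check.
// `allowedPattern` plays the role of the Puppet-provided @allowed_path value.
function isAllowedDir(repoRelDir, allowedPattern) {
  // Like grep -E, RegExp.test matches anywhere in the string
  // unless the pattern itself is anchored.
  return new RegExp(allowedPattern).test(repoRelDir);
}

console.log(isAllowedDir('consul/production', 'production'));           // true
console.log(isAllowedDir('consul/qa', 'production'));                   // false
// An unanchored pattern also matches unrelated paths containing the word:
console.log(isAllowedDir('docs/production-notes', 'production'));       // true
// Anchoring on a path segment avoids that:
console.log(isAllowedDir('docs/production-notes', '(^|/)production$')); // false
```

So if environment names can appear inside other directory names, an anchored pattern is probably the safer value for the allowed path.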
I got this set up, and just to be clear: even with a wrapper in place, you still get an empty comment in the PR.
I have a prod, pre-prod and non-prod atlantis running, and that's what happened when the script "no-op" exits while using atlantis to update atlantis :-)
We also would very much like support for multiple atlantis servers. We have 1 repo with 4 environments, and each of those environments is completely isolated. We want to point all production terraform at the production atlantis server.
consul
\_ sand
\_ qa
\_ integration
\_ production
vault
\_ sand
\_ qa
\_ integration
\_ production
Using the wrapper script suggested above does allow for multiple atlantis servers, however without the ability to hide empty runs, the discussion quickly becomes unreadable.
I managed to make it work without much problems using existing Atlantis options:
env {
  name  = "ATLANTIS_SILENCE_NO_PROJECTS"
  value = true
}
env {
  name = "ATLANTIS_REPO_CONFIG_JSON"
  value = jsonencode(
    {
      "repos" : [
        {
          "id" : "github.com/myorg/my-repo",
          "pre_workflow_hooks" : [{
            "run" : "cp atlantis-${each.value.env}.yaml atlantis.yaml"
          }]
        }
      ]
    }
  )
}
# atlantis-dev.yaml
version: 3
projects:
- name: dev
  dir: env/dev
  autoplan:
    when_modified: ["env/dev/modules/**/*.tf", "*.tf", "*.yaml"]

# atlantis-prd.yaml
version: 3
projects:
- name: prd
  dir: env/prd
  autoplan:
    when_modified: ["env/prd/modules/**/*.tf", "*.tf", "*.yaml"]
The only thing we can't use now is the automergeable feature (because it could merge the PR if some plans apply and others don't).
The solution I implemented is similar to what @Leooo did:
terragrunt-atlantis-config generate --automerge=true --autoplan --parallel=true --create-workspace --create-project-name --output ./atlantis.yaml --filter ${ENV_NAME}
The problem with this is when changes are made to multiple environments under one PR, and auto-merge is configured. One atlantis can close the PR before the others have started/finished.
@FlorianNeacsu exactly. I'm looking for a way to add one github status check per atlantis instance atm (so have atlantis/apply-dev and atlantis/apply-prod names for status checks, say), so that the PR can't be merged until all checks / jobs pass. Not sure where this code sits on the Atlantis side for now.
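To make the "one status check per instance" idea concrete, here is a rough sketch of the gating logic only, not of anything Atlantis does today. It assumes each status is an object with a `context` and a `state` (the shape used by the GitHub commit status API), with one entry per context. `allInstancesApplied` is a hypothetical name.

```javascript
// Returns true only when every required per-instance check reports success.
// Assumption: `statuses` holds at most one entry per context (for example the
// combined status of the PR head commit).
function allInstancesApplied(statuses, requiredContexts) {
  const stateByContext = new Map();
  for (const status of statuses) {
    stateByContext.set(status.context, status.state);
  }
  // A context that is missing entirely counts as "not applied yet".
  return requiredContexts.every((ctx) => stateByContext.get(ctx) === 'success');
}

const required = ['atlantis/apply-dev', 'atlantis/apply-prod'];

console.log(allInstancesApplied([
  { context: 'atlantis/apply-dev', state: 'success' },
  { context: 'atlantis/apply-prod', state: 'pending' },
], required)); // false: prod has not finished

console.log(allInstancesApplied([
  { context: 'atlantis/apply-dev', state: 'success' },
  { context: 'atlantis/apply-prod', state: 'success' },
], required)); // true
```

Marking both contexts as required status checks in branch protection should give the same effect on the GitHub side without custom code.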
Short term mitigation is to use atlantis apply --auto-merge-disabled until all plans apply successfully (maybe alias it on atlantis apply), then use a final atlantis apply for the final merge. meh.
EDIT:
That should solve everything (one more time, Atlantis has all the options we need, just need to find it in the documentation). Testing it now and if it works I will update the process above.
I built a custom Atlantis Proxy that handles confirming all GHE status checks are green before making the PR mergeable. My GHE hooks point to the proxy. The proxy forwards the request to the correct atlantis server. We have three AWS accounts basically tied to one GHE repo, where the repo has a folder for dev, staging, and production. The proxy adds status checks to the PRs and handles cases where things like the README in the repo change and atlantis shouldn't get involved. This repo is basically a hub with terragrunt files that point to other repos where the TF actually lives.
@jasonrberk happy to get your code - although I was reaching in the above for a simple solution using standard Atlantis options (no terragrunt etc.), and it's pretty close now.
I can't share the code base as it's not open source, but I can give a general overview
in my container running atlantis, I have a pre workflow and a custom workflow that looks like this (ghe is a node script that takes args):
pre_workflow_hooks:
  - run: ghe setCommitStatus ${HEAD_COMMIT} success

plan)
  ghe configureBranchProtection
  ghe dismissApprovals ${PULL_NUM}
  ghe setCommitStatus ${HEAD_COMMIT} pending
  terragrunt plan -out=$PLANFILE $DESTROY_PARAMETER | sed -E 's/^( *)([-+~]|-\/\+)/\2\1/;s/^~ /! /'
  ;;
https://docs.github.com/en/enterprise-server@3.0/rest/reference/repos#create-a-commit-status
the idea is that anytime atlantis plans something, it dismisses the approvals so the plan needs to be re approved and it sets a commit status that prevents a human from clicking the merge button after an approval, in case of a failed plan (ie: rubber stamping w/o validating the plan)
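For anyone wanting to build something similar: the `ghe` script itself is not shared, so the following is only a sketch of what a setCommitStatus-style call could send, based on the "create a commit status" endpoint linked above. `buildCommitStatusRequest`, the `atlantis/plan` context name and the example host are assumptions for illustration; authentication headers are left out.

```javascript
// Hypothetical helper: builds (but does not send) a "create a commit status" request.
// For GitHub Enterprise Server the API base URL is typically https://HOSTNAME/api/v3.
function buildCommitStatusRequest({ baseUrl, owner, repo, sha, state, context }) {
  // The endpoint accepts only these four states.
  const allowedStates = ['error', 'failure', 'pending', 'success'];
  if (!allowedStates.includes(state)) {
    throw new Error(`invalid commit status state: ${state}`);
  }
  return {
    method: 'POST',
    url: `${baseUrl}/repos/${owner}/${repo}/statuses/${sha}`,
    body: JSON.stringify({ state, context }),
  };
}

// Example: mark the head commit as pending while a plan is being reviewed.
const request = buildCommitStatusRequest({
  baseUrl: 'https://ghe.example.com/api/v3', // placeholder host
  owner: 'myorg',
  repo: 'my-repo',
  sha: 'abc123',
  state: 'pending',
  context: 'atlantis/plan', // assumed context name
});
console.log(request.url);
```

The returned object can be handed to whatever HTTP client you use, with a token that is allowed to write commit statuses.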
in order to merge your PR, you have to comment atlantis apply. The proxy will see that comment and clean the status check.
(Yes, a human could comment atlantis apply and manually merge before atlantis actually applies.....but we've found that smart people don't do this..... this was more to block the Pavlovian response to click the big green merge button in GHE)
the gist of the proxy is:
import express from 'express';
import httpContext from 'express-http-context';
import { PORT } from './utils/config.js';
import * as middleware from './middleware.js';
const app = express();
app.use('/events',
  express.json(),
  middleware.validateRequest, // validate the request came from GHE
  httpContext.middleware, // https://www.npmjs.com/package/express-http-context
  middleware.storeRequestMetadata, // so we can associate the logs with a specific GHE event
  middleware.filterComments, // ignore any comment on the PR that doesn't start with 'atlantis'
  middleware.extractPullNumber, // normalize the location of the pull number on the req object
  middleware.addPrFilesToRequest, // get all the files in the PR so subsequent filters can make decisions
  middleware.handleNonTfPulls, // escape hatch for PRs that don't change TF files
  // from this point forward, at least _some_ of the files in the PR are actionable (ie .hcl) files
  middleware.handleCrossAccountPulls, // filter out cross account / non-applicable PRs
  middleware.forwardToAtlantis, // finally, forward the request to atlantis and return the response
);
app.get('/health', (req, res) => {
  res.sendStatus(200);
});
app.listen(PORT, () => console.log(`starting Atlantis proxy on port ${PORT}`));
clear as mud?
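Since the middleware bodies are not shown, here is a guess at what a cross-account filter step could look like, purely as a sketch. It assumes an earlier step stored the PR's changed file paths on `req.prFiles` and that each proxy knows the top-level folder its atlantis server owns. `ENV_DIR`, `touchesEnv` and `skipOtherEnvironments` are made-up names, not the real implementation.

```javascript
// Assumption: the top-level repo folder this proxy's atlantis server is responsible for.
const ENV_DIR = 'production';

// True if at least one changed file lives under the given environment folder.
function touchesEnv(files, envDir) {
  return files.some((file) => file.startsWith(`${envDir}/`));
}

// Express-style middleware (req, res, next).
// Assumes req.prFiles is an array of changed file paths set by an earlier step.
function skipOtherEnvironments(req, res, next) {
  if (touchesEnv(req.prFiles, ENV_DIR)) {
    return next(); // applicable: let the request continue towards atlantis
  }
  // Not applicable: acknowledge the webhook so GHE does not retry,
  // but never forward it, so this atlantis instance stays silent.
  res.status(200).send('ignored: no changes for this environment');
}
```

Answering the webhook in the proxy is what keeps the other instances from leaving NOOP comments on the PR.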
I'm sure my use case is very specific to how my org does things in AWS and how we have TF / TG and Terragrunt Frontend configured.
as with anything, I'm sure there's room for improvement, but this is working for us in that it:
dev, we don't want any comments from staging or production atlantis instances)
this is possible by doing https://github.com/runatlantis/atlantis/issues/1345#issuecomment-1002625950
please report back otherwise
@Leooo: I have two instances of atlantis (dev and prod) connected to one repo. I followed your approach to set up the pre-hook, but I get an error message on the prod instance saying the workflow dev is not defined on prod atlantis when I run atlantis plan -p dev. How do you solve this issue?
@lvthao not sure, it looks like your dev file hasn't been parsed. Have you declared one like the below in your repo, together with the ATLANTIS_REPO_CONFIG_JSON env var? Note that the name of the plan in the file must be dev. Probably the Atlantis logs will give you more details. Can you also check whether a simple atlantis plan fails with the same error?
# atlantis-dev.yaml
version: 3
projects:
- name: dev
  dir: env/dev
  autoplan:
    when_modified: ["env/dev/modules/**/*.tf", "*.tf", "*.yaml"]
The atlantis yaml file was parsed. When I ran atlantis plan -p dev, both the dev and prod instances ran this command. On prod we only define atlantis-prod.yaml and there is no project for dev, so I got the error from the prod atlantis instance. On the prod atlantis instance, I configured /etc/atlantis/repos.yaml:
---
repos:
- id: "xxx"
  apply_requirements: ["approved", "mergeable", "undiverged"]
  allowed_overrides: ["workflow"]
  allow_custom_workflows: true
  allowed_workflows: [dev,prod]
  pre_workflow_hooks:
  - run: cp atlantis-prod.yaml atlantis.yaml
workflows:
  prod:
    plan: xxx
    run: xxx
atlantis prod:
version: 3
projects:
- name: prod
  dir: .
  workspace: prod
  workflow: prod
@lvthao it looks like you have a different setup than the one I described. On my side I have separate atlantis-dev.yaml, atlantis-prod.yaml files in my repo (and no atlantis.yaml), and I dynamically create an atlantis.yaml file before processing in my pre_workflow_hook by copying one of the atlantis-xx.yaml files (depending on the env. being run)
I don't have the file atlantis.yaml in the repo. I copy this file via the pre-hook:
- run: cp atlantis-prod.yaml atlantis.yaml
not sure then tbh. The atlantis logs should help you zoom into the error
@Leooo did you define the custom workflow on your atlantis repos.yaml ?
@lvthao I don't have any custom workflows / workflows object defined either in the server or in the repos. This is the content of atlantis-prd.yaml for example, in the repo:
version: 3
projects:
- name: prd
  dir: env/prd
  terraform_version: v1.3.2
  autoplan:
    when_modified: [ "../../modules/**/*.tf", "../../modules/**/*.yaml", "*.tf", "../../*.yaml"]
Hi guys, we have the following scenario: 2 AWS accounts and 1 repo for the DEV, QA and PROD environments.
One AWS account is for DEV. The second AWS account is for QA and PROD.
Right now we manage a single repo for deployment to DEV, QA and PROD, and we want to continue with that and connect multiple Atlantis servers to that single repo.
Can you suggest another solution for managing multiple Atlantis servers for multiple accounts without using IAM assume roles, as we don't want to use AWS multi-account?