terraform-aws-modules / terraform-aws-atlantis

Terraform module to deploy Atlantis on AWS Fargate 🇺🇦
https://registry.terraform.io/modules/terraform-aws-modules/atlantis/aws
Apache License 2.0

Add high availability using redis for locking #322

Closed by nitrocode 1 year ago

nitrocode commented 2 years ago

Is your request related to a new offering from AWS?

It is not

Is your request related to a problem? Please describe.

I'd like an HA setup so Atlantis can run plans across different projects simultaneously instead of sequentially.

Describe the solution you'd like.

Add an option to create Redis for locking, configure the IAM policy, and pass the corresponding options to Atlantis.

Describe alternatives you've considered.

N/A

Additional context

Related: https://github.com/runatlantis/atlantis/issues/1571

https://www.runatlantis.io/docs/server-configuration.html#redis-host

dynamike commented 2 years ago

Would it be possible to add some more docs for Atlantis on how to properly use Redis for high availability? That would help with adding support for the functionality in this repo.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 10 days

nitrocode commented 1 year ago

unstale

bryantbiggs commented 1 year ago

Do we have a design spec for this yet?

nitrocode commented 1 year ago

Unsure about a design spec.

Basically

  1. Create Redis (ElastiCache)
  2. Set the appropriate Atlantis inputs so it can access the Redis instance
  3. Ensure the security group on ElastiCache allows ingress from the Atlantis pod/ECS service
  4. Spin up more than one Atlantis pod/ECS service

At least, that's how I understand it.
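
A rough sketch of step 2, assuming Atlantis reads its Redis settings from the `ATLANTIS_`-prefixed environment variables that mirror the `--locking-db-type`, `--redis-host`, `--redis-port`, and `--redis-tls-enabled` server flags. The helper name `redis_locking_env` and the example hostname are hypothetical and not part of this module; the output is shaped like the `environment` list of an ECS container definition.

```python
def redis_locking_env(host, port=6379, tls_enabled=False):
    """Build ECS container-definition environment entries for Redis locking.

    Hypothetical helper, not part of this module. The variable names assume
    Atlantis' convention of mapping server flags to ATLANTIS_* variables.
    """
    if not host:
        raise ValueError("host must be a non-empty Redis endpoint")
    settings = {
        "ATLANTIS_LOCKING_DB_TYPE": "redis",
        "ATLANTIS_REDIS_HOST": host,
        "ATLANTIS_REDIS_PORT": str(port),
        "ATLANTIS_REDIS_TLS_ENABLED": "true" if tls_enabled else "false",
    }
    # ECS expects a list of {"name": ..., "value": ...} objects with string values.
    return [{"name": name, "value": value} for name, value in settings.items()]


if __name__ == "__main__":
    # Placeholder endpoint; in practice this would come from the ElastiCache resource.
    for entry in redis_locking_env("atlantis-redis.example.internal", tls_enabled=True):
        print(f'{entry["name"]}={entry["value"]}')
```

This only covers step 2. Creating the ElastiCache instance, opening the security group, and raising the task count would still be handled on the Terraform side.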

andyshinn commented 1 year ago

What do the Redis hosts do right now if they do not do locking? I'd love to scale Atlantis a bit as it can be slow with many projects and when multiple repos plan / apply at the same time. The docs are unclear if it is for running multiple instances or something else.

nitrocode commented 1 year ago

The redis hosts do locking

https://www.runatlantis.io/docs/server-configuration.html#locking-db-type

bryantbiggs commented 1 year ago

Locking seems like the least of the concerns; where do the planned outputs live?

nitrocode commented 1 year ago

Hmm, that's a good point. I'll reach out to the developers about this. Perhaps all of the instances need to share the same volume mount for this to work.

cc @lilincmu @SudoSpartanDan @jamengual

The PR that implemented this feature: https://github.com/runatlantis/atlantis/pull/2491

bryantbiggs commented 1 year ago

Maybe it's just a misnomer - but it looks like the lockingDB is really storing the locks as well as the plans, based on this definition: https://github.com/terraform-aws-modules/terraform-aws-atlantis/issues/322#issuecomment-1446259859

bryantbiggs commented 1 year ago

Cool - I think it looks straightforward, thanks for sharing that PR @nitrocode

We'll need a few bits before we can add this but most are already in progress:

  1. Update the ECS module to include support for service, task definition, and autoscaling. We have most of that already scoped out in this fork https://github.com/clowdhaus/terraform-aws-ecs so we can test and validate it on a number of examples https://github.com/aws-ia/ecs-blueprints/pull/109. Once we get this validated, I'll open a PR to the main project to get those changes reviewed and merged
  2. We'll need an ElastiCache cluster - I started on a module for this a while back but never finished. I can revive that and see about getting it added
  3. Once those pieces are in place we can start carving out the changes here

This is great - I know this has been a long-sought-after feature, so I'm excited to see this get added!

nitrocode commented 1 year ago

To save time, could we skip step 2 and use the https://registry.terraform.io/modules/cloudposse/elasticache-redis/aws/latest module?

bryantbiggs commented 1 year ago

Possibly - the module philosophies of the two groups are quite different, specifically around the use of several nested sub-modules and the use of labels and context objects. We'll see

nitrocode commented 1 year ago

Hmm, on second thought, I do not think that Redis holds the plans, and the feature seems to be a bit wonky. I'm hesitant to push this forward until some of its bugs are fixed.

https://github.com/runatlantis/atlantis/issues/1571#issuecomment-1453776159

Thank you for considering.

dynamike commented 1 year ago

It would also be helpful to have more documentation in the Atlantis codebase around this feature, so folks know when and why they would enable it vs just running it by default.
