aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

[pipelines] Change to a single-source, single-build pipeline, deploy+approve model #10872

Closed: rix0rrr closed this issue 3 years ago

rix0rrr commented 4 years ago

Realign the API model of the library more towards what's described in Automating Safe, Hands-off deployments.

[pipeline diagram]

This means we explicitly give up a bunch of freedom that CodePipeline allows, in order to make a simpler application deployment model. We'll reject adding multiple source stages, multiple builds, and many types of actions in weird locations people might want to insert into the pipeline; if they want to do that, they should drop down to building a complete CodePipeline pipeline.

This means we do the following things:


I think the "action" model (with hooks for users to insert approvals) needs to look somewhat like this:

[diagram]

The "application approvals" should probably be configurable in direct style, it's okay if the changeset approvals require implementing an interface of some sort (not ideal, but acceptable).


This might entail making changes to the CodePipeline L2 library, but only in order to make implementing CDK Pipelines easier, NOT to improve the API of the CodePipeline library. Some candidates:


I'm looking for compelling use cases that would require us to support multiple sources and/or multiple builds. Please supply them in this ticket if you have them.

On the deployment front, CDK Pipelines will never do anything more than deploy CDK apps using CloudFormation and run a limited set of validations.

straygar commented 3 years ago

Hey!

Great proposal! Coming from the Amazon world myself, I would love to see CDK Pipelines come closer to this model. :)

One use case I have for multiple sources/builds is deploying a CDK app with a Java Lambda, given:

I would find it nice to have separate source/build triggers for each of those packages in the pipeline, unless there is a cleaner way to do so. The B repo build in particular could be a CodeBuild project that produces a JAR. This JAR artifact can then be replicated to an S3 bucket in each stage/account the pipeline deploys to.

Maybe there's a more idiomatic CDK solution to replicate a JAR artifact across multiple stages and using that as the Lambda code source, but I was unable to find a good example of this.

hoegertn commented 3 years ago

Is there a special reason you are having two repos? IMHO the CDK way is to have the Java Lambda inside the same repo and reference it as an asset. It will then be built once and the JAR file will be used in all stages.
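For illustration, a minimal sketch of that approach (the directory layout, handler name and Maven image are assumptions, not something from this thread): the Java code lives next to the CDK app and is bundled into a JAR at synth time, so the resulting asset is deployed unchanged to every stage.

  import * as path from 'path';
  import { Stack, StackProps, DockerImage } from 'aws-cdk-lib';
  import * as lambda from 'aws-cdk-lib/aws-lambda';
  import { Construct } from 'constructs';

  export class JavaLambdaStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
      super(scope, id, props);

      new lambda.Function(this, 'Handler', {
        runtime: lambda.Runtime.JAVA_11,
        handler: 'com.example.Handler::handleRequest',
        // The Java source lives in the same repo; it is built once during
        // synth and the resulting JAR asset is reused in every stage.
        code: lambda.Code.fromAsset(path.join(__dirname, '../lambda-java'), {
          bundling: {
            image: DockerImage.fromRegistry('maven:3.8-openjdk-11'),
            command: [
              'bash', '-c',
              'mvn -q package && cp target/handler.jar /asset-output/',
            ],
          },
        }),
      });
    }
  }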

straygar commented 3 years ago

@hoegertn "The CDK way" - this is the first time I've heard of this, to be honest. Are there docs defining this as a best practice?

But I guess my reasoning would be separation of concerns - it'd be great to avoid dumping code of different languages, environments and build systems (Java, Scala, Python, TypeScript) in the same codebase. That sounds a bit messy to me (just a personal preference), but maybe I'm not familiar with some tooling that would make it nicer. (Part of the reason is also historic - this Lambda repo was never part of a CDK application or any CI/CD process, which I'm trying to change.)

So your preferred project structure would be:

In this case, I suppose building the package would do everything - produce any JARs, Python .eggs and .zips - and then CDK would pick those artifacts up as assets and deploy them?

hoegertn commented 3 years ago

I have written a blog post with a sample: https://taimos.de/blog/build-a-basic-serverless-application-using-aws-cdk

I will try to write something more into this ticket later today.

straygar commented 3 years ago

@hoegertn Thanks! I took a brief look. It seems that, like most CDK examples, it assumes my Lambda will be in TypeScript, which is much more straightforward to build, package and deploy from a TypeScript CDK project than a Java Lambda is.

Alternatively, I guess you could have multiple projects in the same repo, if you wanted to:

And the infra package would build the other 2 projects prior to deploying. I don't know, to me it feels more natural to allow replicating a CodePipeline artifact to multiple stages. :/

In any case, looking forward to any tips you might have. :)

corymhall commented 3 years ago

@straygar this blog post goes into more detail around how you can build, package, and deploy with the CDK. It uses Golang as an example, but the same concepts can be applied to other languages.

https://aws.amazon.com/blogs/devops/building-apps-with-aws-cdk/

nbaillie commented 3 years ago

@rix0rrr , @hoegertn , Good to see the outline of the direction of travel above. Pipelines is a great addition to CDK.

Currently we have a pattern that includes some DB schema migrations before, and some testing after, the deployment of stages.

Would it be the intention that, in this mode of working, migrations could occur in the "'additional actions' between Prepare and Execute", or would they be before and separate from the AppDeployment within the stage?

For the integration testing we are using Jest, which runs after the stage is deployed, so I'm hoping that this will still be supported.

Right now we are using ShellScriptAction for both migration and testing; would there be a place for something similar in this new overall approach? It was just this part that made me want to ask: "We'll reject ... many types of custom actions people might want to insert into the pipeline"

So, support something per stage like: [Migrate -> (Prepare -> Deploy) -> IntegrationTest] or: [(Prepare -> Migrate -> Deploy) -> IntegrationTest]

The approval ideas described will address other challenges we have around visibility and control.

hoegertn commented 3 years ago

Is there a special reason you are not doing the DB migrations within your app deployment by using CFN custom resources for example?
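For readers wondering what that might look like, here is a rough sketch using the triggers module, which wraps a Lambda in a custom resource so the migration runs as part of the same CloudFormation deployment (the database setup, names and migration handler are all illustrative, not from this thread):

  import * as path from 'path';
  import { Stack, StackProps, Duration } from 'aws-cdk-lib';
  import * as ec2 from 'aws-cdk-lib/aws-ec2';
  import * as lambda from 'aws-cdk-lib/aws-lambda';
  import * as rds from 'aws-cdk-lib/aws-rds';
  import * as triggers from 'aws-cdk-lib/triggers';
  import { Construct } from 'constructs';

  export class ServiceStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
      super(scope, id, props);

      const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2 });
      const db = new rds.DatabaseInstance(this, 'Db', {
        engine: rds.DatabaseInstanceEngine.postgres({
          version: rds.PostgresEngineVersion.VER_15,
        }),
        vpc,
      });

      // Runs the migration Lambda during `cdk deploy`, after the database
      // exists but inside the same stack, so deploy and rollback stay one step.
      new triggers.TriggerFunction(this, 'RunMigrations', {
        runtime: lambda.Runtime.NODEJS_18_X,
        handler: 'index.handler',
        timeout: Duration.minutes(5),
        code: lambda.Code.fromAsset(path.join(__dirname, 'migrations')),
        executeAfter: [db],
      });
    }
  }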

nbaillie commented 3 years ago

@hoegertn, not so much a special reason, but right now we use a yarn script to run a TypeORM migration and trigger it via a shell command in ShellScriptAction. Ideally it would be good to keep this. However... one of the main reasons I asked is to try and ensure that we are looking at all the options, and at options that will have synergy with the project direction. It would be good to see what the overall idea for this type of thing looks like.

rix0rrr commented 3 years ago

@nbaillie to be clear:

"We'll reject ... many types of custom actions people might want to insert into the pipeline"

This was intended to be about locations in the pipeline, not specifically about the action types themselves. Although we won't offer out-of-the-box support for all types of actions, we won't prevent you from adding custom ones. I've updated the comment to reflect that.

But they will have to be in one of a couple of predefined positions:

Right now for both migration and testing we are using ShellScriptAction, would there be a place for something similar in this new overall approach?

Something like ShellScriptAction will definitely still exist afterwards.
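For anyone landing on this thread later: in the pipelines module as it exists today, per-stage hooks look roughly like the sketch below (the repository, commands and step names are illustrative).

  import { Stack, StackProps, Stage } from 'aws-cdk-lib';
  import * as pipelines from 'aws-cdk-lib/pipelines';
  import { Construct } from 'constructs';

  export class PipelineStack extends Stack {
    constructor(scope: Construct, id: string, appStage: Stage, props?: StackProps) {
      super(scope, id, props);

      const pipeline = new pipelines.CodePipeline(this, 'Pipeline', {
        synth: new pipelines.ShellStep('Synth', {
          input: pipelines.CodePipelineSource.gitHub('my-org/my-repo', 'main'),
          commands: ['yarn install --frozen-lockfile', 'yarn build', 'npx cdk synth'],
        }),
      });

      // appStage is whatever Stage deploys the application's stacks.
      pipeline.addStage(appStage, {
        // Runs before the stage's stacks are deployed, e.g. a schema migration.
        pre: [new pipelines.ShellStep('Migrate', {
          commands: ['yarn typeorm migration:run'],
        })],
        // Runs after the stage's stacks are deployed, e.g. integration tests.
        post: [new pipelines.ShellStep('IntegrationTest', {
          commands: ['yarn jest --config jest.integ.config.js'],
        })],
      });
    }
  }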

mrpackethead commented 3 years ago

"I'm looking for compelling use cases that would require us to support multiple sources and/or multiple builds. Please supply them in this ticket if you have them."

Sorry I'm coming to the party rather late. This description is copied from a Slack chat, sorry if it's overly wordy.

I'm working on a project where I'm using CDK Pipelines to do a multi-account deployment for dev/test/prod. As part of the stack, I'm deploying containers on ECS. I have to be able to support container image building, and placement into ECR, when a GitHub repo is updated. There are multiple images, each with its own GitHub repo. When any of the images are updated through their respective CodeBuild projects, I need to do a blue/green redeployment of that image through the various stages.

As far as I could tell, CDK Pipelines don't support multiple sources (please let me know if they do!), so I had to find a way around this. I thought the solution I came up with to work around this limitation was interesting, and I hadn't seen it used before (it may well have been, of course).

What I did was to add a few extra lines into the buildspec that is used for the image build.

  post_build:
    commands:
      - echo Build completed on $(date)
      - echo Pushing the Docker image...
      - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
      - aws ssm put-parameter --name "$INFRA_PIPELINE/$IMAGE_NAME/CurrentTag" --type "String" --value "$IMAGE_TAG" --overwrite
      - aws codepipeline start-pipeline-execution --name $INFRA_PIPELINE

I'm putting the image tag (which is a commit ID hash) in an SSM parameter, and then having the build kick off the pipeline execution. This in effect gets around the limitation of only having a single source for the CDK pipeline.
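The consuming side of this workaround could look something like the following sketch; the parameter and repository names mirror the buildspec above but are otherwise illustrative. The tag is read back through CloudFormation at deploy time, so every pipeline execution picks up the value the image build last wrote.

  import { Stack, StackProps } from 'aws-cdk-lib';
  import * as ecr from 'aws-cdk-lib/aws-ecr';
  import * as ecs from 'aws-cdk-lib/aws-ecs';
  import * as ssm from 'aws-cdk-lib/aws-ssm';
  import { Construct } from 'constructs';

  export class AppImageStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
      super(scope, id, props);

      const repo = ecr.Repository.fromRepositoryName(this, 'ImageRepo', 'my-image-repo');

      // Resolved by CloudFormation at deploy time, so each pipeline execution
      // picks up whatever tag the image build last wrote with `put-parameter`.
      const tag = ssm.StringParameter.valueForStringParameter(
        this, '/my-infra-pipeline/my-image/CurrentTag');

      const taskDef = new ecs.FargateTaskDefinition(this, 'TaskDef');
      taskDef.addContainer('app', {
        image: ecs.ContainerImage.fromEcrRepository(repo, tag),
      });
    }
  }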

The code and Dockerfiles for the various containers come from different repos because the code/repos are owned by different business units, and in one case a different organization entirely. They do allow us to read the repos and to pull from them on commits (we have created CodeStar connections to them). While it would be nice and simple for all the assets for the project to live in a single place, it's never going to happen. This is the first time I've needed to use a 'multi-source' pipeline (this is my sixth project now with CDK Pipelines). I'm not sure this is a corner case either, though I can only see the world from my world.

mrpackethead commented 3 years ago

"This means we explicitly give up a bunch of freedom that CodePipeline allows, in order to make a simpler application deployment model. We'll reject adding multiple source stages, multiple builds, and many types of actions in weird locations people might want to insert into the pipeline; if they want to do that, they should drop down to building a complete CodePipeline pipeline."

I read @clareliguori's blog post and couldn't find anywhere in that article that specifically said there should only be a single source?

Edit: Asked Clare what she thinks

clareliguori commented 3 years ago

Hey all! Thanks for pinging me @mrpackethead. In attempting to model how we do pipelines at Amazon in CDK, I think the number of source and build actions is less important than the order of stages in the pipeline and how the deployment actions are laid out.

Amazon pipelines have one single source stage (as seen in the diagram above), but that source stage typically brings in multiple sources (as seen here). For example, pretty much all pipelines internally at least bring in both the 'primary' source code repo and that repo's dependency libraries in the source stage. A simple example to model in CDK would be bringing in the 'primary' source code (GitHub source action in CodePipeline terms) that will be built and deployed, plus a Docker base image (ECR source action in CodePipeline). Last time I checked, CodePipeline pipelines only have a single source stage for all source actions, so that already matches how we model pipelines internally.
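In CodePipeline L2 terms, that "one source stage, several source actions" shape could be sketched roughly like this (the connection ARN, repository and project names are placeholders):

  import { Stack, StackProps } from 'aws-cdk-lib';
  import * as codebuild from 'aws-cdk-lib/aws-codebuild';
  import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';
  import * as actions from 'aws-cdk-lib/aws-codepipeline-actions';
  import * as ecr from 'aws-cdk-lib/aws-ecr';
  import { Construct } from 'constructs';

  export class TwoSourcePipelineStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
      super(scope, id, props);

      const sourceOutput = new codepipeline.Artifact('Source');
      const baseImageOutput = new codepipeline.Artifact('BaseImage');
      const baseImageRepo = ecr.Repository.fromRepositoryName(this, 'BaseImages', 'base-images');

      new codepipeline.Pipeline(this, 'Pipeline', {
        stages: [
          {
            // One source *stage*, multiple source *actions*.
            stageName: 'Source',
            actions: [
              new actions.CodeStarConnectionsSourceAction({
                actionName: 'AppCode',
                owner: 'my-org',
                repo: 'my-service',
                branch: 'main',
                connectionArn: 'arn:aws:codestar-connections:us-east-1:111111111111:connection/abc',
                output: sourceOutput,
              }),
              new actions.EcrSourceAction({
                actionName: 'BaseImage',
                repository: baseImageRepo,
                output: baseImageOutput,
              }),
            ],
          },
          {
            // The build consumes both sources before any deployment happens.
            stageName: 'Build',
            actions: [
              new actions.CodeBuildAction({
                actionName: 'Build',
                project: new codebuild.PipelineProject(this, 'BuildProject'),
                input: sourceOutput,
                extraInputs: [baseImageOutput],
              }),
            ],
          },
          // Deploy stages for each wave/account would follow.
        ],
      });
    }
  }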

After the source stage, we do all the build steps. Using the simple example I gave above, this would mean compiling the source code and packaging it into a Docker image that is pushed into ECR. Since CodePipeline already does the work for you of passing immutable artifacts between the build actions, I'm not especially opinionated about this being a single build action/stage vs split across multiple build actions/stages. Some build steps like replicating the docker image across ECR repos in multiple regions/accounts for deployment are useful to do in individual build actions (one per region/account). The area where we tend to be opinionated internally is simply doing ALL the build steps before ANY of the deployment steps, such that you have a consistent set of artifacts that will be deployed later in the pipeline.

When it comes to deployment stages, there is generally only one deployment action per AZ/region/account/any-other-unit in a wave's stage in an internal pipeline. The key here for us is whether the operator can rollback a deployment in a single step. To use the example of DB migration scripts from @nbaillie above, in a single internal pipeline, we wouldn't run a database migration script in a separate deployment action from deploying the microservice. Then we can't rollback in a single step: the operator has to roll them back in a specific order manually, which can delay recovery and introduces human error. In that case, we would either a) split database migrations into a separate pipeline from the microservice's pipeline, or b) combine them into a single deployment workflow action that deploys and rolls back in the correct order. We have an internal system for doing that, but @hoegertn's suggestion of using a single CFN stack for both DB migration and microservice deployment is functionally equivalent in this case (using resource dependency relationships in the stack, you would ensure CFN will deploy and roll back in the correct order).

Hope that helps!

mrpackethead commented 3 years ago

Thanks for the insight @clareliguori , thats really helpful.

@rix0rrr when you said "We'll reject adding multiple source stages" - for clarity, did this mean a single source as well, or would this be a single source stage with one or more sources?

It's quite possible I've misunderstood your intent.

twitu commented 3 years ago

Firstly CDK pipelines are awesome and I have really enjoyed using them. I want to add my thoughts on multi-source builds and why it will be a good addition.

A few examples have already been given above regarding polyglot projects. I also frequently encounter separate repos for infrastructure and application code in enterprises. There is a separation of responsibility between development and infrastructure/operations and permissions are also created along these lines.

Multi-source builds would be a great batteries-included solution in such cases. The alternatives are workable but fall short of being seamless -

The pace of development and deployment is a little slower in such cases, since there is a hand-off between the development and infrastructure teams. By having multi-source builds, the pipelines module will be flexible and unopinionated, allowing all kinds of teams to use it.

As a reverse question, what is the downside of supporting multiple sources in the pipeline?

rix0rrr commented 3 years ago

As a reverse question, what is the downside of supporting multiple sources in the pipeline?

Two downsides spring to mind:

To my mind, running cdk deploy and deploying through the pipeline should do the same thing. Since the source layout will be very different between a CodeBuild job and a local disk checkout (or at least cannot be guaranteed to be the same), build scripts will be more complicated or cannot even be written to work on both.

This is potentially a stronger one: the output of every dependency should always be a construct that gets published to a package repository. Every published revision has a version number.

Then, in a different package, the application that deploys into the pipeline pulls in that dependency by its version number. An upgrade is a bump of that version number as a commit, a rollback is a revert of that commit.

A single commit in a single repository always completely defines the software that is getting deployed. Contrast this with having multiple packages: there is really no one commit you can point to and say "install and build that one to get an exact copy of the software that is now running in prod".


Having written this out, I realize both of these come down to reproducibility.

At Amazon internally we have a different mechanism to ensure reproducible point-in-time builds across multiple packages, but since that mechanism doesn't exist outside our own build system, doing this would give us the same guarantees.

bodokaiser commented 3 years ago

@rix0rrr

This issue appears to me to be a severe limitation, and it pops up frequently; see, for example, here and here.

  1. What is the outlook on the issue?
  2. Are there any updates?
  3. What are the suggested workarounds?

I was thinking about creating an extra git repository that includes all dependencies (cdk, go-lambda, ...) as git submodules, which would make it possible to have a single source and versioning. Unfortunately, recursive cloning is not yet supported in AWS CodePipeline.

rix0rrr commented 3 years ago

The suggested workaround is to use separate pipelines to publish your additional artifacts to package managers (or ECR, or some other artifact store), where they will get a version number, and then refer to the version number from your source.

That way, there is a single commit in a single repository that represents your entire application, it's easy to know which commit caused a particular change, and that change can easily be rolled back.
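Concretely, the "refer to the version number from your source" part can be as small as a pinned constant in the CDK app. A sketch, assuming the JAR is published to an S3 artifact bucket by a separate pipeline (bucket, key and version are illustrative):

  import { Stack, StackProps } from 'aws-cdk-lib';
  import * as lambda from 'aws-cdk-lib/aws-lambda';
  import * as s3 from 'aws-cdk-lib/aws-s3';
  import { Construct } from 'constructs';

  // Bumping this constant in a normal commit/PR is the upgrade;
  // reverting that commit is the rollback.
  const HANDLER_VERSION = '1.4.2';

  export class HandlerStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
      super(scope, id, props);

      const artifacts = s3.Bucket.fromBucketName(this, 'Artifacts', 'my-artifact-bucket');

      new lambda.Function(this, 'Handler', {
        runtime: lambda.Runtime.JAVA_11,
        handler: 'com.example.Handler::handleRequest',
        // The JAR is published to S3 by a separate pipeline; the CDK repo
        // only records which published version to deploy.
        code: lambda.Code.fromBucket(artifacts, `my-handler/handler-${HANDLER_VERSION}.jar`),
      });
    }
  }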

gshpychka commented 3 years ago

@rix0rrr how can this be automated so that a dev in the asset repo can push a change and have the infrastructure pipeline (in a single repo) get updated to the new asset version?

bodokaiser commented 3 years ago

@gshpychka I guess you would update your internal package version in your cdk repository. In that sense, your cdk repository is something like a yarn.lock, as it specifies which of your internal versions work together.

So your workflow would not be

  1. Push commits to an internal git repository
  2. CdkPipeline is triggered and deploys changes

but rather

  1. Push commits to an internal git repository
  2. CodePipeline builds package and publishes to, e.g., S3
  3. Update version of your internal package in the cdk repository
  4. CdkPipeline is triggered and redeploys, e.g., your LambdaStack, where the code now points to the newly published artifact in S3.

Indeed, the approach guarantees backward compatibility as your older package versions are always available and the code in your cdk repository specifies compatibility among these internal package versions.

What you give up is a "deploy & test" approach. Publishing and deploying a new internal version of some module comes with overhead, so you would rather have some confidence that your new internal version works as expected.

gshpychka commented 3 years ago

How would I go about automating that, though?

sisyphushappy commented 2 years ago

How would I go about automating that, though?

I figured out a hacky way to do this for my service, which deploys an application to ECS Fargate. The trick is to have an AWS custom resource in your application asset pipeline that triggers the infra pipeline by automatically updating the infra pipeline's remote repository. Specifically, you can update a file in the infra pipeline repo with the asset hash of the app's DockerImageAsset.

Your app source code repository should contain 3 things:

  1. Application source code
  2. Dockerfile
  3. A cdk subdirectory

The cdk subdirectory should contain a simple code pipeline that deploys a single stack (DockerStack). The DockerStack builds your application source code in a DockerImageAsset, copies the DockerImageAsset to an ECR repository, and triggers a lambda function that updates a file in your cdk infrastructure git repo with the DockerImageAsset's asset hash. This will trigger your cdk infrastructure code pipeline, which can deploy an ECS container image from the ECR repository using the asset hash.

I created two gists with the DockerStack and the lambda source code that updates the remote GitHub repository (using GitPython):

Please comment if you can think of a more idiomatic way to do this; however, it does work!
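Not the author's gists, but the DockerStack half of this idea might be sketched roughly as follows; the updater Lambda, paths and names are placeholders, and the ECR replication step is omitted.

  import * as path from 'path';
  import { Stack, StackProps } from 'aws-cdk-lib';
  import * as ecr_assets from 'aws-cdk-lib/aws-ecr-assets';
  import * as lambda from 'aws-cdk-lib/aws-lambda';
  import * as cr from 'aws-cdk-lib/custom-resources';
  import { Construct } from 'constructs';

  export class DockerStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
      super(scope, id, props);

      // Build the application image from the Dockerfile in this repo.
      const image = new ecr_assets.DockerImageAsset(this, 'AppImage', {
        directory: path.join(__dirname, '../../'),
      });

      // Placeholder for the GitPython-based Lambda that commits the new asset
      // hash into the infrastructure repo (the trigger described above).
      const updater = new lambda.Function(this, 'RepoUpdater', {
        runtime: lambda.Runtime.PYTHON_3_11,
        handler: 'update_repo.handler',
        code: lambda.Code.fromAsset(path.join(__dirname, 'updater')),
      });

      // Invoke the updater whenever the asset hash changes, passing the hash.
      new cr.AwsCustomResource(this, 'TriggerInfraPipeline', {
        onUpdate: {
          service: 'Lambda',
          action: 'invoke',
          parameters: {
            FunctionName: updater.functionName,
            Payload: JSON.stringify({ assetHash: image.assetHash }),
          },
          physicalResourceId: cr.PhysicalResourceId.of(image.assetHash),
        },
        policy: cr.AwsCustomResourcePolicy.fromSdkCalls({
          resources: [updater.functionArn],
        }),
      });
    }
  }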

wpr7280 commented 1 year ago

Hello, I saw pull request https://github.com/aws/aws-cdk/pull/12326, which solves this problem, but there is no detailed explanation. Are there any best-practice instructions for implementing this process?
