icj217 opened this issue 1 year ago
Thank you for the feedback and the upcoming PR. Can you share a little bit about the proposed solution?
Would this help alleviate the issue of base images being built multiple times?
If I have a Dockerfile with this kind of structure, `base` is currently built independently in each asset CodeBuild project:

```dockerfile
FROM alpine:latest AS base
# Baseline configuration

FROM base AS container1
# …

FROM base AS container2
# …
```
@pahud I've taken a look at the source code to understand how the CDK handles building docker images. Here's my understanding:

- During synthesis, assets are "staged" into the cloud assembly (`cdk.out/asset-$hash/...`) and registered with the stack synthesizer (via the `stack.synthesizer.addDockerImageAsset()` method).
- For a `DockerImageAsset`, that means the asset's source directory is copied into the directory above.
- For a `TarballImageAsset`, that means the source tarball is copied into the directory above.
- During deployment, the image is built (or, in the case of a `TarballImageAsset`, `docker load` is run) using the `cdk-assets` package. Interactions with the docker CLI are all handled through asset publishing (`AssetPublishing.publishAsset()`) using the private `Docker` class.

It seems like the most logical solution is to create some kind of "bridge" construct that looks like `DockerImageAsset` on the outside but internally ends up behaving like `TarballImageAsset` (e.g. `DockerImageTarballAsset`). This construct would build the image during synthesis and write the image tarball (using `docker save`) out to the "staged" asset directory.
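A minimal sketch of the synth-time half of that idea (the helper name, staging layout, and tarball naming here are hypothetical, not existing CDK APIs): build the image during synthesis, then `docker save` it into the staged asset directory. The docker invocations are composed as argv arrays with an injectable runner, so the flow can be followed and tested without a Docker daemon:

```typescript
import { execFileSync } from "child_process";
import * as path from "path";

// Hypothetical synth-time staging logic for a "DockerImageTarballAsset"-style
// construct. Commands are composed as argv arrays; the runner is injectable,
// so nothing here is tied to a real Docker daemon.
type Exec = (cmd: string, args: string[]) => void;

function stageImageTarball(
  imageTag: string,
  sourceDir: string,
  assetOutDir: string,
  exec: Exec = (cmd, args) => { execFileSync(cmd, args, { stdio: "inherit" }); },
): string {
  // Illustrative tarball name: image tag with ':' and '/' made filesystem-safe.
  const tarballPath = path.join(assetOutDir, `${imageTag.replace(/[:/]/g, "-")}.tar`);
  // 1. Build the image at synth time (instead of at deploy time).
  exec("docker", ["build", "-t", imageTag, sourceDir]);
  // 2. Write the built image into the staged asset directory as a tarball,
  //    so deployment only needs to load and push it.
  exec("docker", ["save", "-o", tarballPath, imageTag]);
  return tarballPath;
}
```

With the default runner this would shell out to the real docker CLI during `cdk synth`; passing a recording runner shows exactly which commands would be run.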
The only issue I see with this solution is that direct interactions with the docker CLI are currently not possible. Is there any reason we couldn't make the `cdk-assets` package's `Docker` class public, or create a more generalized docker CLI interface that is agnostic to the context in which it is invoked (i.e. as part of asset manifest publishing vs. being called directly from a construct)?
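As a sketch of what such a public, context-agnostic wrapper might look like (this is not the actual `cdk-assets` `Docker` class; the injectable runner and method shapes are assumptions):

```typescript
import { execFileSync } from "child_process";

// Sketch of a generalized docker CLI wrapper that doesn't care where it is
// invoked from (asset publishing vs. a construct at synth time). The runner
// is injectable, which also makes the wrapper testable without docker.
type Runner = (args: string[]) => void;

class DockerCli {
  constructor(
    private readonly run: Runner = (args) => { execFileSync("docker", args, { stdio: "inherit" }); },
  ) {}

  build(tag: string, contextDir: string, buildArgs: Record<string, string> = {}): void {
    const argFlags = Object.entries(buildArgs).flatMap(([k, v]) => ["--build-arg", `${k}=${v}`]);
    this.run(["build", "-t", tag, ...argFlags, contextDir]);
  }

  save(tag: string, outFile: string): void {
    this.run(["save", "-o", outFile, tag]);
  }

  load(tarball: string): void {
    this.run(["load", "-i", tarball]);
  }
}
```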
Please let me know if I'm missing anything or if you have any suggestions on possible solutions here!
Just now building a deployment using `DockerImageAsset` for the first time, and I too find the behavior unexpected for the reasons outlined by the OP. For us, we run `cdk deploy 'prod/*' --app cdk.out/` when a PR has been approved and merged. This means there should be no other building happening, nor any other dependencies required for this step, other than the `cdk` CLI.

If something is being built after a PR has been approved (and expected to be ready for immediate deployment), that's an anti-pattern, especially for those using a CD tool like CodePipeline. I made a similar comment here.
Another use case is being able to avoid rewriting the `docker build` logic for local dev that's already implemented in the CDK. I'd want to only have to define it once (e.g. build args, dir, etc.).
We run these during our publish steps in GitHub Actions. `cdk-assets` builds and publishes the docker image:

```yaml
- name: Synth
  shell: bash
  working-directory: ${{ inputs.directory }}
  run: |-
    npx cdk synth ...
- name: Upload cdk assets to AWS
  shell: bash
  run: |-
    npx cdk-assets publish --path ./<StackName>.assets.json
```
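If an app synthesizes several stacks, the same publish step can be generalized by looping over every manifest in `cdk.out/` (a sketch using only the `--path` flag shown above; the glob loop is ordinary bash):

```yaml
- name: Upload cdk assets to AWS
  shell: bash
  working-directory: ${{ inputs.directory }}
  run: |-
    # Publish each stack's asset manifest produced by `cdk synth`.
    for manifest in cdk.out/*.assets.json; do
      npx cdk-assets publish --path "$manifest"
    done
```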
This would be a great benefit for automated/approval workflows. Our automated processes synthesize and save the `cdk.out` directory as an artifact and await manual approval. However, because the docker builds only occur at deployment time, this can cause two key problems: (1) unexpected failures after the approval process (meaning it has to be fixed and approved again), and (2) even if there are no changes in the source, the same docker build can produce different results depending on when it is built (think unpinned dependencies, upstream image tag changes, etc.).
Ideally, we'd like to configure image builds to occur beforehand so that the deployed artifacts never change after they're generated, meaning what is approved is definitely what gets deployed and we don't have unexpected failures at deploy time.
We could just avoid using the CDK for managing docker image assets altogether and require that build pipelines build and push docker images separately, but it's a very useful feature we'd like to continue leveraging.
It would already help a lot if the `cdk-assets` tool mentioned by @uncledru offered a way to only build and not actually publish. The fact that it doesn't makes it equally unsuited for PR checks and the like, as all these assets would just pile up in S3/ECR without ever being part of a deployment, all the while there is no garbage collection (#64).
Even if it did, one would still have to build again for the actual deploy. So building during synth and saving the (compressed) tarballs into `cdk.out` really sounds like the most elegant solution, as pointed out by others here.
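For completeness, under that scheme the deploy-time side would shrink to loading the staged tarball, re-tagging, and pushing, with no rebuild at all. A sketch (the helper name and injectable runner are hypothetical, and the ECR URI below is purely illustrative):

```typescript
import { execFileSync } from "child_process";

// Deploy-time sketch: with the image already saved into cdk.out at synth
// time, publishing only needs `docker load`, a re-tag, and a push.
type Run = (args: string[]) => void;

function publishStagedImage(
  tarballPath: string,
  localTag: string,
  remoteUri: string,
  run: Run = (args) => { execFileSync("docker", args, { stdio: "inherit" }); },
): void {
  run(["load", "-i", tarballPath]);   // restore the synth-time build
  run(["tag", localTag, remoteUri]);  // point it at the destination repo
  run(["push", remoteUri]);           // no rebuild happens here
}
```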
Describe the feature

Currently, docker images defined in CDK apps are not built at synthesis time, but rather at deployment time. The CDK should offer a way to build docker images during synthesis and save them as assets using `docker save`, so that asset generation happens entirely at synthesis time.

Use Case
The CDK's build behavior for docker images diverges from the observed behavior of other types of assets (e.g. `aws_lambda.AssetCode`), where the asset's output directory (e.g. `cdk.out/asset.${hash}/`) contains the "final" contents of the asset (which are simply compressed during deployment). This behavior seems to lead to a couple of undesirable realities/limitations:
- Assets are listed in the `cdk.out/<stack>.assets.json` file and are subject to expiration before the image is ever built

Proposed Solution
No response
Other Information
No response
Acknowledgements
CDK version used
latest
Environment details (OS name and version, etc.)
macOS 12.6