aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(aws_lambda_python_alpha): Docker build happens at import time rather than synth/deploy time #27991

Open anentropic opened 11 months ago

anentropic commented 11 months ago

Describe the bug

I have a Python CDK codebase.

I added an aws_lambda_python_alpha.PythonFunction to my stack. Now, as soon as I instantiate the stack containing that Lambda, a docker build gets triggered.

Expected Behavior

I would expect the docker build for the Lambda function to occur only during deployment, or at worst during the synth phase

docker build at import time feels like a bug

Current Behavior

the docker build occurs at the point that the aws_lambda_python_alpha.PythonFunction gets instantiated

so it happens just by importing the code, since app.py instantiates the stack at module level

in particular it happens when I run unit tests for the cdk codebase, which instantiate a version of the stack multiple times

This has two effects:

a) it makes my tests very slow 😢

b) it's noisy, printing a bunch of output to stderr:

#0 building with "desktop-linux" instance using docker driver

#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s

#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 1.35kB done
#2 DONE 0.0s

#3 [internal] load metadata for public.ecr.aws/sam/build-python3.11:latest
#3 DONE 0.2s

#4 [1/2] FROM public.ecr.aws/sam/build-python3.11@sha256:1468e6000e3d406ab5b32e82ba3358440faf75398eef30aa997e2c8e4ad85b07
#4 DONE 0.0s

#5 [2/2] RUN     python -m venv /usr/app/venv &&     mkdir /tmp/pip-cache &&     chmod -R 777 /tmp/pip-cache &&     pip install --upgrade pip &&     mkdir /tmp/poetry-cache &&     chmod -R 777 /tmp/poetry-cache &&     pip install pipenv==2022.4.8 poetry==1.5.1 &&     rm -rf /tmp/pip-cache/* /tmp/poetry-cache/*
#5 CACHED

#6 exporting to image
#6 exporting layers done
#6 writing image sha256:e1b922db4e93213d2e2d81648a4527a255153724fb3bee8c5d60b52414d89c12 done
#6 naming to docker.io/library/cdk-db7ab25e39b5606291af96999b213f4ca620fbee93139bb8d158e76572cf1431 done
#6 DONE 0.0s

What's Next?
  View a summary of image vulnerabilities and recommendations → docker scout quickview
Bundling asset Website/Repro Lambda/Code/Stage...
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
sending incremental file list
index.py
requirements.txt

sent 227 bytes  received 54 bytes  562.00 bytes/sec
total size is 45  speedup is 0.16
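Since the tests instantiate the stack multiple times, one way to limit the cost while the underlying behaviour stands is to construct the stack once per test session and reuse the result. The following is only a minimal sketch of that idea using standard-library memoization. `synthesized_template` and `build_calls` are hypothetical names, and the function body is a stand-in for the real stack construction, which is not performed here.

```python
import functools

build_calls = 0  # counts how often the expensive construction actually runs


@functools.lru_cache(maxsize=None)
def synthesized_template(stack_name: str) -> dict:
    """Hypothetical stand-in for building the stack and returning its template."""
    global build_calls
    build_calls += 1
    # In a real test suite this is where the stack would be instantiated,
    # e.g. Website(App(), stack_name), which is the step that triggers docker.
    return {"Resources": {f"{stack_name}Function": {"Type": "AWS::Lambda::Function"}}}


# Two tests asking for the same stack share one construction.
first = synthesized_template("Website")
second = synthesized_template("Website")
```

The cached value is shared between callers, so tests using this pattern should treat it as read-only. It reduces the number of builds to one per stack name but does not remove the need for docker.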

Reproduction Steps

app.py

from aws_cdk import (
    App,
    Environment,
    Stack,
    aws_lambda,
    aws_lambda_python_alpha as python_lambda,
)
from constructs import Construct

class Website(Stack):
    def __init__(
        self,
        scope: Construct,
        id: str,
        **kwargs,
    ):
        super().__init__(scope, id, **kwargs)  # forward env and other Stack options

        # if I drop into debugger from tests, docker build occurs here:
        python_lambda.PythonFunction(
            self,
            "Repro Lambda",
            entry="infra/assets/repro/src",
            runtime=aws_lambda.Runtime.PYTHON_3_11,
        )

app = App()

Website(
    app,
    "Website",
    env=Environment(
        account="000000000000",  # or a valid account
        region="eu-west-1",
    ),
)

if __name__ == "__main__":
    app.synth()

infra/assets/repro/src/index.py

def handler(event, context):
    return True

infra/assets/repro/src/requirements.txt (empty file)

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.106.1 (build a2e5f65)

Framework Version

2.108.0

Node.js Version

v18.18.0

OS

macOS 14.1

Language

Python

Language Version

3.11.5

Other information

No response

anentropic commented 11 months ago

TBH I'd love to be able to run the tests without needing docker at all ...I don't see why it should be needed until doing the actual deployment
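A possible avenue for running tests without docker is the `aws:cdk:bundling-stacks` context key, which the CDK core reads to decide which stacks require asset bundling. Setting it to an empty list marks no stack as needing bundling. Whether aws_lambda_python_alpha also skips its image build in that case depends on the library version and is an assumption here, not something verified against 2.108.0. `skip_bundling_context` is a hypothetical helper name.

```python
# Context key read by the CDK core to select the stacks that need bundling.
BUNDLING_STACKS_KEY = "aws:cdk:bundling-stacks"


def skip_bundling_context(extra=None):
    """Return App context in which no stack is marked as requiring bundling."""
    context = dict(extra or {})        # copy so the caller's dict is untouched
    context[BUNDLING_STACKS_KEY] = []  # empty list: no stack name matches
    return context


# Intended use in a test (requires aws_cdk, which is not imported here):
#   app = App(context=skip_bundling_context())
#   stack = Website(app, "Website")
```

With bundling skipped, the asset contents are not the real bundled output, so this would only suit tests that assert on the template and not on the packaged code.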

indrora commented 11 months ago

You're right, this should happen at synth time (or just before).

It doesn't look like there's a good workaround to keep this from happening other than not importing a package, but that's not feasible. However, I don't know how wide the effect of this is; if other customers have serious issues with this happening, we can probably raise it from p2 to p1.

anentropic commented 11 months ago

I was able to reproduce it with the minimal code above, so I guess others should be able to too

and if not, that would be a clue that something is off on my end (I haven't tried to do anything unusual, but who knows)

tmokmss commented 11 months ago

One possible solution is to make BundlingOptions.image lazy, i.e. something like getImage: () => DockerImage, and then call the lazy function to get the build image only when bundling actually happens.

https://github.com/aws/aws-cdk/blob/b21ee35161e031473ca2138d4819acd81831c1c3/packages/aws-cdk-lib/core/lib/bundling.ts#L31-L35

We need to find a way to introduce such a change without breaking anything. I'm willing to work on this :)
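The lazy-image idea can be illustrated outside the CDK. The sketch below is not the aws-cdk implementation: `LazyImage`, `build_image` and the returned tag are hypothetical, and it is written in Python (matching the reproduction) rather than the TypeScript of the linked source. It only shows the pattern of holding a factory and calling it on first use.

```python
from typing import Callable, Optional


class LazyImage:
    """Hypothetical wrapper that defers an expensive image build until first use."""

    def __init__(self, factory: Callable[[], str]) -> None:
        self._factory = factory            # stored, not called, at construction
        self._image: Optional[str] = None

    def get(self) -> str:
        # Build on the first call only; later calls reuse the stored result.
        if self._image is None:
            self._image = self._factory()
        return self._image


builds = []  # records each simulated docker build


def build_image() -> str:
    builds.append("docker build")  # stand-in for the real, slow build
    return "cdk-image-tag"


image = LazyImage(build_image)
built_at_construction = len(builds)  # nothing has been built yet
tag = image.get()                    # the build happens here
image.get()                          # reuses the earlier result
```

Under this pattern, instantiating the construct would only record how to obtain the image, and the build would run when bundling asks for it.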

anentropic commented 8 months ago

The thing that makes this situation even more annoying is that it doesn't appear to use any caching from the Docker engine... every time I run any cdk command on my stack it has to go and build the docker images from scratch