aws / aws-sam-cli

CLI tool to build, test, debug, and deploy Serverless applications using AWS SAM
https://aws.amazon.com/serverless/sam/
Apache License 2.0

Bug: sam local invoke does not work in Docker-In-Docker environment #4589

Open ghost opened 1 year ago

ghost commented 1 year ago

Description:

I'm interested in using SAM CLI in our CI/CD pipelines to perform testing of our lambdas. I'm most interested in commands like "local invoke" and "local start-api".

My company uses Gitlab CI for our CI/CD pipelines. In Gitlab, using docker requires a Docker-In-Docker (DIND) configuration. The docker service runs in a separate container from the pipeline runner, with a network alias of 'docker'.

When running "sam local invoke" in a DIND environment, I'm either getting a timeout, or an ImportModuleError.

Steps to reproduce:

I created a Github project that sets up a DIND environment.

Workflow Script: https://github.com/PeterBuschSF/sam-cli-test/blob/main/.github/workflows/aws-sam-cli.yml

It sets up a DIND environment to simulate the run environment we get in Gitlab.

It then runs "sam init" to pull the hello world application for python3.9.

Then I try several different ways of calling "sam local invoke" with different values for container-host and container-host-interface options.

Observed result:

Sometimes it fails with: "Timed out while attempting to establish a connection to the container. You can increase this timeout by setting the SAM_CLI_CONTAINER_CONNECTION_TIMEOUT environment variable. The current timeout is 60.0 (seconds)."

2023-01-18 15:57:28,531 | Config file location: /var/task/sam-app/samconfig.toml
2023-01-18 15:57:28,531 | Config file '/var/task/sam-app/samconfig.toml' does not exist
2023-01-18 15:57:28,536 | Using SAM Template at /var/task/sam-app/template.yaml
2023-01-18 15:57:28,559 | Using config file: samconfig.toml, config environment: default
2023-01-18 15:57:28,559 | Expand command line arguments to:
2023-01-18 15:57:28,559 | --template_file=/var/task/sam-app/template.yaml --event=events/event.json --function_logical_id=HelloWorldFunction --no_event --layer_cache_basedir=/root/.aws-sam/layers-pkg --container_host=localhost --container_host_interface=127.0.0.1
2023-01-18 15:57:28,559 | local invoke command is called
2023-01-18 15:57:28,564 | No Parameters detected in the template
2023-01-18 15:57:28,588 | There is no customer defined id or cdk path defined for resource HelloWorldFunction, so we will use the resource logical id as the resource id
2023-01-18 15:57:28,588 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id
2023-01-18 15:57:28,589 | 0 stacks found in the template
2023-01-18 15:57:28,589 | No Parameters detected in the template
2023-01-18 15:57:28,608 | There is no customer defined id or cdk path defined for resource HelloWorldFunction, so we will use the resource logical id as the resource id
2023-01-18 15:57:28,608 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id
2023-01-18 15:57:28,609 | 2 resources found in the stack 
2023-01-18 15:57:28,609 | Found Serverless function with name='HelloWorldFunction' and CodeUri='hello_world/'
2023-01-18 15:57:28,609 | --base-dir is not presented, adjusting uri hello_world/ relative to /var/task/sam-app/template.yaml
2023-01-18 15:57:28,658 | Found one Lambda function with name 'HelloWorldFunction'
2023-01-18 15:57:28,658 | Invoking app.lambda_handler (python3.9)
2023-01-18 15:57:28,658 | No environment variables found for function 'HelloWorldFunction'
2023-01-18 15:57:28,658 | Loading AWS credentials from session with profile 'None'
2023-01-18 15:57:28,672 | Resolving code path. Cwd=/var/task/sam-app, CodeUri=/var/task/sam-app/hello_world
2023-01-18 15:57:28,672 | Resolved absolute path to code is /var/task/sam-app/hello_world
2023-01-18 15:57:28,672 | Code /var/task/sam-app/hello_world is not a zip/jar file
2023-01-18 15:57:28,695 | Image was not found.
2023-01-18 15:57:28,695 | Removing rapid images for repo public.ecr.aws/sam/emulation-python3.9
Building image.........................................................................................................
2023-01-18 15:57:41,054 | Skip pulling image and use local one: public.ecr.aws/sam/emulation-python3.9:rapid-1.70.0-x86_64.
2023-01-18 15:57:41,054 | Mounting /var/task/sam-app/hello_world as /var/task:ro,delegated inside runtime container
2023-01-18 15:58:41,455 | Cleaning all decompressed code dirs
2023-01-18 15:58:41,455 | Timed out while attempting to establish a connection to the container. You can increase this timeout by setting the SAM_CLI_CONTAINER_CONNECTION_TIMEOUT environment variable. The current timeout is 60.0 (seconds).
2023-01-18 15:58:41,455 | Telemetry endpoint configured to be https://aws-serverless-tools-telemetry.us-west-2.amazonaws.com/metrics
2023-01-18 15:58:41,471 | Sending Telemetry: {'metrics': [{'commandRun': {'requestId': '988e0869-e081-4874-a077-8187ca5fcf1d', 'installationId': 'a817f28d-cfa1-4781-b7da-1c5f19774b04', 'sessionId': 'ec85271d-2900-4f1f-8a81-5d23df0eab32', 'executionEnvironment': 'CLI', 'ci': False, 'pyversion': '3.9.16', 'samcliVersion': '1.70.0', 'awsProfileProvided': False, 'debugFlagProvided': True, 'region': '', 'commandName': 'sam local invoke', 'metricSpecificAttributes': {'projectType': 'CFN', 'gitOrigin': None, 'projectName': '593ab2ca51e925b9f6c2f258bc55ed5926cf6d2c78239a685d65907e4ec7edd3', 'initialCommit': None}, 'duration': 72896, 'exitReason': 'success', 'exitCode': 0}}]}
2023-01-18 15:58:41,816 | Telemetry response: 200

Sometimes it fails with "Error: Runtime.ImportModuleError: Unable to import module 'app': No module named 'app'"

2023-01-18 15:59:44,019 | Config file location: /var/task/sam-app/samconfig.toml
2023-01-18 15:59:44,019 | Config file '/var/task/sam-app/samconfig.toml' does not exist
2023-01-18 15:59:44,025 | Using SAM Template at /var/task/sam-app/template.yaml
2023-01-18 15:59:44,047 | Using config file: samconfig.toml, config environment: default
2023-01-18 15:59:44,047 | Expand command line arguments to:
2023-01-18 15:59:44,047 | --template_file=/var/task/sam-app/template.yaml --event=events/event.json --container_host=docker --container_host_interface=0.0.0.0 --function_logical_id=HelloWorldFunction --no_event --layer_cache_basedir=/root/.aws-sam/layers-pkg 
2023-01-18 15:59:44,047 | local invoke command is called
2023-01-18 15:59:44,053 | No Parameters detected in the template
2023-01-18 15:59:44,077 | There is no customer defined id or cdk path defined for resource HelloWorldFunction, so we will use the resource logical id as the resource id
2023-01-18 15:59:44,077 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id
2023-01-18 15:59:44,078 | 0 stacks found in the template
2023-01-18 15:59:44,078 | No Parameters detected in the template
2023-01-18 15:59:44,097 | There is no customer defined id or cdk path defined for resource HelloWorldFunction, so we will use the resource logical id as the resource id
2023-01-18 15:59:44,097 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id
2023-01-18 15:59:44,097 | 2 resources found in the stack 
2023-01-18 15:59:44,097 | Found Serverless function with name='HelloWorldFunction' and CodeUri='hello_world/'
2023-01-18 15:59:44,097 | --base-dir is not presented, adjusting uri hello_world/ relative to /var/task/sam-app/template.yaml
2023-01-18 15:59:44,146 | Found one Lambda function with name 'HelloWorldFunction'
2023-01-18 15:59:44,147 | Invoking app.lambda_handler (python3.9)
2023-01-18 15:59:44,147 | No environment variables found for function 'HelloWorldFunction'
2023-01-18 15:59:44,147 | Loading AWS credentials from session with profile 'None'
2023-01-18 15:59:44,161 | Resolving code path. Cwd=/var/task/sam-app, CodeUri=/var/task/sam-app/hello_world
2023-01-18 15:59:44,161 | Resolved absolute path to code is /var/task/sam-app/hello_world
2023-01-18 15:59:44,161 | Code /var/task/sam-app/hello_world is not a zip/jar file
2023-01-18 15:59:44,192 | Skip pulling image and use local one: public.ecr.aws/sam/emulation-python3.9:rapid-1.70.0-x86_64.
2023-01-18 15:59:44,192 | Mounting /var/task/sam-app/hello_world as /var/task:ro,delegated inside runtime container
2023-01-18 15:59:44,445 | Starting a timer for 3 seconds for function 'HelloWorldFunction'
START RequestId: f6fad0eb-7b75-4be5-ba4f-ba176ca20657 Version: $LATEST
Error:  Runtime.ImportModuleError: Unable to import module 'app': No module named 'app'
Traceback (most recent call last):
END RequestId: f6fad0eb-7b75-4be5-ba4f-ba176ca20657
REPORT RequestId: f6fad0eb-7b75-4be5-ba4f-ba176ca20657  Init Duration: 1.46 ms  Duration: 64.08 ms  Billed Duration: 65 ms  Memory Size: 128 MB  Max Memory Used: 128 MB
2023-01-18 15:59:44,645 | Cleaning all decompressed code dirs
2023-01-18 15:59:44,645 | Telemetry endpoint configured to be https://aws-serverless-tools-telemetry.us-west-2.amazonaws.com/metrics
2023-01-18 15:59:44,661 | Sending Telemetry: {'metrics': [{'commandRun': {'requestId': 'ca9c47db-7579-4574-a6a9-6a9b8fa8f31d', 'installationId': 'a817f28d-cfa1-4781-b7da-1c5f19774b04', 'sessionId': '0ba88388-deb9-4547-9fab-0be24ab27867', 'executionEnvironment': 'CLI', 'ci': False, 'pyversion': '3.9.16', 'samcliVersion': '1.70.0', 'awsProfileProvided': False, 'debugFlagProvided': True, 'region': '', 'commandName': 'sam local invoke', 'metricSpecificAttributes': {'projectType': 'CFN', 'gitOrigin': None, 'projectName': '593ab2ca51e925b9f6c2f258bc55ed5926cf6d2c78239a685d65907e4ec7edd3', 'initialCommit': None}, 'duration': 598, 'exitReason': 'success', 'exitCode': 0}}]}
2023-01-18 15:59:45,023 | Telemetry response: 200
2023-01-18 15:59:45,024 | Telemetry endpoint configured to be https://aws-serverless-tools-telemetry.us-west-2.amazonaws.com/metrics
2023-01-18 15:59:45,024 | Sending Telemetry: {'metrics': [{'runtimeMetric': {'requestId': '175c3bb2-136a-4963-b6bd-a7f58300678b', 'installationId': 'a817f28d-cfa1-4781-b7da-1c5f19774b04', 'sessionId': '0ba88388-deb9-4547-9fab-0be24ab27867', 'executionEnvironment': 'CLI', 'ci': False, 'pyversion': '3.9.16', 'samcliVersion': '1.70.0', 'runtimes': ['python3.9']}}]}
2023-01-18 15:59:45,335 | Telemetry response: 200
{"errorMessage": "Unable to import module 'app': No module named 'app'", "errorType": "Runtime.ImportModuleError", "requestId": "f6fad0eb-7b75-4be5-ba4f-ba176ca20657", "stackTrace": []}

Full output can be found here: https://github.com/PeterBuschSF/sam-cli-test/actions/runs/3950602390/jobs/6763349604

Expected result:

A successful response from sam local invoke.

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

  1. OS: public.ecr.aws/sam/build-python3.9:latest
  2. sam --version: SAM CLI, version 1.70.0
  3. AWS region: N/A
{
  "version": "1.70.0",
  "system": {
    "python": "3.9.16",
    "os": "Linux-5.15.0-1030-azure-x86_64-with-glibc2.26"
  },
  "additional_dependencies": {
    "docker_engine": "20.10.22",
    "aws_cdk": "Not available",
    "terraform": "Not available"
  }
}


qingchm commented 1 year ago

Thanks for opening the issue. We will verify whether Docker-in-Docker is supported for local invoke and get back to you.

Pivert commented 1 year ago

Hi,

I just spent way too much time fixing a problem that seems very similar to yours. Exact same environment and versions, except that I'm using another region and a seashell container instead of DIND.

seashell is «a dev workstation in a container», basically bringing all the tools you need to develop & administrate systems, including aws, docker, and of course sam. The seashell startup script does 2 important things for this process:

So, that's the same case:

Also, if the host gets its address via DHCP, the external IP might change, and it is not visible from within the container. But I realized that for a simple local Docker setup we do not need to connect to the host's external interface; connecting to the local Docker bridge network gateway also works.

So, if you have jq and you replace <absolute path to SAM project on host>, you can run this fairly generic command from within a container running on the same host as the Docker daemon.

sam local invoke \
  -v /<absolute path to SAM project on host>/.aws-sam/build \
  --container-host $(docker network inspect bridge | jq -r '.[0].IPAM.Config[0].Gateway') \
  --container-host-interface 0.0.0.0 \
  --debug
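
If the bridge network has no IPAM gateway configured, the jq expression in the command above prints the literal string null, which would then be passed to --container-host. A minimal sketch of a guard, assuming jq is installed; bridge_gateway is a hypothetical helper name, not part of SAM CLI or Docker:

```shell
#!/bin/sh
# Hypothetical helper (not part of SAM CLI or Docker): read the JSON printed by
# `docker network inspect` on stdin and print the first IPAM gateway.
# Returns non-zero if the input is not valid JSON or no gateway is present.
bridge_gateway() {
  # `// empty` turns a null or missing gateway into empty output
  gateway=$(jq -r '.[0].IPAM.Config[0].Gateway // empty') || return 1
  [ -n "$gateway" ] || return 1
  printf '%s\n' "$gateway"
}

# Usage sketch (needs a reachable Docker daemon, so it is left commented out):
# host=$(docker network inspect bridge | bridge_gateway) || { echo "no bridge gateway found" >&2; exit 1; }
# sam local invoke --container-host "$host" --container-host-interface 0.0.0.0 --debug
```

The remaining options from the original command, such as -v, would be added to the sam local invoke call unchanged.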

PeterCat12 commented 10 months ago

For anyone trying to run invoke in a DIND context inside gitlab:

My --container-host was always wrong and I tried everything under the sun. I came across a myriad of posts that pointed me to the fact that SAM references a DOCKER_HOST variable. Checking this variable inside my gitlab runner I saw that it was tcp://docker:2375.

Also running docker context ls gave me:

NAME        DESCRIPTION                               DOCKER ENDPOINT     KUBERNETES ENDPOINT   ORCHESTRATOR
default *   Current DOCKER_HOST based configuration   tcp://docker:2375                         swarm

It revealed to me that my --container-host needed to be set to docker. This allowed SAM to connect to my Docker container correctly.

full command:

sam local invoke --container-host docker --container-host-interface 0.0.0.0 --debug
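
To avoid hard-coding the host name, the value could be derived from DOCKER_HOST instead. This is only a sketch under the assumption that DOCKER_HOST has the tcp://host:port form shown above; docker_host_name is a hypothetical helper, and it does not handle bracketed IPv6 addresses or unix sockets:

```shell
#!/bin/sh
# Hypothetical helper (not part of SAM CLI): print the host part of a
# DOCKER_HOST-style URL such as tcp://docker:2375.
# Returns non-zero for anything that is not a tcp:// URL.
docker_host_name() {
  url=$1
  case $url in
    tcp://*) ;;
    *) return 1 ;;            # e.g. unix:///var/run/docker.sock has no host name
  esac
  hostport=${url#tcp://}      # strip the scheme
  hostport=${hostport%%/*}    # strip any trailing path
  printf '%s\n' "${hostport%%:*}"   # strip the port
}

# Usage sketch (left commented out because it needs SAM CLI and a Docker daemon):
# sam local invoke \
#   --container-host "$(docker_host_name "$DOCKER_HOST")" \
#   --container-host-interface 0.0.0.0 \
#   --debug
```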