aws / aws-sam-cli

CLI tool to build, test, debug, and deploy Serverless applications using AWS SAM
https://aws.amazon.com/serverless/sam/
Apache License 2.0

Hanging or failing command `sam build --use-container` creating LambdaLayer when BuildArchitecture is arm64 #7523

Open rstrahan opened 5 days ago

rstrahan commented 5 days ago

Description:

The command sam build --use-container hangs or fails when creating a LambdaLayer with BuildArchitecture set to arm64.

Steps to reproduce:

I created a minimal repro (zipfile attached).

sam-build-test.zip

It fails for arm64 and works for x86_64. To reproduce, unzip the attached archive on an EC2 dev instance (for example, a new vanilla Cloud9 environment; both AL2 and AL2023 fail).

ARM64: In the sam-build-test directory, run

sam build --use-container --template-file template-arm64.yaml 

The first time you run it, it might fail rather than hang:

$ sam build --use-container --template-file template-arm64.yaml

        SAM CLI now collects telemetry to better understand customer needs.

        You can OPT OUT and disable telemetry collection by setting the
        environment variable SAM_CLI_TELEMETRY=0 in your shell.
        Thanks for your help!

        Learn More: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-telemetry.html

Starting Build inside a container                                                                                                                                                                            
Building layer 'TestLayer'                                                                                                                                                                                   

Fetching public.ecr.aws/sam/build-python3.12:latest-arm64 Docker container image........<snip>
Mounting /home/ec2-user/environment/sam-build-test/src as /tmp/samcli/source:ro,delegated, inside runtime container                                                                                          
exec /usr/local/opt/lambda-builders/bin/lambda-builders: exec format error
Builder crashed:                                                                                                                                                                                             

Error: Expecting value: line 1 column 1 (char 0)
Traceback:
  File "click/core.py", line 1078, in main
  File "click/core.py", line 1688, in invoke
<snip>
  File "json/decoder.py", line 337, in decode
  File "json/decoder.py", line 355, in raw_decode

An unexpected error was encountered while executing "sam build".
Search for an existing issue:
https://github.com/aws/aws-sam-cli/issues?q=is%3Aissue+is%3Aopen+Bug%3A%20sam%20build%20-%20JSONDecodeError
Or create a bug report:
https://github.com/aws/aws-sam-cli/issues/new?template=Bug_report.md&title=Bug%3A%20sam%20build%20-%20JSONDecodeError

Unfortunately, this error gives me no clues. When I repeat the same command, it hangs forever:

$ sam build --use-container --template-file template-arm64.yaml
Starting Build inside a container                                                                                                                                                                            
Building layer 'TestLayer'                                                                                                                                                                                   

Fetching public.ecr.aws/sam/build-python3.12:latest-arm64 Docker container image......
Mounting /home/ec2-user/environment/sam-build-test/src as /tmp/samcli/source:ro,delegated, inside runtime container    
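
A note on the first failure: `Expecting value: line 1 column 1 (char 0)` is the message Python's `json` module produces when asked to decode an empty string. The sketch below assumes (this is not confirmed from the SAM CLI source) that the CLI parses the builder's output as JSON, and that the builder produced nothing parseable because it died with `exec format error`. `parse_builder_response` is a hypothetical helper for illustration, not a SAM CLI function.

```python
import json


def parse_builder_response(raw_output: str):
    """Hypothetical helper: decode a builder's stdout as JSON.

    Re-raises decoding failures as a RuntimeError that includes the raw
    output, which is more informative than a bare JSONDecodeError.
    """
    try:
        return json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"builder returned non-JSON output ({exc}); raw output: {raw_output!r}"
        ) from exc


# Decoding an empty string raises the same message seen in the traceback.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    empty_output_message = str(exc)
```

If that assumption holds, the JSONDecodeError is a secondary symptom and the `exec format error` line just above it is the more useful clue.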

X86_64: In the sam-build-test directory, run

sam build --use-container --template-file template-x86_64.yaml 

Works fine. The only difference between the two templates is the architecture:

$ diff template-x86_64.yaml template-arm64.yaml 
9c9
<       BuildArchitecture: x86_64
---
>       BuildArchitecture: arm64
14c14
<         - x86_64
---
>         - arm64

We could probably get past this quickly by changing the architecture from arm64 to x86_64, but x86_64 has a higher runtime cost, and I'd rather find the root cause of why arm64 is not working now.

Observed result:

Hangs forever after the Mounting step:

$ sam build --use-container --template-file template-arm64.yaml --debug
2024-09-29 18:45:57,128 | No config file found in this directory.                                                                                                                                            
2024-09-29 18:45:57,133 | OSError occurred while reading TOML file: [Errno 2] No such file or directory: '/home/ec2-user/environment/sam-build-test/samconfig.toml'                                          
2024-09-29 18:45:57,135 | Config file location: /home/ec2-user/environment/sam-build-test/samconfig.toml                                                                                                     
2024-09-29 18:45:57,137 | Config file '/home/ec2-user/environment/sam-build-test/samconfig.toml' does not exist                                                                                              
2024-09-29 18:45:57,167 | OSError occurred while reading TOML file: [Errno 2] No such file or directory: '/home/ec2-user/environment/sam-build-test/samconfig.toml'                                          
2024-09-29 18:45:57,170 | Using config file: samconfig.toml, config environment: default                                                                                                                     
2024-09-29 18:45:57,171 | Expand command line arguments to:                                                                                                                                                  
2024-09-29 18:45:57,173 | --template_file=/home/ec2-user/environment/sam-build-test/template-arm64.yaml --use_container --mount_with=READ --build_dir=.aws-sam/build --cache_dir=.aws-sam/cache              
2024-09-29 18:45:57,225 | 'build' command is called                                                                                                                                                          
2024-09-29 18:45:57,227 | Starting Build inside a container                                                                                                                                                  
2024-09-29 18:45:57,230 | No Parameters detected in the template                                                                                                                                             
2024-09-29 18:45:57,260 | There is no customer defined id or cdk path defined for resource TestLayer, so we will use the resource logical id as the resource id                                              
2024-09-29 18:45:57,262 | 0 stacks found in the template                                                                                                                                                     
2024-09-29 18:45:57,264 | No Parameters detected in the template                                                                                                                                             
2024-09-29 18:45:57,289 | There is no customer defined id or cdk path defined for resource TestLayer, so we will use the resource logical id as the resource id                                              
2024-09-29 18:45:57,291 | 1 resources found in the stack                                                                                                                                                     
2024-09-29 18:45:57,293 | --base-dir is not presented, adjusting uri ./src relative to /home/ec2-user/environment/sam-build-test/template-arm64.yaml                                                         
2024-09-29 18:45:57,297 | 1 resources found in the stack                                                                                                                                                     
2024-09-29 18:45:57,300 | Instantiating build definitions                                                                                                                                                    
2024-09-29 18:45:57,334 | Same Layer build definition found, adding layer (Previous: LayerBuildDefinition(TestLayer, /home/ec2-user/environment/sam-build-test/src, , b93b0847-9979-4e3c-9501-0dcbaf88e80d, python3.12, ['python3.12'], arm64, {}), Current: LayerBuildDefinition(TestLayer, /home/ec2-user/environment/sam-build-test/src, , 51dafd0a-7ebc-4e69-9e1d-93acb3954e9b, python3.12, ['python3.12'], arm64, {}), Layer: <samcli.lib.providers.provider.LayerVersion object at 0x7f82f64da110>)
2024-09-29 18:45:57,342 | Building layer 'TestLayer'                                                                                                                                                         
2024-09-29 18:45:57,351 | Checking free port on 127.0.0.1:7519                                                                                                                                               

Fetching public.ecr.aws/sam/build-python3.12:latest-arm64 Docker container image......
2024-09-29 18:45:57,475 | Mounting /home/ec2-user/environment/sam-build-test/src as /tmp/samcli/source:ro,delegated, inside runtime container  
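
Because the command never returns, a small wrapper with a timeout can turn the hang into a reportable failure, for example in CI. This is a minimal sketch using only the Python standard library; `run_with_timeout` is a hypothetical helper, and the 600-second value in the comment is an arbitrary assumption. Killing the `sam` process this way may still leave the build container running, so it may need to be cleaned up separately.

```python
import subprocess
import sys
from typing import List, Optional, Tuple


def run_with_timeout(cmd: List[str], timeout_s: float) -> Tuple[bool, Optional[int]]:
    """Run cmd and return (timed_out, exit_code); exit_code is None on timeout."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return True, None
    return False, completed.returncode


# Stand-in for a hanging build, so the sketch runs without SAM or Docker.
SLEEP_CMD = [sys.executable, "-c", "import time; time.sleep(5)"]

# For the real command from this report, the call would look like:
# run_with_timeout(
#     ["sam", "build", "--use-container",
#      "--template-file", "template-arm64.yaml", "--debug"],
#     timeout_s=600,
# )
```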

Expected result:

It should neither fail nor hang; it should succeed, as it does with x86_64.

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

  1. OS: Amazon Linux 2 or Amazon Linux 2023 (new Cloud9 instance)
  2. sam --version: SAM CLI, version 1.112.0
  3. AWS region: us-east-1
Output of `sam --info`:
{
  "version": "1.112.0",
  "system": {
    "python": "3.11.3",
    "os": "Linux-5.10.225-213.878.amzn2.x86_64-x86_64-with-glibc2.26"
  },
  "additional_dependencies": {
    "docker_engine": "25.0.6",
    "aws_cdk": "2.159.1 (build c66f4e3)",
    "terraform": "Not available"
  },
  "available_beta_feature_env_vars": [
    "SAM_CLI_BETA_FEATURES",
    "SAM_CLI_BETA_BUILD_PERFORMANCE",
    "SAM_CLI_BETA_TERRAFORM_SUPPORT",
    "SAM_CLI_BETA_RUST_CARGO_LAMBDA"
  ]
}
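
One thing that stands out in the `sam --info` output above: the `os` field reports an x86_64 kernel, while the failing template sets `BuildArchitecture: arm64`. An `exec format error` is the usual symptom of running a binary built for a different CPU architecture without emulation, so a host/target mismatch is one possible explanation, not a confirmed root cause. The sketch below shows a pre-flight check along those lines; `normalize_arch`, `needs_emulation`, and `host_needs_emulation` are hypothetical names, and the alias table is an assumption covering only the two architectures in this report.

```python
import platform

# Common spellings of the two architectures discussed in this issue.
_ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def normalize_arch(name: str) -> str:
    """Map an architecture spelling to 'x86_64' or 'arm64'."""
    key = name.strip().lower()
    if key not in _ARCH_ALIASES:
        raise ValueError(f"unrecognized architecture: {name!r}")
    return _ARCH_ALIASES[key]


def needs_emulation(host_machine: str, build_architecture: str) -> bool:
    """True when the build target differs from the host CPU architecture."""
    return normalize_arch(host_machine) != normalize_arch(build_architecture)


def host_needs_emulation(build_architecture: str) -> bool:
    """Same check, using the architecture of the machine running this code."""
    return needs_emulation(platform.machine(), build_architecture)
```

With the values from this report, `needs_emulation("x86_64", "arm64")` is true and `needs_emulation("x86_64", "x86_64")` is false, which matches which template fails and which one works.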


hawflau commented 1 day ago

@rstrahan Thanks for raising the issue. We will try to reproduce it.