aws / aws-toolkit-vscode

Amazon Q, CodeCatalyst, Local Lambda debug, SAM/CFN syntax, ECS Terminal, AWS resources
https://marketplace.visualstudio.com/items?itemName=AmazonWebServices.amazon-q-vscode
Apache License 2.0

(Rancher) SAM debug fails #3413

Open ryancabanas opened 1 year ago

ryancabanas commented 1 year ago

Problem

I'm having trouble getting the "AWS SAM: API Gateway lambda invoke" debug config to make it to a breakpoint, using even a simple app, before throwing an error.

I set up a simple SAM application using the sam init tool. I chose the "Hello World Example" Quick Start Template using Node.js 16 with the ZIP package type. ("No" to XRay and CloudWatch.) I'm running Node.js 16.20.0. The simple app runs just fine locally using sam local start-api, but when I try to run the debug config, I get the following error:

Debugger attached.
2023-05-03T03:22:47.313Z    undefined   ERROR   Uncaught Exception  {"errorType":"Runtime.ImportModuleError","errorMessage":"Error: Cannot find module 'app'\nRequire stack:\n- /var/runtime/index.mjs","stack":["Runtime.ImportModuleError: Error: Cannot find module 'app'","Require stack:","- /var/runtime/index.mjs","    at _loadUserApp (file:///var/runtime/index.mjs:997:17)","    at async Object.load (file:///var/runtime/index.mjs:1032:21)","    at async start (file:///var/runtime/index.mjs:1195:23)","    at async file:///var/runtime/index.mjs:1201:1"]}
Waiting for the debugger to disconnect...

My launch.json file contains the "straight out of the box" debug config and looks like this.

{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "aws-sam",
      "request": "direct-invoke",
      "name": "API sam-app:HelloWorldFunction (nodejs16.x)",
      "invokeTarget": {
        "target": "api",
        "templatePath": "${workspaceFolder}/template.yaml",
        "logicalId": "HelloWorldFunction"
      },
      "api": {
        "path": "/hello",
        "httpMethod": "get",
        "payload": {
          "json": {}
        }
      },
      "lambda": {
        "runtime": "nodejs16.x"
      }
    }
  ]
}

What am I doing wrong? Thanks!

Expected behavior

The debugger should run without error, stopping at my first breakpoint.

System details (run the AWS: About Toolkit command)

justinmk3 commented 1 year ago

Can you share the template.yaml? Specifically wondering about the architecture.

For troubleshooting, could you try starting over with the AWS: Create Lambda SAM Application command (and try both x86 and arm architecture)? It shouldn't make a difference but eliminates some uncertainty.

ryancabanas commented 1 year ago

@justinmk3

Thanks for replying!

Here is the auto-generated template.yaml from the initial sample app.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  sam-app

  Sample SAM Template for sam-app

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 3
    MemorySize: 128

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: hello-world/
      Handler: app.lambdaHandler
      Runtime: nodejs16.x
      Architectures:
        - x86_64
      Events:
        HelloWorld:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /hello
            Method: get

Outputs:
  # ServerlessRestApi is an implicit API created out of Events key under Serverless::Function
  # Find out more about other implicit resources you can reference within SAM
  # https://github.com/awslabs/serverless-application-model/blob/master/docs/internals/generated_resources.rst#api
  HelloWorldApi:
    Description: "API Gateway endpoint URL for Prod stage for Hello World function"
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/hello/"
  HelloWorldFunction:
    Description: "Hello World Lambda Function ARN"
    Value: !GetAtt HelloWorldFunction.Arn
  HelloWorldFunctionIamRole:
    Description: "Implicit IAM Role created for Hello World function"
    Value: !GetAtt HelloWorldFunctionRole.Arn

I created a couple new sample apps using the VS Code AWS: Create Lambda SAM Application approach. Created both arm64 and x86_64 and chose nodejs16.x (not the (image) selection). Both of these template.yaml files were identical to the one generated by the initial sample app, except for the Architectures property.

Got the same error with these 2 new sample apps, as well.

Thanks for the help!

justinmk3 commented 1 year ago

Can't reproduce the issue using a similar system (sam 1.82, M1 macos 13.3.1).

Another thing to try, for troubleshooting: specify "container build" in the launch config:

            "sam": {
                "containerBuild": true
            },

Maybe the Docker image on your machine is old? Does docker images show image id 524865febf0d?

REPOSITORY                     TAG               IMAGE ID       CREATED              SIZE
public.ecr.aws/lambda/nodejs   16-rapid-x86_64   6b168c3b0313   About a minute ago   511MB
public.ecr.aws/lambda/nodejs   16-x86_64         524865febf0d   2 weeks ago          494MB

What does docker image history <image-id> show (pass the image id of the image created by sam)?
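A minimal sketch of the image check above, for anyone scripting it. The helper name image_present is hypothetical; it relies only on docker images with --format, which prints short image IDs by default.

```shell
# Succeed if the given repository:tag resolves to the expected short image ID.
# $1 = repository:tag, $2 = expected 12-character image ID
image_present() {
  docker images --format '{{.ID}}' "$1" | grep -qx "$2"
}
```

For example, image_present public.ecr.aws/lambda/nodejs:16-x86_64 524865febf0d returns success only when that base image is the one stored locally.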

ryancabanas commented 1 year ago

Thanks again!

Before changing the launch config, these are the docker images on my machine from the public.ecr.aws/lambda/nodejs Repository and I do see the image with id 524865febf0d.

public.ecr.aws/lambda/nodejs              16-rapid-arm64                                7642408178fb   2 hours ago     542MB
public.ecr.aws/lambda/nodejs              18-rapid-x86_64                               1e68f0f37867   12 hours ago    650MB
public.ecr.aws/lambda/nodejs              16-rapid-x86_64                               d404fddbbd23   12 hours ago    511MB
public.ecr.aws/lambda/nodejs              14-rapid-x86_64                               3134ec9a0473   18 hours ago    512MB
public.ecr.aws/lambda/nodejs              18-x86_64                                     157b33bad6c1   2 weeks ago     634MB
public.ecr.aws/lambda/nodejs              14-x86_64                                     77509dc275e9   2 weeks ago     496MB
public.ecr.aws/lambda/nodejs              16-x86_64                                     524865febf0d   2 weeks ago     494MB
public.ecr.aws/lambda/nodejs              16-arm64                                      1b62d62cc3ed   2 weeks ago     526MB

I then added that property you suggested to my launch config for the initial sample app, but I still get the same error. List of images was still the same afterward.

I ran docker image history 524865febf0d and get this.

IMAGE          CREATED       CREATED BY                                      SIZE      COMMENT
524865febf0d   2 weeks ago   ENTRYPOINT [ "/lambda-entrypoint.sh" ]          0B
<missing>      2 weeks ago   ENV LAMBDA_RUNTIME_DIR=/var/runtime             0B
<missing>      2 weeks ago   ENV LAMBDA_TASK_ROOT=/var/task                  0B
<missing>      2 weeks ago   ENV LD_LIBRARY_PATH=/var/lang/lib:/lib64:/us…   0B
<missing>      2 weeks ago   ENV PATH=/var/lang/bin:/usr/local/bin:/usr/b…   0B
<missing>      2 weeks ago   ENV TZ=:/etc/localtime                          0B
<missing>      2 weeks ago   ENV LANG=en_US.UTF-8                            0B
<missing>      2 weeks ago   WORKDIR /var/task                               0B
<missing>      2 weeks ago   ADD file:1765a43cac3f8326f5fab27f2880de29656…   83.4MB
<missing>      2 weeks ago   ADD file:6418cb3069c3137bc029e0d26b93f40c753…   101MB
<missing>      2 weeks ago   ADD file:072730bdc14dcc7275f5c0d3308c68c0c48…   5.59MB
<missing>      2 weeks ago   ADD file:60510b91104c0ab3eb7b7c9625db9f55b8a…   397B
<missing>      2 weeks ago   ADD file:3a21bd334604d6853a238d3103837cdd5b9…   548kB
<missing>      2 weeks ago   ADD file:147bb26b86caae7aa223050d5eba8ece7ed…   303MB
<missing>      2 weeks ago   ARCHITECTURE amd64                              0B

I see that the image id of my 16-rapid-x86_64 image is different from the one you have on your system, if that matters. 🙂

Thanks!

ryancabanas commented 1 year ago

Sorry I didn't think to mention this before (I don't know why it didn't occur to me, honestly 🙁 ), but I'm using Rancher Desktop, instead of Docker Desktop. Is this possibly an/the issue? Using Rancher is a work requirement, at the moment. Thanks.

justinmk3 commented 1 year ago

As long as Rancher provides the Docker HTTP API, I would expect sam to be happy.

ryancabanas commented 1 year ago

Thank you! Let me check out those links and report back. Thanks again!

ryancabanas commented 1 year ago

I looked at the links and didn't find anything that seemed to fit my scenario. In both posts, the concerns that I saw mentioned were regarding Python Lambdas and the requirements.txt file and not Node.js. 🙁

Anything else I should check, or info I could send, that could help? Thanks!

justinmk3 commented 1 year ago

Are you able to install Docker Desktop temporarily, to see if it works?

brew install --cask docker

As a workaround, sam debugging works if you connect to a "Dev Environment" on https://codecatalyst.aws . Or if you use vscode remote-ssh to connect to some other remote linux environment.

ryancabanas commented 1 year ago

Thanks again.

I'll have to look into the latter 2 suggestions you provided. I'm not familiar with them.

As for testing out Docker Desktop temporarily, I can sure try that.

For all of these, though, it will probably be a while (at least a few weeks) before I can try them out, as I'm about to go on vacation for 2.5 weeks and have some things I need to finish up first. 🙂 I will be sure to follow up.

Thanks!

justinmk3 commented 1 year ago

Hint from https://github.com/aws/aws-sam-cli/issues/3595#issuecomment-1548712859 :

For the rancher users ...

export DOCKER_HOST="unix://$HOME/.rd/docker.sock"

Or ensure that admin mode in Rancher is turned on (Settings > General > Administrator Access)
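Before exporting DOCKER_HOST it may help to confirm that the socket actually exists. The sketch below assumes a POSIX shell; the helper name use_docker_socket is hypothetical, and the Rancher path is the one quoted in the hint.

```shell
# Export DOCKER_HOST for the given path, but only if that path is a Unix socket.
# $1 = absolute path to the Docker-compatible socket
use_docker_socket() {
  if [ -S "$1" ]; then
    export DOCKER_HOST="unix://$1"
  else
    echo "no socket at $1" >&2
    return 1
  fi
}
```

Usage: use_docker_socket "$HOME/.rd/docker.sock". If it fails, the socket is probably somewhere else and the export alone would not help.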

ryancabanas commented 1 year ago

Thanks again, @justinmk3 !

Yes. I do have "Administrative Access" turned on and I'm running the latest Rancher Desktop 1.8.1. I just re-downloaded and installed Rancher Desktop 1.8.1 again, just for kicks, but I still run into the same problem when attempting to debug.

I also went ahead and installed Docker Desktop 4.19.0 and I was successfully able to run the debugger in VS Code, so it's looking like things aren't playing well with Rancher Desktop.

Thank you!

justinmk3 commented 1 year ago

Another possible fix https://github.com/aws/aws-sam-cli/issues/3595#issuecomment-1575553210 :

To make sam CLI work with Rancher Desktop, you can add the following to the override config file (on Mac it is located under ~/Library/Application\ Support/rancher-desktop/lima/_config/override.yaml)

mountType: 9p
mounts:
  - location: "/private/var/folders/"
    9p:
      securityModel: mapped-xattr
      cache: "mmap"

The mount is for the build folder.

ryancabanas commented 1 year ago

@justinmk3

Thanks. Using the exact code above didn't work for me on macOS. I get the same error when attempting to debug.

  • you will need to verify what sam is trying to mount to the container, on mac it's /private/var/folders/. Found in the output of the sam local command.

Can you tell me how to find this information, please? I'm looking in the OUTPUT tab in VS Code when I run the debugger, correct? What string should I be searching for? I can then update the code of the override.yaml file and see if that works.

Thanks!

justinmk3 commented 1 year ago

After trying to debug, in the AWS Toolkit output or logs, look for a --build-dir line like:

2023-06-06 12:07:03 [VERBOSE]: running: (not started) [/opt/homebrew/bin/sam build --debug --build-dir /tmp/aws-toolkit-vscode/vsctkbBTWrr/output ...

However that varies for each run 🤔 (Edit: can use the sam.buildDir launch config field: https://github.com/aws/aws-toolkit-vscode/pull/2061 )
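Since the directory changes on every run, pulling it out of the log by hand gets tedious. A small sketch, assuming the log line has the shape shown above; extract_build_dir is a hypothetical helper, not part of the Toolkit or sam.

```shell
# Read log lines on stdin and print the value following --build-dir, if any.
extract_build_dir() {
  sed -n 's/.*--build-dir[ =]\([^ ]*\).*/\1/p'
}
```

Paste or pipe the Toolkit log into extract_build_dir to get just the path that sam build was given.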

ryancabanas commented 1 year ago

@justinmk3

Thanks again! Found that line, but, yeah it seems like the second to last directory in the path is different on each debug run.

2023-06-08 14:58:55 [INFO]: Command: (not started) [/opt/homebrew/bin/sam build --debug --build-dir /tmp/aws-toolkit-vscode/vsctkfSA52W/output ...

justinmk3 commented 1 year ago

You can control the build dir via the sam.buildDir launch config field: https://github.com/aws/aws-toolkit-vscode/pull/2061

ryancabanas commented 1 year ago

Okay. I went ahead and set the sam.buildDir value in my debug config to:

    "sam": {
         "buildDir": "/tmp/aws-toolkit-vscode/rancher-desktop/"
    }

And I set up my override.yaml to:

mountType: 9p
mounts:
  - location: "/tmp/aws-toolkit-vscode/rancher-desktop/"
    9p:
      securityModel: mapped-xattr
      cache: "mmap"

I restarted Rancher Desktop and tried debugging again, but got the same error. I can see my updated build directory in the debug output, so it's successfully reading from my debug config change, but still no dice.

2023-06-08 17:12:40 [INFO]: Command: (not started) [/opt/homebrew/bin/sam build --debug --build-dir /tmp/aws-toolkit-vscode/rancher-desktop/output ...

Thanks.
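One way to separate a build problem from a mount problem is to check on the host whether the handler file was actually built. This is a sketch under an assumption: that sam build writes each function to <build-dir>/<LogicalId>/, as it does for the default .aws-sam/build. The helper name handler_built is hypothetical.

```shell
# Succeed if the handler module exists in the build output on the host.
# $1 = directory passed to --build-dir, $2 = function logical ID, $3 = handler module name
handler_built() {
  [ -f "$1/$2/$3.js" ] || [ -f "$1/$2/$3.mjs" ]
}
```

For the config above that would be handler_built /tmp/aws-toolkit-vscode/rancher-desktop/output HelloWorldFunction app. If it succeeds while the debugger still reports "Cannot find module 'app'", the file exists on the host and the container is likely not seeing it.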

justinmk3 commented 1 year ago

Maybe the override.yaml needs the .../output part of the path?

ryancabanas commented 1 year ago

That didn't seem to do it, unfortunately. My override.yaml looks like this now:

mountType: 9p
mounts:
  - location: "/tmp/aws-toolkit-vscode/rancher-desktop/output"
    9p:
      securityModel: mapped-xattr
      cache: "mmap"

But I still get the same error. 🙁

ryancabanas commented 1 year ago

Hi @justinmk3,

I have good news to report. I was able to get debugging the API working! 🥳

Without an override.yaml file present, I set the sam.buildDir property of my debug config to .aws-sam/build. Just some folder that would be local to the project, but this seemed like a good directory. When I did this, the debugger successfully stopped at my set breakpoint!

(Note that I did upgrade to Rancher Desktop 1.9.1, but without sam.buildDir set up this way, debugging still failed.)

So, does this seem like possibly a permissions issue?

Glad I have it working once again, though! Thanks for your help!

justinmk3 commented 1 year ago

So, does this seem like possibly a permissions issue?

Not sure, but this adds another reason to eliminate use of the temporary directory anyway: https://github.com/aws/aws-toolkit-vscode/issues/2050
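The thread does not settle whether this is a permissions issue or a directory-sharing issue. A possible way to test it, sketched here and not verified against Rancher Desktop: bind-mount a host directory into the Lambda base image and list what the container sees. The helper name list_in_container is hypothetical; the image tag is the one shown earlier in this thread.

```shell
# List the contents of a host directory as seen from inside a container.
# $1 = absolute host directory, $2 = image to run
list_in_container() {
  docker run --rm --entrypoint ls -v "$1":/var/task:ro "$2" -A /var/task
}
```

Comparing list_in_container /tmp/aws-toolkit-vscode/rancher-desktop/output/HelloWorldFunction public.ecr.aws/lambda/nodejs:16-x86_64 with the same call on "$PWD/.aws-sam/build/HelloWorldFunction" would show whether the temporary directory arrives empty in the container while the project-local one does not.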

Thanks for the followup!

justinmk3 commented 1 year ago

Troubleshooting steps mentioned in https://github.com/aws/aws-sam-cli/issues/5646#issuecomment-1658320645 :

  1. Use docker context inspect to find the socket location
  2. Export DOCKER_HOST (and start vscode from the same terminal to ensure that it inherits the DOCKER_HOST environment variable)
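The two steps above can be sketched as a single helper. It assumes the active Docker context exposes its endpoint as .Endpoints.docker.Host, which is what docker context inspect reports for standard contexts; the function name docker_host_from_context is hypothetical.

```shell
# Print the endpoint (for example a unix:// socket URL) of the current Docker context.
docker_host_from_context() {
  docker context inspect --format '{{.Endpoints.docker.Host}}'
}
```

Usage: run export DOCKER_HOST="$(docker_host_from_context)" and then start VS Code from that same terminal (for example with code .) so that it inherits the variable.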
