aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

aws-eks: integ.eks-cluster integ test failing with import error #27302

Open kishiel opened 1 year ago

kishiel commented 1 year ago

Describe the bug

When executing the integ.eks-cluster test, the ProviderframeworkonEvent Lambda fails with the following error:

2023-09-26T20:01:22.589Z    undefined   ERROR   Uncaught Exception  
{
    "errorType": "Runtime.ImportModuleError",
    "errorMessage": "Error: Cannot find module './outbound'\nRequire stack:\n- /var/task/cfn-response.js\n- /var/task/framework.js\n- /var/runtime/index.mjs",
    "stack": [
        "Runtime.ImportModuleError: Error: Cannot find module './outbound'",
        "Require stack:",
        "- /var/task/cfn-response.js",
        "- /var/task/framework.js",
        "- /var/runtime/index.mjs",
        "    at _loadUserApp (file:///var/runtime/index.mjs:1061:17)",
        "    at async UserFunction.js.module.exports.load (file:///var/runtime/index.mjs:1093:21)",
        "    at async start (file:///var/runtime/index.mjs:1256:23)",
        "    at async file:///var/runtime/index.mjs:1262:1"
    ]
}

This also occurred when executing the ipv6 cluster test.
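For anyone reading the error above: in a Node.js "Cannot find module" error, the first entry of the require stack is the file that issued the failing `require`, so here `cfn-response.js` is the file looking for `./outbound`. The following is a minimal sketch of pulling those two pieces out of the `errorMessage` string. `parseImportModuleError` and `ImportErrorInfo` are hypothetical names used for illustration, not part of the CDK or the Lambda runtime.

```typescript
// Hypothetical helper for illustration; not a CDK or Lambda runtime API.
interface ImportErrorInfo {
  missingModule: string;  // the specifier that could not be resolved
  requireStack: string[]; // files in the require chain, innermost first
}

function parseImportModuleError(errorMessage: string): ImportErrorInfo | undefined {
  const match = /Cannot find module '([^']+)'/.exec(errorMessage);
  const missingModule = match ? match[1] : undefined;
  if (missingModule === undefined) {
    return undefined;
  }
  // Each require-stack entry is printed on its own line as "- <path>".
  const requireStack = errorMessage
    .split('\n')
    .filter((line) => line.startsWith('- '))
    .map((line) => line.slice(2).trim());
  return { missingModule, requireStack };
}

// The errorMessage value from the log above.
const info = parseImportModuleError(
  "Error: Cannot find module './outbound'\nRequire stack:\n- /var/task/cfn-response.js\n- /var/task/framework.js\n- /var/runtime/index.mjs",
);
console.log(info);
```

The first `requireStack` entry names the file whose sibling is missing from the deployed bundle, which is the place to start looking in the asset directory.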

Expected Behavior

I expected the cluster to be created and the test to run.

Current Behavior

Cluster creation fails and the stack rolls back, because the handler does not receive the expected response before the timeout. Afterwards, the stack has to be deleted and the cluster resource left behind (although no cluster actually exists).

Reproduction Steps

Change the name of the test (or make some other inert change) and execute:

yarn integ test/aws-eks/test/integ.eks-cluster.js --update-on-failed

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.42.0 (build 7d8ef0b)

Framework Version

No response

Node.js Version

v18.16.0

OS

macOS 13.5.2

Language

TypeScript

Language Version

No response

Other information

No response

kishiel commented 1 year ago

Working on this a bit more today. I've had problems in the past with snapshots having sticky assets, so for fun I've deleted the snapshot directory for this test and am running it again. I can see the cluster being created, so whatever this issue is, it's localized to the snapshot(s). I've got a few more tests to run so I can regenerate them and see if we can just close this issue.

kishiel commented 1 year ago

Deleting the existing snapshot and re-running it passed with no assertions. This makes debugging changes really expensive because it makes me doubt the existing snapshots are actually sane. Is there anything I can do apart from running the tests in advance of a change to help prevent this condition?

peterwoodworth commented 1 year ago

The version you are using is pretty old. Are you still running into this on latest version?

kishiel commented 1 year ago

> The version you are using is pretty old. Are you still running into this on latest version?

CDK version? Good question. Let me try upgrading to latest and see what happens.

github-actions[bot] commented 1 year ago

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

go-to-k commented 1 year ago

I also got the same error with integ tests such as cluster-inference, handlers-in-vpc, cluster-bottlerocket-ng, cluster-private-endpoint, etc... in the eks module.

My fork was at this commit, which is included in v2.99.0.

The outbound.js file may or may not be present in the committed asset directory.

Some tests failed even if there is the file in the snapshot directory. After deleting the existing snapshot directory, it seems they can sometimes be successful.
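One way to check whether a committed snapshot is missing a file like outbound.js is to scan its asset directories for relative `require` calls that do not resolve. The sketch below assumes the snapshot directory contains `asset.*` subdirectories with plain `.js` handler files at their top level; that layout, and the names `findMissingRequires` and `scanSnapshot`, are assumptions for illustration rather than anything the CDK provides. As noted above, some tests failed even with the file present, so a clean result here does not rule out the problem.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Matches require('./x') or require("../x"); bare module names are ignored.
const RELATIVE_REQUIRE = /require\(\s*['"](\.{1,2}\/[^'"]+)['"]\s*\)/;

// Returns the relative require() specifiers in `file` that do not resolve to a file.
function findMissingRequires(file: string): string[] {
  const source = fs.readFileSync(file, 'utf8');
  const pattern = new RegExp(RELATIVE_REQUIRE.source, 'g');
  const missing: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const spec = match[1];
    if (!spec) continue;
    const target = path.resolve(path.dirname(file), spec);
    // Simplified resolution: exact file, ".js" suffix, or a directory index.
    const candidates = [target, `${target}.js`, path.join(target, 'index.js')];
    const found = candidates.some((c) => fs.existsSync(c) && fs.statSync(c).isFile());
    if (!found) missing.push(spec);
  }
  return missing;
}

// Scans the top-level .js files of every asset.* directory under snapshotDir
// (assumed layout) and maps each offending file to its unresolved specifiers.
function scanSnapshot(snapshotDir: string): Record<string, string[]> {
  const report: Record<string, string[]> = {};
  for (const entry of fs.readdirSync(snapshotDir, { withFileTypes: true })) {
    if (!entry.isDirectory() || !entry.name.startsWith('asset.')) continue;
    const assetDir = path.join(snapshotDir, entry.name);
    for (const name of fs.readdirSync(assetDir)) {
      const file = path.join(assetDir, name);
      if (!name.endsWith('.js') || !fs.statSync(file).isFile()) continue;
      const missing = findMissingRequires(file);
      if (missing.length > 0) report[path.join(entry.name, name)] = missing;
    }
  }
  return report;
}
```

Calling `scanSnapshot` with the path to a test's snapshot directory returns an empty object when every relative require resolves. A non-empty result lists each file next to the specifiers it cannot find.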