appleboy / lambda-action

GitHub Action for Deploying Lambda code to an existing function
https://github.com/marketplace/actions/aws-lambda-deploy
MIT License

ResourceNotReady with 0.1.7 #58

Closed marchof closed 1 year ago

marchof commented 1 year ago

Hi, after I saw #46 marked as fixed I tried the latest version 0.1.7.

Now the action hangs for several minutes and fails with the following output:

2023/04/01 10:27:10 ResourceNotReady: exceeded wait attempts
2023/04/01 10:27:10 ResourceNotReady: exceeded wait attempts

Anything I can do from my side to debug or fix this?

appleboy commented 1 year ago

I will take it.

appleboy commented 1 year ago

@marchof Can you show detailed information on how to reproduce the problem?

marchof commented 1 year ago

Hi @appleboy, it is this action: https://github.com/marchof/io.javaalmanac.sandbox/blob/master/.github/workflows/cd.yml#L114

It fails as soon as I use the 0.1.7 tag. This is the effective configuration which is printed to the build log:

  with:
    aws_access_key_id: ***
    aws_secret_access_key: ***
    aws_region: ***
    function_name: jdk-sandbox-17
    image_uri: ***.dkr.ecr.***.amazonaws.com/javaalmanac/sandbox:lambda-latest-17
    publish: true
    memory_size: 0
    timeout: 0
  env:
    ECR_REPOSITORY: javaalmanac/sandbox
    LATEST_TAG: lambda-latest-17
    AWS_DEFAULT_REGION: ***
    AWS_REGION: ***
    AWS_ACCESS_KEY_ID: ***
    AWS_SECRET_ACCESS_KEY: ***
marchof commented 1 year ago

@appleboy Let me know if I can test or debug something.

appleboy commented 1 year ago

@marchof can you also help to try appleboy/lambda-action@v0.1.5 version?

lorenzopolidori commented 1 year ago

I am experiencing the same issue after upgrading to v0.1.7. The build hangs for about 5 minutes and then fails with a ResourceNotReady: exceeded wait attempts error. It works with v0.1.5.

- name: Deploy on AWS lambda
  uses: appleboy/lambda-action@v0.1.7
  with:
    aws_access_key_id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws_region: ${{ secrets.AWS_REGION }}
    zip_file: bundle.zip
    function_name: ***
    handler: src/serverless.handler
    memory_size: 256
    timeout: 10
aleon68 commented 1 year ago

@marchof can you also help to try appleboy/lambda-action@v0.1.5 version?

I tested with 0.1.5 and received the error ResourceConflictException: The operation cannot be performed at this time. An update is in progress

With 0.1.6 and 0.1.7 I get the error ResourceNotReady: exceeded wait attempts

appleboy commented 1 year ago

@aleon68 @lorenzopolidori @marchof Please help to try it out.

- uses: appleboy/lambda-action@v0.1.7
+ uses: appleboy/lambda-action@030df7b8106f9a2563919cf647b7aa7c5412a425
  with:
      aws_access_key_id: ${{ secrets.AWS_ACCESS_KEY_ID }}
      aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      aws_region: ${{ secrets.AWS_REGION }}
      zip_file: bundle.zip
      function_name: ***
      handler: src/serverless.handler
      memory_size: 256
      timeout: 10
+     max_attempts: 200

Thanks.

aleon68 commented 1 year ago

I tried again with:

uses: appleboy/lambda-action@030df7b8106f9a2563919cf647b7aa7c5412a425
with:
  ......
  max_attempts: 200

But after more than 6 minutes it is still running.

appleboy commented 1 year ago

The default delay is five seconds per attempt, so CI will fail after 200 * 5 seconds.
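The arithmetic behind that statement can be sketched as follows. This is only an illustration: the five-second delay comes from the comment above, while the function names `max_wait_seconds` and `format_duration` are hypothetical helpers and not part of the action.

```python
def max_wait_seconds(max_attempts: int, delay_seconds: int = 5) -> int:
    """Upper bound on polling time: number of attempts times the delay between them."""
    if max_attempts < 0 or delay_seconds < 0:
        raise ValueError("max_attempts and delay_seconds must be non-negative")
    return max_attempts * delay_seconds


def format_duration(seconds: int) -> str:
    """Render a number of seconds as minutes and seconds."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs:02d}s"


if __name__ == "__main__":
    # The max_attempts values tried in this thread, assuming a 5 second delay.
    for attempts in (10, 200, 500):
        total = max_wait_seconds(attempts)
        print(f"max_attempts={attempts}: about {total} s ({format_duration(total)})")
```

The run times reported further down in this thread (47 seconds for 10 attempts, 16 min 45 sec for 200) are close to these upper bounds, which suggests the waiter ran through every attempt before giving up.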

aleon68 commented 1 year ago

This is my yml:

uses: appleboy/lambda-action@030df7b8106f9a2563919cf647b7aa7c5412a425
with:
  aws_access_key_id: ${{ env.AWS_ACCESS }}
  aws_secret_access_key: ${{ env.AWS_SECRET }}
  aws_region: ${{ env.AWS_REGION }}
  s3_bucket: ${{ env.AWS_S3_BUILD }}
  function_name: ${{ matrix.lambda-name }}
  zip_file: ${{ matrix.path-file }}/zipFile.zip
  handler: ${{ matrix.lambda-name }}
  description: ...
  environment: ...
  timeout: 90
  memory_size: 512
  runtime: go1.x
  max_attempts: 10

This fails after 2 min:

2023/04/02 04:16:24 ResourceNotReady: exceeded wait attempts
aleon68 commented 1 year ago

Sorry, not 2 min, 47 seconds on deploy

appleboy commented 1 year ago

Please update the max_attempts to 200 until the resource ready for an update

appleboy commented 1 year ago

@aleon68 I will check the AWS timeout issue.

aleon68 commented 1 year ago

Please update the max_attempts to 200 until the resource ready for an update

Yeah, I tried recently, and after 16 min 45 sec same fail

aleon68 commented 1 year ago

@aleon68 I will check the AWS timeout issue.

Thanks

appleboy commented 1 year ago

@aleon68

Yeah, I tried recently, and after 16 min 45 sec same fail

I think 1005 seconds is the expected timeout for that setting. Can you update max_attempts to 500 and test again? In the many posts I have read, the usual fix is to raise max_attempts above the default value of 60.

reference: https://github.com/hashicorp/packer/issues/6177

aleon68 commented 1 year ago

Let me try

aleon68 commented 1 year ago

Same result after 42m 18s:

2023/04/02 05:52:16 ResourceNotReady: exceeded wait attempts
2023/04/02 05:52:16 ResourceNotReady: exceeded wait attempts

appleboy commented 1 year ago

@aleon68 I found the troubleshooting guide:

ResourceNotReadyException

Lambda reclaims network interfaces that aren't in use. This action can place a function in an inactive state. When a function that is inactive is invoked, the function enters a pending state while VPC network access is restored. The first invocation and all others that occur while the function is in a pending state fail and then produce a ResourceNotReadyException error.

To resolve the error, wait until the VPC connection is restored. Then, invoke the Lambda function again.

See https://repost.aws/knowledge-center/lambda-troubleshoot-invoke-error-502-500

aleon68 commented 1 year ago

OK, I understand, but how can I solve this? All my Lambdas are in a VPC. Do I need to remove the VPC before updating?

aleon68 commented 1 year ago

@appleboy this error is related to invoking Lambdas, not to updating them, or am I wrong?

appleboy commented 1 year ago

@aleon68 To avoid the problem, we need to check that the Lambda function's last update status is Successful, not Failed or InProgress, before updating the configuration again.
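A minimal sketch of that check, assuming a client that behaves like `boto3.client("lambda")`. The helper name `wait_for_successful_update` and its default values are hypothetical, and this is not the action's actual implementation (the action itself is not written in Python). `get_function_configuration` and the `LastUpdateStatus` field with its values `Successful`, `Failed`, and `InProgress` are part of the Lambda API.

```python
import time


def wait_for_successful_update(client, function_name, max_attempts=60,
                               delay_seconds=5, sleep=time.sleep):
    """Poll the function until its last update has finished.

    `client` is expected to behave like boto3.client("lambda"); only
    get_function_configuration is called on it. The defaults are
    placeholders chosen for illustration.
    """
    for attempt in range(1, max_attempts + 1):
        config = client.get_function_configuration(FunctionName=function_name)
        status = config.get("LastUpdateStatus")
        if status == "Successful":
            return config
        if status == "Failed":
            reason = config.get("LastUpdateStatusReason", "unknown reason")
            raise RuntimeError(f"last update failed: {reason}")
        # Still InProgress: pause before the next poll, but not after the last one.
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise TimeoutError(f"{function_name} not ready after {max_attempts} attempts")
```

Calling this before each update request (code or configuration) should avoid sending a second update while the first is still in progress. boto3 also ships Lambda waiters such as `function_updated`, which accept `Delay` and `MaxAttempts` through `WaiterConfig`, so a hand-written loop like this is mainly useful for logging the intermediate status.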

appleboy commented 1 year ago

I will try to reproduce the problem.

appleboy commented 1 year ago

I can reproduce the following error:

2023/04/02 14:26:19 ResourceConflictException ResourceConflictException: The operation cannot be performed at this time. An update is in progress for resource: arn:aws:lambda:ap-southeast-1:502946233425:function:gorush
{
  RespMetadata: {
    StatusCode: 409,
    RequestID: "e6445013-8af6-4586-ba28-68a09bb235a6"
  },
  Message: "The operation cannot be performed at this time. An update is in progress for resource: arn:aws:lambda:ap-southeast-1:502946233425:function:gorush",
  Type: "User"
}
2023/04/02 14:26:19 ResourceConflictException: The operation cannot be performed at this time. An update is in progress for resource: arn:aws:lambda:ap-southeast-1:502946233425:function:gorush
{
  RespMetadata: {
    StatusCode: 409,
    RequestID: "e6445013-8af6-4586-ba28-68a09bb235a6"
  },
  Message: "The operation cannot be performed at this time. An update is in progress for resource: arn:aws:lambda:ap-southeast-1:502946233425:function:gorush",
  Type: "User"
}

but I can't reproduce the ResourceNotReady error.

appleboy commented 1 year ago

@aleon68 Please help to try the following version again.

update correct path

appleboy/lambda-action@2ec8254c30163468edbb35fc776836c6b12494ef
aleon68 commented 1 year ago

Ok, I'll try

aleon68 commented 1 year ago

Is this the correct hash?

unable to find version 02ec8254c30163468edbb35fc776836c6b12494ef

appleboy commented 1 year ago

@aleon68 let me try it.

appleboy commented 1 year ago

@aleon68

appleboy/lambda-action@2ec8254c30163468edbb35fc776836c6b12494ef
aleon68 commented 1 year ago

Ok, I'll try

aleon68 commented 1 year ago

It's working now!!!!!!

Thanks a lot @appleboy

appleboy commented 1 year ago

@aleon68 I will bump the new version later. Thanks for helping with the testing.

aleon68 commented 1 year ago

@aleon68 I will bump the new version later. Thanks for helping with the testing.

Thanks to you for the support

appleboy commented 1 year ago

Bump to new version https://github.com/appleboy/lambda-action/releases/tag/v0.1.8

marchof commented 1 year ago

Looks like v0.1.8 does not solve the problem for me:

2023/04/02 12:54:36 ResourceNotReady: exceeded wait attempts
2023/04/02 12:54:36 ResourceNotReady: exceeded wait attempts

My configuration is:

 with:
    aws_access_key_id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws_region: ${{ secrets.AWS_REGION }}
    function_name: xxx
    image_uri: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:${{ env.LATEST_TAG }}

The last version which works for me is 1e05c1377056f21ebb2ce69b910bc16b943c2a66

aleon68 commented 1 year ago

Bump to new version https://github.com/appleboy/lambda-action/releases/tag/v0.1.8

With v0.1.8 I get the previous error again:

ResourceConflictException: The operation cannot be performed at this time. An update is in progress for resource: ***

And with release 2ec8254c30163468edbb35fc776836c6b12494ef I now get the same error (ResourceConflictException) using exactly the same deployment. This release worked before but no longer does.

appleboy commented 1 year ago

@aleon68 @marchof

Please help to try the following commit sha:

appleboy/lambda-action@390dab2546e6c97ca3b94fc5f3863d0e15bec0ee

Post the logs:

2023/04/03 03:07:36 Current State: Active
2023/04/03 03:07:36 Last Update Status: InProgress
2023/04/03 03:07:36 Last Update Status Reason: The function is being created.
2023/04/03 03:07:36 Last Update Status ReasonCode: Creating
aleon68 commented 1 year ago

@appleboy I tried it, but it failed the same way.

appleboy commented 1 year ago

@aleon68 try again. I have updated the new version.

aleon68 commented 1 year ago

Ok, I will try

appleboy commented 1 year ago

You will see the log below

image

aleon68 commented 1 year ago

Yeah, I see all logs now:

Run appleboy/lambda-action@390dab2546e6c97ca3b94fc5f3863d0e15bec0ee
2023/04/03 04:25:08 Update function configuration ...
2023/04/03 04:25:08 Current State: Active
2023/04/03 04:25:08 Last Update Status: Successful
2023/04/03 04:25:09 Update function code ...
2023/04/03 04:25:09 Current State: Active
2023/04/03 04:25:09 Last Update Status: InProgress
2023/04/03 04:25:09 Last Update Status Reason: The function is being created.
2023/04/03 04:25:09 Last Update Status ReasonCode: Creating
2023/04/03 04:25:09 Waiting Last Update Status to be successful ...

And it works fine!!!!

aleon68 commented 1 year ago

Thanks again @appleboy

appleboy commented 1 year ago

@marchof Please help to try it out. Waiting for your response. Thanks.

marchof commented 1 year ago

@appleboy Sorry for the late answer. The result with appleboy/lambda-action@390dab2546e6c97ca3b94fc5f3863d0e15bec0ee is:

2023/04/03 09:52:25 Update function configuration ...
2023/04/03 09:52:26 AccessDeniedException: User: arn:aws:iam::***:user/javaalmanac-ecr-upload is not authorized to perform: lambda:GetFunctionConfiguration on resource: arn:aws:lambda:***:***:function:jdk-sandbox-16 because no identity-based policy allows the lambda:GetFunctionConfiguration action
    status code: 403, request id: dfe87caf-3775-4348-bf42-e5efb6d09470

I assume additional permissions are now required. Will add them.
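For illustration, an identity-based policy covering the missing permission could be generated like this. Only `lambda:GetFunctionConfiguration` is confirmed by the error message above. The other two actions are an assumption about what a code and configuration update needs, and the ARN and the helper name `build_deploy_policy` are hypothetical placeholders.

```python
import json


def build_deploy_policy(function_arn):
    """Return an IAM policy document that allows updating one Lambda function."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    # Named in the AccessDeniedException above.
                    "lambda:GetFunctionConfiguration",
                    # Assumed: needed for the update calls themselves.
                    "lambda:UpdateFunctionCode",
                    "lambda:UpdateFunctionConfiguration",
                ],
                "Resource": function_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN; replace the upper-case parts with real values.
    arn = "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME"
    print(json.dumps(build_deploy_policy(arn), indent=2))
```

Other inputs may require further actions (for example, publishing a version), so the action's README remains the place to check for the full list.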

appleboy commented 1 year ago

@marchof Thanks for the reminder. I will update the readme.

marchof commented 1 year ago

I tried appleboy/lambda-action@390dab2546e6c97ca3b94fc5f3863d0e15bec0ee again now with the additional permission lambda:GetFunctionConfiguration. It fails after a bit more than 5 minutes with:

2023/04/03 10:06:50 Update function configuration ...
2023/04/03 10:06:51 Current State: Active
2023/04/03 10:06:51 Last Update Status: Successful
2023/04/03 10:06:52 Update function code ...
2023/04/03 10:06:52 Current State: Active
2023/04/03 10:06:52 Last Update Status: InProgress
2023/04/03 10:06:52 Last Update Status Reason: The function is being created.
2023/04/03 10:06:52 Last Update Status ReasonCode: Creating
2023/04/03 10:06:52 Waiting Last Update Status to be successful ...
2023/04/03 10:12:21 ResourceNotReady: exceeded wait attempts
2023/04/03 10:12:21 ResourceNotReady: exceeded wait attempts

Maybe I should mention that my action tries to update an existing function.

appleboy commented 1 year ago

@marchof Can you update the max_attempts to 600 or more to wait for the Last Update Status to be successful?

PS. The value is in seconds, so 600 means 600 seconds.

marchof commented 1 year ago

I tried

max_attempts: 1000

Now the same failure happens after 18min:

2023/04/03 10:21:47 Update function configuration ...
2023/04/03 10:21:47 Current State: Active
2023/04/03 10:21:47 Last Update Status: Successful
2023/04/03 10:21:48 Update function code ...
2023/04/03 10:21:48 Current State: Active
2023/04/03 10:21:48 Last Update Status: InProgress
2023/04/03 10:21:48 Last Update Status Reason: The function is being created.
2023/04/03 10:21:48 Last Update Status ReasonCode: Creating
2023/04/03 10:21:48 Waiting Last Update Status to be successful ...
2023/04/03 10:40:09 ResourceNotReady: exceeded wait attempts
2023/04/03 10:40:09 ResourceNotReady: exceeded wait attempts
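To compare the actual wait with the configured max_attempts, the elapsed time can be read off the log timestamps. A small sketch, assuming every log line starts with a `YYYY/MM/DD HH:MM:SS` timestamp as in the output above (the helper name `elapsed_seconds` is hypothetical):

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y/%m/%d %H:%M:%S"
TIMESTAMP_LENGTH = 19  # length of "2023/04/03 10:21:48"


def elapsed_seconds(start_line: str, end_line: str) -> int:
    """Seconds between the timestamps at the start of two log lines."""
    start = datetime.strptime(start_line[:TIMESTAMP_LENGTH], LOG_TIME_FORMAT)
    end = datetime.strptime(end_line[:TIMESTAMP_LENGTH], LOG_TIME_FORMAT)
    return int((end - start).total_seconds())


if __name__ == "__main__":
    waited = elapsed_seconds(
        "2023/04/03 10:21:48 Waiting Last Update Status to be successful ...",
        "2023/04/03 10:40:09 ResourceNotReady: exceeded wait attempts",
    )
    print(f"waited {waited} s ({waited // 60} min {waited % 60} s)")
```

For the run above this gives 1101 seconds (18 min 21 s), and the earlier run without max_attempts gives 329 seconds. Both waits ended with exceeded wait attempts, so the function never reported a Successful update status within the allowed time.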