spinnaker-plugins / aws-lambda-deployment-plugin-spinnaker

Spinnaker plugin to support deployment of AWS Lambda functions via Spinnaker pipelines
Apache License 2.0

Event Source Mapping fails after 1st run #81

Closed: eyalle closed this issue 3 years ago

eyalle commented 3 years ago

When adding an ARN for Event Source Mapping, the first run creates the function and attaches the Event Source Mapping ARNs successfully, but from the 2nd run on, each run fails with the same error:

Exception ( Lambda Verification Task )

The event source arn (" arn:aws:kinesis:us-west-2:123456789123:stream/somDataStream ") and function (" some-function-name ") provided mapping already exists. Please update or delete the existing mapping with UUID e4f3d3d3-d3d3-43d3-3d3d-f3d3d3d3d3d3 (Service: AWSLambda; Status Code: 409; Error Code: ResourceConflictException; Request ID: 93131313-1313-1313-1313-131313131313; Proxy: null)

The event source arn (" arn:aws:kinesis:us-west-2:123456789123:stream/somDataStream2 ") and function (" some-function-name ") provided mapping already exists. Please update or delete the existing mapping with UUID c6c4c4c4-c3c4-c4c4-c4c4-c4c4c4c4c4c4 (Service: AWSLambda; Status Code: 409; Error Code: ResourceConflictException; Request ID: 1243aaaa-1cc1-44vv-v4v4-3816a8857e1e; Proxy: null)
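
For illustration only: the error text above names the conflicting event source ARN, the function, and the UUID of the existing mapping. A minimal Python sketch that pulls those three values out of such a message might look like the following. The helper name `parse_conflicts` and the regular expression are assumptions based solely on the wording quoted here; the message format may differ across AWS SDK versions.

```python
import re

# Assumption: the pattern mirrors the wording of the error quoted in this issue.
# It tolerates optional whitespace inside the quotes around the ARN and name.
CONFLICT_RE = re.compile(
    r'The event source arn \("\s*(?P<arn>[^"\s]+)\s*"\)\s+and function '
    r'\("\s*(?P<function>[^"\s]+)\s*"\)\s+provided mapping already exists\.'
    r'.*?UUID\s+(?P<uuid>[0-9a-fA-F-]{36})'
)


def parse_conflicts(message):
    """Return one dict (arn, function, uuid) per conflict found in message."""
    return [match.groupdict() for match in CONFLICT_RE.finditer(message)]
```

The extracted UUID is the identifier the error asks you to update or delete, so it could be passed to an update or delete call instead of retrying the create.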

[Screenshots: successful_first_run, rerun_failure]


Reproducing

  1. Create a Lambda using the plugin, with one or several Event Source Mapping ARNs
  2. Re-run the same pipeline with the same configuration

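
A rerun like step 2 only succeeds if attaching the mapping is idempotent. As a rough sketch of that idea (not the plugin's actual code), the Python below looks up an existing mapping first and updates it instead of creating it again. It assumes `client` behaves like a boto3 Lambda client (`boto3.client("lambda")`); the helper name `ensure_event_source_mapping` and its parameters are hypothetical.

```python
def ensure_event_source_mapping(client, function_name, event_source_arn,
                                batch_size=10, starting_position=None):
    """Create the mapping if it is absent, otherwise update it in place.

    `client` is assumed to expose the boto3 Lambda client methods
    list_event_source_mappings, create_event_source_mapping and
    update_event_source_mapping. It is passed in so this sketch does
    not import boto3 itself.
    """
    # Look for a mapping between this function and this event source.
    # (Pagination is ignored here for brevity.)
    existing = client.list_event_source_mappings(
        FunctionName=function_name,
        EventSourceArn=event_source_arn,
    ).get("EventSourceMappings", [])

    if existing:
        # A mapping already exists: update it rather than triggering a 409.
        uuid = existing[0]["UUID"]
        client.update_event_source_mapping(UUID=uuid, BatchSize=batch_size)
        return "updated", uuid

    params = {
        "FunctionName": function_name,
        "EventSourceArn": event_source_arn,
        "BatchSize": batch_size,
    }
    # Stream sources such as Kinesis need a starting position on create,
    # for example "LATEST"; it is only sent when the caller provides one.
    if starting_position is not None:
        params["StartingPosition"] = starting_position
    created = client.create_event_source_mapping(**params)
    return "created", created["UUID"]
```

Calling the helper twice with the same function and ARN would return `("created", uuid)` the first time and `("updated", uuid)` the second, rather than failing on the second run.
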
nimakaviani commented 3 years ago

This requires better triaging.

I tested a Lambda configuration both with SQS and Kinesis Data Streams, with one event source, with v1.0.6 of the plugin and couldn't reproduce the issue.

coleduclos commented 3 years ago

@nimakaviani I've been partnering with @eyalle on this.

From your screenshots, it appears that you are configuring Kinesis data streams. is that correct?

Yes, we've tried both Kinesis streams and DynamoDB streams. Both have the same issue.

Which version of the plugin are you using?

1.0.6

Which task is failing?

Lambda Event Configuration Task -- this smells a lot like #70 because the task ran for exactly 100 seconds (similar to the results we saw for #70).

What does the configuration of your Lambda deployment look like?

    {
      "account": "${parameters[\"account\"]}",
      "batchsize": 10,
      "cloudProvider": "aws",
      "deadLetterConfig": {
        "targetArn": ""
      },
      "detailName": "",
      "enableLambdaAtEdge": false,
      "envVariables": {},
      "functionName": "${execution.application}-${parameters[\"lambda_function_name\"]}",
      "functionUid": "${parameters[\"lambda_function_name\"]}",
      "handler": "${parameters[\"lambda_handler\"]}",
      "memorySize": 128,
      "name": "Deploy Spinnaker Events Lambda",
      "publish": true,
      "refId": "6",
      "region": "${parameters[\"aws_region\"]}",
      "requisiteStageRefIds": [
        "5"
      ],
      "reservedConcurrentExecutions": 10,
      "role": "${parameters[\"lambda_role_arn\"]}",
      "runtime": "${parameters[\"lambda_runtime\"]}",
      "s3bucket": "${parameters[\"lambda_s3_bucket\"]}",
      "s3key": "${parameters[\"lambda_s3_key\"]}",
      "securityGroupIds": [],
      "stackName": "${parameters[\"environment\"]}",
      "subnetIds": [],
      "tags": {
        "purpose": "Lambda Deployment"
      },
      "timeout": 90,
      "tracingConfig": {
        "mode": "PassThrough"
      },
      "triggerArns": [],
      "type": "Aws.LambdaDeploymentStage"
    }

Did you configure one or multiple event sources?

On separate occasions, I believe we've tried with both one and multiple event sources.

nimakaviani commented 3 years ago

hmmm interesting.

was it against us-east? I wonder if it is again a timeout issue? if so, the fix for #70 should have it covered as well potentially.

coleduclos commented 3 years ago

@nimakaviani unfortunately no, the errors highlighted in the screenshots above occurred in us-west-2

nimakaviani commented 3 years ago

let's see if the latest fix we put in helps there. we can cut a release and try it out. this one should be easier to deploy as it only requires bumping the plugin version.

nimakaviani commented 3 years ago

just in the process of releasing 1.0.7, so we can test things out. but this issue could be slightly different, since the issue says attaching event sources fails after the first run. Just checking @coleduclos if the issue you saw was related to 409s and timeouts or whether it had to do with running the pipelines and getting errors due to the existence of the mapping?

coleduclos commented 3 years ago

@nimakaviani Thank you for the quick response! My suspicion is that it is at least partially related to the timeouts. Our team should be testing the newest version of the plugin over the next day or two. Hopefully will have more information soon.

cc @eyalle

eyalle commented 3 years ago

I believe this was caused by caching issues as well; once a cache refresh was forced, these errors disappeared. We're still testing 1.0.7 (rolled back to 1.0.6) and other changes being made to the plugin's tasks.

eyalle commented 3 years ago

@nimakaviani haven't encountered it so far, closing the issue. Thank you 🙏