aws-amplify / amplify-cli

The AWS Amplify CLI is a toolchain for simplifying serverless web and mobile development.
Apache License 2.0

Make the CustomAuthTriggerResource timeout configurable #9837

Open nylltz opened 2 years ago

nylltz commented 2 years ago

Is this feature request related to a new or existing Amplify category?

auth, function

Is this related to another service?

No response

Describe the feature you'd like to request

My Amplify CLI version is 7.6.3. I experienced an issue similar to #9510. The custom resource CustomAuthTriggerResource that Amplify CLI generates for configuring Cognito user pool Lambda triggers comes with only a 3-second Timeout, which is often not enough to send CloudFormation a SUCCESS signal. When it times out due to network latency, the Amplify-generated CloudFormation stack (usually named "..AuthTriggerCustomLambdaStack") gets stuck in UPDATE_ROLLBACK_FAILED status after 1 hour. As a consequence, the user is unable to make any subsequent push using Amplify CLI.


In the template {PROJ_DIR}/amplify/backend/auth/test9656353151790fa62d/build/auth-trigger-cloudformation-template.json, we can see that this custom resource is used for configuring Lambda functions as Cognito user pool Lambda triggers:

    "CustomAuthTriggerResource": {
      "Type": "Custom::CustomAuthTriggerResourceOutputs",
      "Properties": {
        "ServiceToken": {
          "Fn::GetAtt": [
            "authTriggerFn7FCFA449",
            "Arn"
          ]
        },
        "userpoolId": {
          "Ref": "userpoolId"
        },
        "lambdaConfig": [
          {
            "triggerType": "PreSignUp",
            "lambdaFunctionName": "test9656353151790fa62dPreSignup",
            "lambdaFunctionArn": {
              "Ref": "functiontest9656353151790fa62dPreSignupArn"
            }
          },
          {
            "triggerType": "PreTokenGeneration",
            "lambdaFunctionName": "test9656353151790fa62dPreTokenGeneration",
            "lambdaFunctionArn": {
              "Ref": "functiontest9656353151790fa62dPreTokenGenerationArn"
            }
          }
        ]
      },
      "UpdateReplacePolicy": "Delete",
      "DeletionPolicy": "Delete"
    }

The real troublemaker is the referenced Lambda function authTriggerFn7FCFA449. The inline function receives the CloudFormation request and provides a SUCCESS | FAILED response. When Amplify CLI generates this inline function with cfn-response, it leaves the function at the default 3-second Timeout (the template below sets no Timeout property). The user has no chance to increase the timeout before running amplify push, other than manually changing the template file every time.

    "authTriggerFn7FCFA449": {
      "Type": "AWS::Lambda::Function",
      "Properties": {
        "Code": {
          "ZipFile": "const response = require('cfn-response');\nconst aws = require('aws-sdk');\n\nexports.handler = async function (event, context) {\n  try {\n    const userPoolId = event.ResourceProperties.userpoolId;\n    const lambdaConfig = event.ResourceProperties.lambdaConfig;\n    const config = {};\n    const cognitoClient = new aws.CognitoIdentityServiceProvider();\n    const userPoolConfig = await cognitoClient.describeUserPool({ UserPoolId: userPoolId }).promise();\n    const userPoolParams = userPoolConfig.UserPool;\n    // update userPool params\n\n    const updateUserPoolConfig = {\n      UserPoolId: userPoolParams.Id,\n      Policies: userPoolParams.Policies,\n      SmsVerificationMessage: userPoolParams.SmsVerificationMessage,\n      AccountRecoverySetting: userPoolParams.AccountRecoverySetting,\n      AdminCreateUserConfig: userPoolParams.AdminCreateUserConfig,\n      AutoVerifiedAttributes: userPoolParams.AutoVerifiedAttributes,\n      EmailConfiguration: userPoolParams.EmailConfiguration,\n      EmailVerificationMessage: userPoolParams.EmailVerificationMessage,\n      EmailVerificationSubject: userPoolParams.EmailVerificationSubject,\n      VerificationMessageTemplate: userPoolParams.VerificationMessageTemplate,\n      SmsAuthenticationMessage: userPoolParams.SmsAuthenticationMessage,\n      MfaConfiguration: userPoolParams.MfaConfiguration,\n      DeviceConfiguration: userPoolParams.DeviceConfiguration,\n      SmsConfiguration: userPoolParams.SmsConfiguration,\n      UserPoolTags: userPoolParams.UserPoolTags,\n      UserPoolAddOns: userPoolParams.UserPoolAddOns,\n    };\n\n    // removing undefined keys\n    Object.keys(updateUserPoolConfig).forEach(key => updateUserPoolConfig[key] === undefined && delete updateUserPoolConfig[key]);\n\n    /*removing UnusedAccountValidityDays as deprecated\n    InvalidParameterException: Please use TemporaryPasswordValidityDays in PasswordPolicy instead of UnusedAccountValidityDays\n    */\n    if (updateUserPoolConfig.AdminCreateUserConfig && updateUserPoolConfig.AdminCreateUserConfig.UnusedAccountValidityDays) {\n      delete updateUserPoolConfig.AdminCreateUserConfig.UnusedAccountValidityDays;\n    }\n\n    lambdaConfig.forEach(lambda => (config[`${lambda.triggerType}`] = lambda.lambdaFunctionArn));\n    if (event.RequestType == 'Delete') {\n      try {\n        updateUserPoolConfig.LambdaConfig = {};\n        const result = await cognitoClient.updateUserPool(updateUserPoolConfig).promise();\n        console.log('delete response data ' + JSON.stringify(result));\n        await response.send(event, context, response.SUCCESS, {});\n      } catch (err) {\n        console.log(err.stack);\n        await response.send(event, context, response.FAILED, { err });\n      }\n    }\n    if (event.RequestType == 'Update' || event.RequestType == 'Create') {\n      updateUserPoolConfig.LambdaConfig = config;\n      console.log(updateUserPoolConfig);\n      try {\n        const result = await cognitoClient.updateUserPool(updateUserPoolConfig).promise();\n        console.log('createOrUpdate response data ' + JSON.stringify(result));\n        await response.send(event, context, response.SUCCESS, { result });\n      } catch (err) {\n        console.log(err.stack);\n        await response.send(event, context, response.FAILED, { err });\n      }\n    }\n  } catch (err) {\n    console.log(err.stack);\n    await response.send(event, context, response.FAILED, { err });\n  }\n};\n"
        },
        "Role": {
          "Fn::GetAtt": [
            "authTriggerFnServiceRole08093B67",
            "Arn"
          ]
        },
        "Handler": "index.handler",
        "Runtime": "nodejs12.x"
      },
      "DependsOn": [
        "authTriggerFnServiceRoleDefaultPolicyEC9285A8",
        "authTriggerFnServiceRole08093B67"
      ]
    },

Reproduce the issue

To simulate the timeout issue, I can manually add the line await new Promise(resolve => setTimeout(resolve, 5000)); to the inline code above and make a push. The CloudFormation stack then gets stuck because it never receives a SUCCESS or FAILED signal, and it ends up in the UPDATE_ROLLBACK_FAILED state.
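The reasoning behind this reproduction can be sketched locally, without AWS. The snippet below is only a model under stated assumptions: `simulateInvocation`, `delayMs`, and `timeoutMs` are hypothetical names, the "handler" is a stand-in that waits and then records that it would have sent its response, and the Lambda runtime ending the invocation is approximated with `Promise.race`.

```javascript
// Local model only: no AWS calls are made.
// `timeoutMs` stands in for the Lambda Timeout setting and
// `delayMs` for the artificial delay added to the handler.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function simulateInvocation(delayMs, timeoutMs) {
  let responseSent = false;
  let timedOut = false;

  // Stand-in for the handler: wait, then "send" the SUCCESS signal.
  const handler = async () => {
    await sleep(delayMs);
    responseSent = true;
  };

  // Stand-in for the runtime ending the invocation at the timeout.
  const timeout = sleep(timeoutMs).then(() => {
    timedOut = true;
  });

  // Whichever finishes first decides the outcome; the values are
  // read at the moment the race settles.
  await Promise.race([handler(), timeout]);
  return { timedOut, responseSent };
}
```

With a delay longer than the timeout (for example 5000 ms against 3000 ms), the model reports that the invocation timed out before any response was sent, which is the situation that leaves CloudFormation waiting for a signal.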

Workaround

After researching this for a couple of days, I found a workaround to unblock the stuck CloudFormation stack.

  1. Add one line that logs the CloudFormation event at the beginning of the current Lambda function ...authTriggerFn7FCFA449 via the Lambda console. For example, console.log("REQUEST RECEIVED:\n" + JSON.stringify(event));
  2. Click the "Continue update rollback" button in the CloudFormation console, as the AWS documentation shows.
  3. Watch the CloudWatch log stream of the above Lambda function and note the S3 pre-signed URL (ResponseURL), PhysicalResourceId, StackId, RequestId, and LogicalResourceId. The output will look like:
    {
    "RequestType": "Update",
    "ServiceToken": "arn:aws:lambda:ap-southeast-2:1234567:function:amplify-test9656353151-dev-1-authTriggerFn7FCFA449-xyz",
    "ResponseURL": "https://cloudformation-custom-resource-response-apsoutheast2.s3-ap-southeast-2.amazonaws.com/arn%3Aaws%3Acloudformation%3Aap-sou...00d761385cd62013820dc780bc07d07e021f77",
    "StackId": "arn:aws:cloudformation:ap-southeast-2:1234567:stack/amplify-test9656353151-dev-165513-AuthTriggerCustomLambdaStack-VZC2CVEFA8GA/00876ee0-93a7-11ec-b1d1-xyz",
    "RequestId": "82bf2ca0-061e-4895-ad70-xyz",
    "LogicalResourceId": "CustomAuthTriggerResource",
    "PhysicalResourceId": "2022/02/22/[$LATEST]f72b84eb...54d7ef4",
    "ResourceType": "Custom::CustomAuthTriggerResourceOutputs",
    ...
    }
  4. Following this AWS documentation, manually construct a cURL SUCCESS request from the observed information. Note that the cURL destination URL should be the "ResponseURL" above. Since this is a custom resource, you cannot use the CloudFormation CLI signal-resource command to unblock it. The cURL request looks like:
    $ curl -H 'Content-Type: ''' -X PUT -d '{"Status": "SUCCESS","PhysicalResourceId": "2022/02/22/[$LATEST]f72b84eb...54d7ef4","StackId": "arn:aws:cloudformation:ap-southeast-2:1234567:stack/amplify-test9656353151-dev-165513-AuthTriggerCustomLambdaStack-VZC2CVEFA8GA/00876ee0-93a7-11ec-b1d1-xyz","RequestId": "82bf2ca0-061e-4895-ad70-xyz","LogicalResourceId": "CustomAuthTriggerResource"}' 'https://cloudformation-custom-resource-response-apsoutheast2.s3-ap-southeast-2.amazonaws.com/arn%3Aaws%3Acloudformation%3Aap-sou...00d761385cd62013820dc780bc07d07e021f77'
  5. Once you send the SUCCESS request, the CloudFormation stack will return to the UPDATE_ROLLBACK_COMPLETE state, and the user can push again.
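Steps 3 and 4 can also be sketched as a small Node.js script instead of a hand-built cURL command. This is a minimal sketch, not part of Amplify CLI: `buildResponseBody` and `sendResponse` are hypothetical helper names, and `event` is assumed to be the JSON object copied from the CloudWatch log in step 3. It sends the same fields as the cURL request, with an empty Content-Type header.

```javascript
const https = require('https');
const { URL } = require('url');

// Build the JSON body for the custom resource response from the
// fields of the logged request event. `reason` is optional and is
// omitted from the JSON when undefined.
function buildResponseBody(event, status, reason) {
  return JSON.stringify({
    Status: status, // "SUCCESS" or "FAILED"
    Reason: reason,
    PhysicalResourceId: event.PhysicalResourceId,
    StackId: event.StackId,
    RequestId: event.RequestId,
    LogicalResourceId: event.LogicalResourceId,
  });
}

// PUT the body to the pre-signed ResponseURL, mirroring the cURL
// command (empty Content-Type). Resolves with the HTTP status code.
function sendResponse(event, status, reason) {
  const body = buildResponseBody(event, status, reason);
  const url = new URL(event.ResponseURL);
  const options = {
    hostname: url.hostname,
    port: 443,
    path: url.pathname + url.search,
    method: 'PUT',
    headers: {
      'content-type': '',
      'content-length': Buffer.byteLength(body),
    },
  };
  return new Promise((resolve, reject) => {
    const req = https.request(options, (res) => {
      res.resume(); // discard the response body
      resolve(res.statusCode);
    });
    req.on('error', reject);
    req.end(body);
  });
}
```

Calling `sendResponse(loggedEvent, 'SUCCESS')` with the logged event would then play the role of the cURL request. The pre-signed URL has to be the full value from the log, not the shortened form shown above.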

Describe the solution you'd like

I believe the issue is actually caused by an Amplify CLI design defect, though I can use the above workaround to fix it. I know that I can increase the default Timeout setting by adding "Timeout": 300 to the template {PROJ_DIR}/amplify/backend/auth/test9656353151790fa62d/build/auth-trigger-cloudformation-template.json before amplify push

...
        "Handler": "index.handler",
        "Runtime": "nodejs12.x",
        "Timeout": 300

but Amplify CLI will roll back the above template, so the change cannot be saved. The ideal solution would be for Amplify CLI to provide an option, when it configures the Cognito Lambda trigger, that makes the Timeout configurable.
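For illustration, the manual template edit can be expressed as a short Node.js script. This is a sketch under assumptions: `setLambdaTimeout` and `patchFile` are hypothetical helpers, not Amplify CLI features, and the `authTriggerFn` prefix is taken from the logical ID in the template above. As described, Amplify CLI rolls the template back, so a patch like this would have to be re-applied before each push and may still be overwritten.

```javascript
const fs = require('fs');

// Set Timeout on every AWS::Lambda::Function resource whose logical
// ID starts with `prefix`. Returns the logical IDs that were patched.
function setLambdaTimeout(template, prefix, seconds) {
  const patched = [];
  for (const [logicalId, resource] of Object.entries(template.Resources || {})) {
    if (resource.Type === 'AWS::Lambda::Function' && logicalId.startsWith(prefix)) {
      resource.Properties = { ...resource.Properties, Timeout: seconds };
      patched.push(logicalId);
    }
  }
  return patched;
}

// Read a template file, patch it in place, and write it back.
function patchFile(file, seconds) {
  const template = JSON.parse(fs.readFileSync(file, 'utf8'));
  const patched = setLambdaTimeout(template, 'authTriggerFn', seconds);
  fs.writeFileSync(file, JSON.stringify(template, null, 2));
  return patched;
}
```

Filtering on the resource type as well as the prefix matters here, because the IAM role and policy in the same template also have logical IDs beginning with authTriggerFn.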

Describe alternatives you've considered

Alternatively, Amplify CLI could simply set a longer Timeout, e.g. 10 seconds, when it creates the Lambda function. The default Timeout of 3 seconds is too short to respond to CloudFormation.

Additional context

No response

Is this something that you'd be interested in working on?

Would this feature include a breaking change?

josefaidt commented 2 years ago

Hey @nylltz :wave: thanks for raising this! Unfortunately this is not currently supported, and this Lambda is not exposed to overrides, so auth overrides are not a viable workaround. Marking this as a feature request 🙂

nylltz commented 2 years ago

Hi @josefaidt, thank you for taking it as a feature request. May I know the roadmap for implementing it? Currently, my client has to update the Amplify-generated CFN template before pushing it to the cloud. However, as I said, the CFN template gets rolled back, so the change can't be saved.

josefaidt commented 2 years ago

Hey @nylltz absolutely, and unfortunately I do not have a defined timeline for this feature request. However, if we know the CloudFormation we want to add or override perhaps auth overrides would be a suitable solution.

artanisdesign commented 1 month ago

Honestly, it's quite annoying that this SDK still suffers from these issues even after 2 years. Why do we have to research day and night and figure out workarounds when we want to deploy one single Cognito Lambda trigger? Unbelievable.