aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(custom-resources): AwsCustomResource returns no data #25283

Open bobveringa opened 1 year ago

bobveringa commented 1 year ago

Describe the bug

To create certain resources, we use AWS Custom Resources to fetch details from the SDK. Upgrading to the latest version (2.76) broke all our custom resources that fetch data using the SDK. This only applies to resources using the AwsCustomResource construct.

When attempting to fetch the resources, the following error is returned from CDK.

CustomResource attribute error: Vendor response doesn't contain endpointAddress key in object

Failures occur on all platforms (Mac, Windows, Linux) including CDK Code Pipeline. The previous version 2.63 was working without issue.

Expected Behavior

Normal operation without breaking changes.

Current Behavior

Inspecting the Lambda return values in CloudWatch shows a difference in the returned response.

Return on 2.76

{
    "Status": "SUCCESS",
    "Reason": "OK",
    "PhysicalResourceId": "[HIDDEN]-ats.iot.eu-central-1.amazonaws.com",
    "StackId": "arn:aws:cloudformation:eu-central-1:[HIDDEN]:stack/iot/462a6010-82e4-11ed-ad4c-06513e2530cc",
    "RequestId": "4a7eed3c-296d-4472-aa2a-e367c08ac434",
    "LogicalResourceId": "IoTEndpoint9F0B923E",
    "NoEcho": false,
    "Data": {}
}

Return < 2.76

{
    "Status": "SUCCESS",
    "Reason": "OK",
    "PhysicalResourceId": "[HIDDEN]-ats.iot.eu-central-1.amazonaws.com",
    "StackId": "arn:aws:cloudformation:eu-central-1:[HIDDEN]:stack/iot/462a6010-82e4-11ed-ad4c-06513e2530cc",
    "RequestId": "f99dfa9a-e771-49fd-aa7f-faefc122e921",
    "LogicalResourceId": "IoTEndpoint9F0B923E",
    "NoEcho": false,
    "Data": {
        "apiVersion": null,
        "region": "eu-central-1",
        "endpointAddress": "[HIDDEN]-ats.iot.eu-central-1.amazonaws.com"
    }
}
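
The two responses differ only in Data, and the attribute lookup depends on exactly that field. As a rough illustration (this is not CloudFormation's or the CDK's actual code; `get_att` is a hypothetical helper and the hostname is a placeholder), a lookup against an empty Data map fails in the way the error message above describes:

```python
# Illustrative only: mimics how an attribute lookup on a custom resource
# response fails when "Data" lacks the requested key. get_att is a
# hypothetical helper, not part of CloudFormation or the CDK.

def get_att(response: dict, key: str) -> str:
    data = response.get("Data", {})
    if key not in data:
        raise KeyError(f"Vendor response doesn't contain {key} key in object")
    return data[key]


response_before = {  # shape of the response logged before 2.76
    "Status": "SUCCESS",
    "Data": {
        "region": "eu-central-1",
        "endpointAddress": "example-ats.iot.eu-central-1.amazonaws.com",
    },
}
response_after = {"Status": "SUCCESS", "Data": {}}  # shape logged on 2.76

# Succeeds: the key is present in Data.
endpoint = get_att(response_before, "endpointAddress")

# Fails: Data is empty, so the key cannot be resolved.
try:
    get_att(response_after, "endpointAddress")
    lookup_failed = False
except KeyError:
    lookup_failed = True
```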

Reproduction Steps

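from aws_cdk import (
    Stack,
    custom_resources as cr,
)
from constructs import Construct


# The stack class name and constructor are illustrative; the original
# snippet was an excerpt from inside a stack's __init__.
class IotStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Only on_create is defined; there is no on_update.
        iot_endpoint = cr.AwsCustomResource(
            self,
            'IoTEndpoint',
            policy=cr.AwsCustomResourcePolicy.from_sdk_calls(
                resources=cr.AwsCustomResourcePolicy.ANY_RESOURCE
            ),
            on_create=cr.AwsSdkCall(
                service='Iot',
                action='describeEndpoint',
                physical_resource_id=cr.PhysicalResourceId.from_response(
                    'endpointAddress'),
                parameters={
                    'endpointType': 'iot:Data-ATS'
                }
            )
        )

        # This lookup is what fails when the response Data is empty.
        endpoint = iot_endpoint.get_response_field('endpointAddress')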

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.76.0 (build 78c411b)

Framework Version

No response

Node.js Version

v16.14.2

OS

Windows

Language

Python

Language Version

Python 3.9.6

Other information

No response

bobveringa commented 1 year ago

After several hours of debugging, it turns out that on_update is now required for these custom resource calls. In version 2.69 a change was introduced in https://github.com/aws/aws-cdk/pull/24194 which modified the way Physical IDs are handled. It includes the line:

AwsCustomResource.getResponseField() and .getResponseFieldReference() will not work if the Create and Update APIs don't consistently return the same fields.

However, this caused a breaking change which seems to have gone unnoticed.
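
One way to picture why a missing on_update could lead to an empty Data object: the sketch below assumes the Lambda behind AwsCustomResource picks the SDK call from the resource properties by request type and fetches nothing when no call is configured for that type. That behaviour is an assumption, not something confirmed in this thread, and `handle` and `fake_sdk_call` are hypothetical names rather than CDK code.

```python
from typing import Callable


def handle(event: dict, sdk_call: Callable[[dict], dict]) -> dict:
    """Simplified stand-in for a custom resource handler (assumed behaviour)."""
    # Look up the call configured for this request type:
    # "Create", "Update" or "Delete".
    call = event["ResourceProperties"].get(event["RequestType"])
    # With no call configured for this request type, nothing is fetched.
    data = sdk_call(call) if call else {}
    return {"Status": "SUCCESS", "Data": data}


def fake_sdk_call(call: dict) -> dict:
    # Stub standing in for the real AWS SDK request; the hostname is a placeholder.
    return {"endpointAddress": "example-ats.iot.eu-central-1.amazonaws.com"}


describe = {"service": "Iot", "action": "describeEndpoint"}

only_create = {"Create": describe}
with_update = {"Create": describe, "Update": describe}

create_event = {"RequestType": "Create", "ResourceProperties": only_create}
update_event = {"RequestType": "Update", "ResourceProperties": only_create}
fixed_update_event = {"RequestType": "Update", "ResourceProperties": with_update}

create_data = handle(create_event, fake_sdk_call)["Data"]
update_data = handle(update_event, fake_sdk_call)["Data"]
fixed_update_data = handle(fixed_update_event, fake_sdk_call)["Data"]
```

Under that assumption, an Update event on a resource that only defines on_create yields an empty Data object, which would match the 2.76 response shown earlier, while defining on_update with the same call returns the same fields as Create.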

pahud commented 1 year ago

Hi @bobveringa

Do you mean that prior to 2.69 you didn't need to define on_update and could still get the response data on resource update, while from 2.69 onward you need to explicitly define on_update?

bobveringa commented 1 year ago

In version 2.63 only defining the on_create was sufficient. After updating to 2.76 the on_update also needed to be defined in order for data to be returned. I assume that this change was caused by another change to custom-resources. However, I have not had the time to investigate if this change is caused by 2.69.

These are the two implementations I pulled out of our git history. The initial version has been in place since at least CDK v2.20.

iot_endpoint = cr.AwsCustomResource(
    self,
    'IoTEndpoint',
    policy=cr.AwsCustomResourcePolicy.from_sdk_calls(
        resources=cr.AwsCustomResourcePolicy.ANY_RESOURCE
    ),
    on_create=cr.AwsSdkCall(
        service='Iot',
        action='describeEndpoint',
        physical_resource_id=cr.PhysicalResourceId.from_response(
            'endpointAddress'),
        parameters={
            'endpointType': 'iot:Data-ATS'
        }
    )
)

endpoint = iot_endpoint.get_response_field('endpointAddress')

This broke when updating to 2.76. Adding on_update as follows resolved the issue.

iot_endpoint = cr.AwsCustomResource(
    self,
    'IoTEndpoint',
    policy=cr.AwsCustomResourcePolicy.from_sdk_calls(
        resources=cr.AwsCustomResourcePolicy.ANY_RESOURCE
    ),
    on_create=cr.AwsSdkCall(
        service='Iot',
        action='describeEndpoint',
        physical_resource_id=cr.PhysicalResourceId.from_response(
            'endpointAddress'),
        parameters={
            'endpointType': 'iot:Data-ATS'
        }
    ),
    on_update=cr.AwsSdkCall(
        service='Iot',
        action='describeEndpoint',
        parameters={
            'endpointType': 'iot:Data-ATS'
        }
    )
)

endpoint = iot_endpoint.get_response_field('endpointAddress')

0xdevalias commented 1 year ago

I ran into what sounds like the same problem when bumping from a VERY old 1.x version (1.32.2) to 1.201.0. The upgrade similarly seemed to break my AwsCustomResource, which appears to no longer return response data:

    /**
     * Use the AWS SDK to get the CloudFrontDistribution via CognitoIdentityServiceProvider::describeUserPoolDomain
     *
     * @see https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_custom-resources.AwsCustomResource.html
     * @see https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/CognitoIdentityServiceProvider.html#describeUserPoolDomain-property
     */
    const describeCognitoUserPoolDomain = new AwsCustomResource(
      this,
      'DescribeCognitoUserPoolDomain',
      {
        resourceType: 'Custom::DescribeCognitoUserPoolDomain',
        onCreate: {
          region: 'us-east-1', // TODO: is this required?
          service: 'CognitoIdentityServiceProvider',
          action: 'describeUserPoolDomain',
          parameters: {
            Domain: userPoolDomain.domain,
          },
          physicalResourceId: PhysicalResourceId.of(userPoolDomain.domain),
        },
        policy: AwsCustomResourcePolicy.fromSdkCalls({
          resources: AwsCustomResourcePolicy.ANY_RESOURCE,
        }),
      }
    )
    describeCognitoUserPoolDomain.node.addDependency(userPoolDomain)

    const userPoolDomainDistribution = describeCognitoUserPoolDomain.getResponseField(
      'DomainDescription.CloudFrontDistribution'
    )
    new CfnOutput(this, 'UserPoolDomainDistribution', {
      value: userPoolDomainDistribution,
    })

    new ARecord(this, 'UserPoolDomainAliasRecord', {
      recordName: userPoolDomain.domain,
      target: RecordTarget.fromAlias({
        bind: () => ({
          hostedZoneId: 'Z2FDTNDATAQYW2', // CloudFront Zone ID
          dnsName: userPoolDomainDistribution,
        }),
      }),
      zone,
    })

I found the logs for this and can confirm the response data is empty. I did not try to find the exact version where the change that broke this was made.

2023-05-22T07:06:01.631Z    b3960a5e-646b-495e-a430-e6690e18bce9    INFO    AWS SDK VERSION: 2.1381.0

2023-05-22T07:06:01.632Z    b3960a5e-646b-495e-a430-e6690e18bce9    INFO    Responding {
    "Status": "SUCCESS",
    "Reason": "OK",
    "PhysicalResourceId": "auth.dev.REDACTED",
    "StackId": "arn:aws:cloudformation:us-east-1:REDACTED:stack/REDACTED-Auth/0798a8c0-9f1b-11ea-8c14-0e09773e6f3f",
    "RequestId": "cd1e5598-8ee7-4d78-9d5a-f966d20796ad",
    "LogicalResourceId": "DescribeCognitoUserPoolDomain9D8EB6B4",
    "NoEcho": false,
    "Data": {}
}

In my use case I just ended up working around it by switching to the modern built-in construct equivalent:

    /**
     * Route53 alias record for the UserPoolDomain CloudFront distribution
     *
     * @see https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-route53.ARecord.html
     * @see https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-route53.RecordTarget.html
     * @see https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-route53-targets.UserPoolDomainTarget.html
     */
    new ARecord(this, 'UserPoolDomainAliasRecord', {
      zone,
      target: RecordTarget.fromAlias(new UserPoolDomainTarget(userPoolDomain)),
      recordName: userPoolDomain.domainName,
    })

Which generates this CloudFormation output:

  // ..snip..
  "UserPoolDomainCloudFrontDomainName0B254952": {
   "Type": "Custom::UserPoolCloudFrontDomainName",
   "Properties": {
    "ServiceToken": {
     "Fn::GetAtt": [
      "AWS679f53fac002430cb0da5b7982bd22872D164C4C",
      "Arn"
     ]
    },
    "Create": {
     "Fn::Join": [
      "",
      [
       "{\"service\":\"CognitoIdentityServiceProvider\",\"action\":\"describeUserPoolDomain\",\"parameters\":{\"Domain\":\"",
       {
        "Ref": "UserPoolDomain5479B217"
       },
       "\"},\"physicalResourceId\":{\"id\":\"",
       {
        "Ref": "UserPoolDomain5479B217"
       },
       "\"}}"
      ]
     ]
    },
    "Update": {
     "Fn::Join": [
      "",
      [
       "{\"service\":\"CognitoIdentityServiceProvider\",\"action\":\"describeUserPoolDomain\",\"parameters\":{\"Domain\":\"",
       {
        "Ref": "UserPoolDomain5479B217"
       },
       "\"},\"physicalResourceId\":{\"id\":\"",
       {
        "Ref": "UserPoolDomain5479B217"
       },
       "\"}}"
      ]
     ]
    },
    "InstallLatestAwsSdk": true
   },
   "DependsOn": [
    "UserPoolDomainCloudFrontDomainNameCustomResourcePolicyF374B62C"
   ],
   "UpdateReplacePolicy": "Delete",
   "DeletionPolicy": "Delete",
   "Metadata": {
    "aws:cdk:path": "REDACTED-Auth/UserPoolDomain/CloudFrontDomainName/Resource/Default"
   }
  },
  // ..snip..

Which, when run, successfully returns the expected data:

2023-05-22T08:54:51.114Z    6a6776e2-c370-4ea6-b7cf-c43042545143    INFO    AWS SDK VERSION: 2.1381.0

2023-05-22T08:54:51.794Z    6a6776e2-c370-4ea6-b7cf-c43042545143    INFO    Responding {
    "Status": "SUCCESS",
    "Reason": "OK",
    "PhysicalResourceId": "auth.dev.REDACTED",
    "StackId": "arn:aws:cloudformation:us-east-1:REDACTED:stack/REDACTED-Auth/0798a8c0-9f1b-11ea-8c14-0e09773e6f3f",
    "RequestId": "c9c51457-cbdd-4b42-b1c8-a84ef952d794",
    "LogicalResourceId": "UserPoolDomainCloudFrontDomainName0B254952",
    "NoEcho": false,
    "Data": {
        "apiVersion": null,
        "region": "us-east-1",
        "DomainDescription.UserPoolId": "us-east-1_REDACTED",
        "DomainDescription.AWSAccountId": "REDACTED",
        "DomainDescription.Domain": "auth.dev.REDACTED",
        "DomainDescription.S3Bucket": "aws-cognito-prod-iad-assets",
        "DomainDescription.CloudFrontDistribution": "REDACTED.cloudfront.net",
        "DomainDescription.Version": "REDACTED",
        "DomainDescription.Status": "ACTIVE",
        "DomainDescription.CustomDomainConfig.CertificateArn": "arn:aws:acm:us-east-1:REDACTED:certificate/43628eb7-9b8d-4ab1-85aa-57bcc651a167"
    }
}
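
The keys in Data above are dotted paths such as DomainDescription.CloudFrontDistribution, which is the same form passed to getResponseField. A minimal sketch of that kind of flattening, assuming nested objects are joined with dots (the `flatten` helper is hypothetical and not taken from the CDK source, and all values are placeholders):

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Placeholder API response shaped like a describeUserPoolDomain result.
api_response = {
    "DomainDescription": {
        "Domain": "auth.dev.example.com",
        "CloudFrontDistribution": "example.cloudfront.net",
        "CustomDomainConfig": {
            "CertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/example",
        },
    }
}

data = flatten(api_response)
distribution = data["DomainDescription.CloudFrontDistribution"]
```

With an empty Data object, as in the failing response earlier in this comment, there is no such key to resolve.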

davidjmemmett commented 2 months ago

I'm also experiencing the same issue. I'm issuing a KMS replicateKey call using on_create, and it's failing to get the replica ARN:

Error:

Vendor response doesn't contain ReplicaKeyMetadata.Arn attribute in object...

Redacted code:

replica_resource = AwsCustomResource(
    scope=self,
    id=f'kms_replica_{region}',
    on_create=AwsSdkCall(
        service='KMS',
        action='replicateKey',
        parameters={
            'KeyId': self.kms_key.attr_key_id,
            'ReplicaRegion': region,
            'Policy': dumps(self.key_policy),
        },
        physical_resource_id=PhysicalResourceId.of(f'kms-replica-{region}-{self.stack_unique_name_suffix}'),
    ),
    policy=AwsCustomResourcePolicy.from_statements(
        statements=[
            PolicyStatement(
                actions=['kms:*'],
                resources=['*'],
            )
        ],
    )
)
key_replica_arn = replica_resource.get_response_field('ReplicaKeyMetadata.Arn')

AwsCustomResource(
    scope=self,
    id=f'kms_replica_alias_{region}',
    on_create=AwsSdkCall(
        service='KMS',
        action='createAlias',
        parameters={
            'AliasName': 'alias/KMSAlias',
            'TargetKeyId': key_replica_arn,
        },
        physical_resource_id=PhysicalResourceId.of(f'kms-replica-alias-{region}-{self.stack_unique_name_suffix}'),
        region=region,
    ),
    policy=AwsCustomResourcePolicy.from_statements(
        statements=[
            PolicyStatement(
                actions=['kms:CreateAlias'],
                resources=['*'],
            )
        ],
    )
)

pahud commented 2 months ago

For those who are seeing the Vendor response doesn't contain ReplicaKeyMetadata.Arn attribute in object... error: can you deploy with cdk deploy -R, which won't roll back on deploy failure, then check the CloudWatch logs of the Lambda function and share them?

For example, you should be able to see logs like this:

[
    "INIT_START Runtime Version: nodejs:18.v28\tRuntime Version ARN: arn:aws:lambda:us-east-1::runtime:b475b23763329123d9e6f79f51886d0e1054f727f5b90ec945fcb2a3ec09afdd\n",
    "START RequestId: 5ab66fdd-e0d0-4024-aa7d-b2b3456df051 Version: $LATEST\n",
    "2024-04-30T15:43:21.937Z\t5ab66fdd-e0d0-4024-aa7d-b2b3456df051\tINFO\t{\"RequestType\":\"Create\",\"ServiceToken\":\"arn:aws:lambda:us-east-1:deducted:function:dummy-stack2-AWS679f53fac002430cb0da5b7982bd22872D-I2hig52uIdkt\",\"ResponseURL\":\"...\",\"StackId\":\"arn:aws:cloudformation:us-east-1:deducted:stack/dummy-stack2/435fcf60-0708-11ef-85cf-0ee587ffad51\",\"RequestId\":\"36187d1d-4add-4601-9764-b850ee234636\",\"LogicalResourceId\":\"IoTEndpoint9F0B923E\",\"ResourceType\":\"Custom::AWS\",\"ResourceProperties\":{\"ServiceToken\":\"arn:aws:lambda:us-east-1:deducted:function:dummy-stack2-AWS679f53fac002430cb0da5b7982bd22872D-I2hig52uIdkt\",\"InstallLatestAwsSdk\":\"false\",\"Create\":{\"service\":\"Iot\",\"action\":\"describeEndpoint\",\"physicalResourceId\":{\"responsePath\":\"endpointAddress\"},\"parameters\":{\"endpointType\":\"iot:Data-ATS\"},\"logApiResponseData\":true}}}\n",
    "2024-04-30T15:43:23.658Z\t5ab66fdd-e0d0-4024-aa7d-b2b3456df051\tINFO\tAPI response { endpointAddress: 'a2we3h0d2g8ljn-ats.iot.us-east-1.amazonaws.com' }\n",
    "2024-04-30T15:43:23.696Z\t5ab66fdd-e0d0-4024-aa7d-b2b3456df051\tINFO\tResponding {\"Status\":\"SUCCESS\",\"Reason\":\"OK\",\"PhysicalResourceId\":\"a2we3h0d2g8ljn-ats.iot.us-east-1.amazonaws.com\",\"StackId\":\"arn:aws:cloudformation:us-east-1:deducted:stack/dummy-stack2/435fcf60-0708-11ef-85cf-0ee587ffad51\",\"RequestId\":\"36187d1d-4add-4601-9764-b850ee234636\",\"LogicalResourceId\":\"IoTEndpoint9F0B923E\",\"NoEcho\":false,\"Data\":{\"region\":\"us-east-1\",\"endpointAddress\":\"a2we3h0d2g8ljn-ats.iot.us-east-1.amazonaws.com\"}}\n",
    "END RequestId: 5ab66fdd-e0d0-4024-aa7d-b2b3456df051\n",
    "REPORT RequestId: 5ab66fdd-e0d0-4024-aa7d-b2b3456df051\tDuration: 5154.74 ms\tBilled Duration: 5155 ms\tMemory Size: 128 MB\tMax Memory Used: 97 MB\tInit Duration: 181.34 ms\t\n"
]
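
When going through those logs, the line to look for is the one containing Responding, since it carries the Data object sent back to CloudFormation. A small sketch for pulling it out of exported log messages, assuming the layout shown above (the `find_responses` helper and the sample messages are illustrative, with shortened placeholder values):

```python
import json


def find_responses(messages: list[str]) -> list[dict]:
    """Return the JSON payload of every 'Responding {...}' log message."""
    marker = "Responding "
    responses = []
    for message in messages:
        # partition() returns an empty separator when the marker is absent.
        _, found, payload = message.partition(marker)
        if found:
            responses.append(json.loads(payload))
    return responses


sample_messages = [
    "START RequestId: 5ab66fdd Version: $LATEST\n",
    '2024-04-30T15:43:23.696Z\t5ab66fdd\tINFO\tResponding '
    '{"Status":"SUCCESS","Reason":"OK","NoEcho":false,'
    '"Data":{"region":"us-east-1",'
    '"endpointAddress":"example-ats.iot.us-east-1.amazonaws.com"}}\n',
    "END RequestId: 5ab66fdd\n",
]

responses = find_responses(sample_messages)
# True for each response whose Data object is non-empty.
has_data = [bool(response["Data"]) for response in responses]
```

An empty Data object in that line means the problem is on the handler side, before CloudFormation tries to resolve the attribute.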

stijnbrouwers commented 1 month ago

@bobveringa thanks for the info! Adding on_update also solved the issue for me. I was using AwsCustomResource with only an on_create. It used to work, but it broke after an update. (I was calling describeUserPoolClient, but I don't think it really matters which exact endpoint you are calling.)

@davidjmemmett, can you try adding an on_update to your code?

davidjmemmett commented 1 month ago

> @davidjmemmett, can you try adding an on_update to your code?

There isn't a KMS API call that returns the same response for updates, so only on_create works; any further updates fail.