Open Gtofig opened 2 years ago
We received a notification that NodeJS 12 runtime is no longer being supported. This custom Lambda resource uses this runtime so could become an issue in coming months.
"AWS Lambda end of support for Node.js 12"
Is this relevant in prioritising this issue?
Cheers.
So I came across this issue after learning that the AWS-CDK's OpenIdConnectProvider
construct is actually a custom construct and not leveraging the CfnOIDCProvider
resource. I was surprised to find that this construct had a pretty old bug that we ran into years ago on our own custom code to do the same thing.
I have a solution here that has worked for us for a while now, and I've just ported it into a CDK-based deploy. I'd be happy to see this code baked into the CDK:
assets/thumbprint-layer/Dockerfile
FROM public.ecr.aws/sam/build-python3.7
COPY requirements.txt ./
RUN python -m pip install -r requirements.txt -t /opt/python
WORKDIR /
ENTRYPOINT [ "/bin/bash" ]
assets/thumbprint-layer/requirements.txt
boto3
pyOpenSSL
awsretry
certifi
cfn_resource_provider
assets/thumbprint-layer/thumprint.py
import certifi, sys, json, logging, os, urllib, ssl, socket, requests, traceback
from OpenSSL import crypto, SSL
from uuid import uuid4
from cfn_resource_provider import ResourceProvider
logging.basicConfig()
logger = logging.getLogger()
logger.setLevel(os.getenv('LOG_LEVEL', 'DEBUG'))
request_schema = {
'$schema': 'http://json-schema.org/draft-04/schema#',
'type': 'object',
'required': ['URL'],
'properties': {
'URL': {'type': 'string', 'Description': 'The URL to hit'},
}
}
class Thumbprint(ResourceProvider):
def __init__(self):
super(Thumbprint, self).__init__()
self.request_schema = request_schema
@property
def url(self):
return self.get('URL')
@property
def openid_config(self):
return requests.get(self.url + '/.well-known/openid-configuration').json()
@property
def jwks_tuple(self):
url = urllib.parse.urlparse(self.openid_config['jwks_uri'])
return (url.hostname, 443)
@property
def root_cert(self):
host, port = self.jwks_tuple
context = SSL.Context(method=SSL.TLSv1_2_METHOD)
context.load_verify_locations(cafile=certifi.where())
conn = SSL.Connection(context, socket=socket.socket(socket.AF_INET, socket.SOCK_STREAM))
conn.settimeout(5)
conn.connect((host, port))
conn.setblocking(1)
conn.do_handshake()
conn.set_tlsext_host_name(host.encode())
return conn.get_peer_cert_chain()[-1]
@property
def thumbprint(self):
return self.root_cert.digest('sha1').decode('utf-8').replace(':','')
def create(self):
try:
self.physical_resource_id = self.thumbprint
except Exception as e:
traceback.print_exc()
self.physical_resource_id = 'could-not-execute'
self.fail('Failed to execute: %s' % e)
def update(self):
# Updates are not possible. Must delete and re-create. :(
self.delete()
self.create()
def delete(self):
self.success('no-op')
# This is the lambda handler call... but when we operate locally, see the
# __main__ below.
def handler(request, context):
return Thumbprint().handle(request, context)
# When testing locally we just gather some data, but make no changes. The point
# of this test (right now) is to verify that we can hit the SSL endpoint, get
# the root certificate information and return it.
if __name__ == '__main__':
print('Beginning test... validating command-line usage first...')
url = sys.argv[1]
if not url:
print(f'Usage: {sys.argv[0]} <EKS Cluster OIDC URL>')
sys.exit(1)
print(f'Will run test against {url} cluster...')
request = {
"RequestType": 'CREATE',
"ResponseURL": "https://httpbin.org/put",
"StackId": "arn:aws:cloudformation:us-west-2:EXAMPLE/stack-name/guid",
"RequestId": "request-%s" % uuid4(),
"ResourceType": "Custom::Resource",
"LogicalResourceId": "MyCustomResource",
"ResourceProperties": {
"URL": url
}
}
print('Creating Thumbprint() object...')
thumbprint = Thumbprint()
thumbprint.set_request(request, {})
print(f'TEST: root_cert thumbprint = {thumbprint.thumbprint}')
const codeDirectory = Code.fromAsset('assets/thumbprint-layer');
const thumbprintLayer: LayerVersion = new LayerVersion(this, 'ThumbprintLayer', {
removalPolicy: RemovalPolicy.DESTROY,
layerVersionName: scope.stackName,
code: Code.fromDockerBuild(codeDirectory.path, {
file: 'Dockerfile',
platform: Architecture.X86_64.dockerPlatform,
imagePath: '/opt'
}),
compatibleArchitectures: [Architecture.X86_64]
});
const handler = new Function(this, 'ThumbprintHandler', {
runtime: Runtime.PYTHON_3_7,
code: codeDirectory,
layers: [thumbprintLayer],
handler: 'thumbprint.handler'
});
handler.node.addDependency(thumbprintLayer);
const thumbprint = new CustomResource(this, 'Thumbprint', {
removalPolicy: RemovalPolicy.RETAIN,
resourceType: 'Custom::Thumbprint',
serviceToken: handler.functionArn,
properties: {
URL: props.cluster.clusterOpenIdConnectIssuerUrl
}
});
thumbprint.node.addDependency(handler);
new CfnOIDCProvider(this, 'OidcProvider', {
thumbprintList: [thumbprint.ref],
clientIdList: ['sts.amazonaws.com'],
url: props.cluster.clusterOpenIdConnectIssuerUrl
});
So I came across this issue after learning that the AWS-CDK's
OpenIdConnectProvider
construct is actually a custom construct and not leveraging theCfnOIDCProvider
resource. I was surprised to find that this construct had a pretty old bug that we ran into years ago on our own custom code to do the same thing.I have a solution here that has worked for us for a while now, and I've just ported it into a CDK-based deploy. I'd be happy to see this code baked into the CDK:
assets/thumbprint-layer/Dockerfile
FROM public.ecr.aws/sam/build-python3.7 COPY requirements.txt ./ RUN python -m pip install -r requirements.txt -t /opt/python WORKDIR / ENTRYPOINT [ "/bin/bash" ]
assets/thumbprint-layer/requirements.txt
boto3 pyOpenSSL awsretry certifi cfn_resource_provider
assets/thumbprint-layer/thumprint.py
import certifi, sys, json, logging, os, urllib, ssl, socket, requests, traceback from OpenSSL import crypto, SSL from uuid import uuid4 from cfn_resource_provider import ResourceProvider logging.basicConfig() logger = logging.getLogger() logger.setLevel(os.getenv('LOG_LEVEL', 'DEBUG')) request_schema = { '$schema': 'http://json-schema.org/draft-04/schema#', 'type': 'object', 'required': ['URL'], 'properties': { 'URL': {'type': 'string', 'Description': 'The URL to hit'}, } } class Thumbprint(ResourceProvider): def __init__(self): super(Thumbprint, self).__init__() self.request_schema = request_schema @property def url(self): return self.get('URL') @property def openid_config(self): return requests.get(self.url + '/.well-known/openid-configuration').json() @property def jwks_tuple(self): url = urllib.parse.urlparse(self.openid_config['jwks_uri']) return (url.hostname, 443) @property def root_cert(self): host, port = self.jwks_tuple context = SSL.Context(method=SSL.TLSv1_2_METHOD) context.load_verify_locations(cafile=certifi.where()) conn = SSL.Connection(context, socket=socket.socket(socket.AF_INET, socket.SOCK_STREAM)) conn.settimeout(5) conn.connect((host, port)) conn.setblocking(1) conn.do_handshake() conn.set_tlsext_host_name(host.encode()) return conn.get_peer_cert_chain()[-1] @property def thumbprint(self): return self.root_cert.digest('sha1').decode('utf-8').replace(':','') def create(self): try: self.physical_resource_id = self.thumbprint except Exception as e: traceback.print_exc() self.physical_resource_id = 'could-not-execute' self.fail('Failed to execute: %s' % e) def update(self): # Updates are not possible. Must delete and re-create. :( self.delete() self.create() def delete(self): self.success('no-op') # This is the lambda handler call... but when we operate locally, see the # __main__ below. def handler(request, context): return Thumbprint().handle(request, context) # When testing locally we just gather some data, but make no changes. The point # of this test (right now) is to verify that we can hit the SSL endpoint, get # the root certificate information and return it. if __name__ == '__main__': print('Beginning test... validating command-line usage first...') url = sys.argv[1] if not url: print(f'Usage: {sys.argv[0]} <EKS Cluster OIDC URL>') sys.exit(1) print(f'Will run test against {url} cluster...') request = { "RequestType": 'CREATE', "ResponseURL": "https://httpbin.org/put", "StackId": "arn:aws:cloudformation:us-west-2:EXAMPLE/stack-name/guid", "RequestId": "request-%s" % uuid4(), "ResourceType": "Custom::Resource", "LogicalResourceId": "MyCustomResource", "ResourceProperties": { "URL": url } } print('Creating Thumbprint() object...') thumbprint = Thumbprint() thumbprint.set_request(request, {}) print(f'TEST: root_cert thumbprint = {thumbprint.thumbprint}')
AWS-CDK Resources
const codeDirectory = Code.fromAsset('assets/thumbprint-layer'); const thumbprintLayer: LayerVersion = new LayerVersion(this, 'ThumbprintLayer', { removalPolicy: RemovalPolicy.DESTROY, layerVersionName: scope.stackName, code: Code.fromDockerBuild(codeDirectory.path, { file: 'Dockerfile', platform: Architecture.X86_64.dockerPlatform, imagePath: '/opt' }), compatibleArchitectures: [Architecture.X86_64] }); const handler = new Function(this, 'ThumbprintHandler', { runtime: Runtime.PYTHON_3_7, code: codeDirectory, layers: [thumbprintLayer], handler: 'thumbprint.handler' }); handler.node.addDependency(thumbprintLayer); const thumbprint = new CustomResource(this, 'Thumbprint', { removalPolicy: RemovalPolicy.RETAIN, resourceType: 'Custom::Thumbprint', serviceToken: handler.functionArn, properties: { URL: props.cluster.clusterOpenIdConnectIssuerUrl } }); thumbprint.node.addDependency(handler); new CfnOIDCProvider(this, 'OidcProvider', { thumbprintList: [thumbprint.ref], clientIdList: ['sts.amazonaws.com'], url: props.cluster.clusterOpenIdConnectIssuerUrl });
How do you import an existing provider arn using this approach?
If anyone is aware of past situations where a construct moved from being supported via a custom resource, to being supported via a CloudFormation resource, these would be helpful in working on this change.
I am wondering, could the change be made so that apps already using this construct could upgrade without getting their provider resource deleted? Is it a requirement that existing providers not be deleted and re-created? If they are going to be deleted, does this change need to go in an alpha package?
Update: It seems from https://github.com/aws/aws-cdk/pull/22573 and https://github.com/aws/aws-cdk/pull/16014 that the upgrade could be handled with a feature flag.
@diranged You said the custom resource contains a pretty old bug that you encountered previously. I see that you have posted code that contains a solution, but what was the bug?
The Custom Resource handler code implements the logic that lets you omit the thumbprints when instantiating your Provider.
The handler code downloads the thumbprint and does some validation on it when a thumbprint is not provided.
I'm not sure how to retain this feature, meaning the ability to omit the thumbprints and have the construct get a thumbprint at deploy time, without leaving the Provider in a Custom Resource.
If we add a feature flag that causes the OIDC Provider to be created via the Cfn resource, for example, IAM_OIDC_PROVIDER_CFN
, and someone enables that feature flag, then I think the thumbprint
prop would no longer be optional, because we would have no way to fetch the thumbprint. The thumbprintList is a required prop for the Cfn resource.
@douglasnaphas I believe the bug we ran into was that the custom CFN provider was picking the wrong CA in the CA bundle to generate the thumbprint from. Instead of picking the root, it picks a SubCA. AWS then changed the SubCA cert on us a year or two ago, and our authentication broke at that point.
The code I have here I think is “better” because we’re only using a custom provider to get the thumbprint ID - and nothing else. The rest is native CFN, which means that as the CfnOIDCProvider
adds more features (and maybe some day adds the thumbprint picking), we can leverage it and eventually retire the custom CFN code.
@douglasnaphas I believe the bug we ran into was that the custom CFN provider was picking the wrong CA in the CA bundle to generate the thumbprint from. Instead of picking the root, it picks a SubCA. AWS then changed the SubCA cert on us a year or two ago, and our authentication broke at that point.
The code I have here I think is “better” because we’re only using a custom provider to get the thumbprint ID - and nothing else. The rest is native CFN, which means that as the
CfnOIDCProvider
adds more features (and maybe some day adds the thumbprint picking), we can leverage it and eventually retire the custom CFN code.
I think it would make sense to break the thumbprint logic out into its own Thumbprint Construct. I have proposed this in the CDK RFC repo.
I don't think the OIDC Provider L1 Cfn resource will ever support thumbprint picking, since the API for OIDC Providers requires the thumbprint to be specified, and L1 Constructs don't have any additional logic. I think it would be better for the thumbprint picking logic to live in the proposed Thumbprint Construct, and for the OIDC Provider L2 Construct to accept a Thumbprint Construct.
@diranged Did https://github.com/aws/aws-cdk/pull/22802 fix the wrong-certificate bug?
Since about a month ago pinning certificate thumbprints with GitHub as OIDC provider is no longer required.
See GitHub's announcement here.
This is what shows up on AWS's side:
Any updates on this ? We had to switch to regular cloudformation templates because of this issue
We received a notification that NodeJS 12 runtime is no longer being supported. This custom Lambda resource uses this runtime so could become an issue in coming months.
"AWS Lambda end of support for Node.js 12"
Is this relevant in prioritising this issue?
Cheers.
I'm facing a similar issue. The custom resource Lambda is on node 16 and I'd like to update to node 18 but I'm not sure how to do that since the Lambda is provisioned automatically. Any idea?
This issue has received a significant amount of attention so we are automatically upgrading its priority. A member of the community will see the re-prioritization and provide an update on the issue.
I have PR ready for review here https://github.com/aws/aws-cdk/pull/28634
It introduces a new OpenIdConnecProvider that uses the CloudFormation resource so that there is no breaking change for the existing construct that uses custom resources.
Looking for suggestions and comments on this. It is my first contribution to aws-cdk. Thank you 🙏
Describe the feature
OpenIdConnectProvider
construct currently creates custom resource lambda and associated resources to create OIDC provider.However, CloudFormation now supports it out of the box: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-iam-oidcprovider.html
The construct can be significantly simplified by moving to use direct CloudFormation resource.
Use Case
Custom resource lambdas are more complex, harder to understand, and reduce visibility into what's being created. Using CloudFormation resource directly would reduce complexity.
Proposed Solution
Switch to
AWS::IAM::OIDCProvider
CloudFormation resourceOther Information
No response
Acknowledgements
CDK version used
2.32.1
Environment details (OS name and version, etc.)
Ubuntu 18