Closed eladb closed 11 months ago
Design concept:
We add a method to Stack
with the following implementation (name is meh):
public getPropsForResource(className: string, props: any): any {
return props;
}
Then, we somehow enforce that all L2s will call this method when they are initialized and pass in the props
that were passed to them, along with the full name of the class (i.e. aws-s3.Bucket
).
This will allow users to override this method at the stack level (e.g. their base MyCompanyStack
class) and implement policy around default values. For example, they can enforce that all Buckets are created with encryption:
public getPropsForResource(className: string, props: any): any {
if (className === 'aws-s3.Bucket') {
if (props.encryption && props.encryption !== ENCRYPTED) { throw new Error('buckets must be encrypted'); }
props.encryption = ENCRYPTED;
}
return props; // passthrough
}
Something along those lines... Just an option.
I have tried to create one example of enforcing user to follow organization defaults and if you want you can create new resources or modify existing resource using Aspects. https://github.com/nirvana124/aws-cdk-library-example
@Dzhuneyt Aspects can (potentially) mutate constructs after they have been initialized but there are many initialization properties that cannot be modified after the instance is created.
One thing I've experimented with is in my attempt to create an "organization standard library" is trying to add my own logic to the core.Construct.onValidate()
method. I couldn't get it working in a way that was intuitive in the short amount of time I spent on it, but I feel like exposing some sort of API for adding validation functions to this method could be useful, or at the very least insightful.
One thing I've experimented with is in my attempt to create an "organization standard library" is trying to add my own logic to the
core.Construct.onValidate()
method. I couldn't get it working in a way that was intuitive in the short amount of time I spent on it, but I feel like exposing some sort of API for adding validation functions to this method could be useful, or at the very least insightful.
Ha! We recently changed the api for validations and you should be able to just do this:
this.node.addValidation(...);
See: https://docs.aws.amazon.com/cdk/api/latest/docs/constructs.Node.html#addwbrvalidationvalidation
would an upstream base validation aspect to support easily adding these validations to a given construct type be useful for this rfc?
I am thinking something like this (pseudo code-ish):
class ValidatorAspect implements core.IAspect {
type: constructs.IConstructs;
fn: constructs.IValidation
constructor(type: constructs.IConstruct, fn: constructs.IValidation) {
this.type = type
this.fn = fn
}
public visit(node: constructs.IConstruct): void {
const n = node as constructs.Node;
if (node instanceof this.type) {
n.addValidation({
validate: this.fn
});
}
}
}
it's more reactive compared to the getPropsForResource design, but i can see both being useful. I'd be happy to work on adding something like this if there's interest.
@ericzbeard gave an awesome talk to my company a few months ago. He mentioned that companies keep trying to make a CDK L2.5 library. (That's what I wanted to do first too!) He mentioned that this was not recommended. This concept really seems to fix that issue.
In his blog post, he has a section about "Don't use constructs for compliance"
Another common pattern we have seen, particularly among enterprise customers, is creating a collection of construct libraries based on the L2 constructs included with the AWS CDK, with a 1-1 mapping of subclasses. For example, you might create a class called
MyCompanyBucket
that extendss3.Bucket
andMyCompanyFunction
that extendslambda.Function
. Inside those subclasses, you implement your company’s security best practices like encrypting data at rest, or requiring the use of certain security policies. This pattern serves a need, but it comes with a significant drawback: as the ecosystem of AWS CDK constructs, such as AWS Solutions Constructs, grows and becomes more useful, your developer community will be effectively cut off from it if they can’t use code that instantiates the base L2s.Investigate the usage of service control polices and permissions boundaries at the organization level to enforce your security guardrails. Use aspects or tools like CFN Guard to make assertions about the properties of infrastructure elements before deployment.
While, I totally agree that we should use tools like CFN Guard, there is still a need and it sounds like Aspects is not the full answer.
Aspects can (potentially) mutate constructs after they have been initialized but there are many initialization properties that cannot be modified after the instance is created.
I want to sell the idea that the current level of upvotes (8) does not reflect the actual need for this concept. From the blog post, this sounds like this might "serve a need" to a "common pattern" but doesn't have the drawbacks and headaches of a company specific wrapper.
I can vouch for @maria-jackson's thoughts above. Currently in my organization, we are struggling to find a good balance. On the one hand, we want to provide the ability for engineers to create infrastructure which follows industry best practice. On the other hand, we want engineers to build infrastructure easily which is safe, secure, and adheres to our organization requirements. For example, one of our organization policies is that no S3 bucket should be public by default. Presently we really have 3 ways to accomplish this:
While cfn-guard would potentially solve this, the downside is that it would likely get you feedback in the pipeline (unless you run it locally). Instead, if we were able to bake these checks into CDK synthing (or just into the native constructs via a hook like onValidate
), we would likely get quicker feedback to the engineer writing the code.
None of these is integral to the solution, but they would be nice to have. Take the pseudo code as an illustration of an idea, not a preferred implementation. (I do not understand CDK or Typescript enough)
I'd really like the reusable library to be really easy to read and look a lot like how you pass CDK parameters. I want it to be very clear what people are getting and very clear to our Security folks what the defaults are.
aws-s3.Bucket.defaults({Encryption: KMS_MANAGED});
aws-ec2.Volume.defaults({encrypted: True});
my-company.AwesomeL3.defaults({Foo: BAR});
As the CDK community expands aws-solutions-constructs, cdk-patterns, and their own L3 libraries, I see a need to provide defaults to L3, not just L2s.
For example, I use aws-apigateway-dynamodb and pretend that it defaults to no encryption. While I could pass the value of the decryption every time to the L3 construct, I don't want to. If the solution just provides L2 defaults, those would be overwritten every time.
I'm also wondering how multiple defaults would work. If my company had C-L2, my BU had B-L2, and my team had T-L2. I'd have things in my BU that I'd want to come before my company, but I'd also like the latest updates from my company.
aws-s3.Bucket.defaults({
Encryption: KMS_MANAGED
}).applyAdditionalDefaults(C-L2);
It would be cool if your cfn-guard rules could generate your CDK defaults.
Like, this could translated to something very specific
AWS::EC2::Volume Encrypted == false
But this could not
AWS::EC2::Volume Size == 101 |OR| AWS::EC2::Volume Size == 99
Thanks for the input, @maria-jackson !
@maria-jackson How does the defaults approach help when using libraries like aws-solutions-constructs? You would have to create each resource manually and pass them into the L3 constructs which certainly loses some of the benefit of those libraries.
The solution below meets "Allow users to specify or install some code that controls the defaults of AWS resources," but not "or applies some policy or validation at the stack or app level." It more meets the Best practices for building company default constructs #3235, but as that is considered a duplicate of this issue I thought it was appropriate to add the discussion here.
Looking at https://github.com/awslabs/aws-solutions-constructs, here's the pattern I'd like the aws-cdk maintainers thoughts on as an alternative to the L2.5 library for default properties for "Defaults & configuration polic[ies]" or "compliant constructs."
_logGroupProps = overrideProps(DefaultLogGroupProps(), logGroupProps);
const logGroup = new logs.LogGroup(scope, _logGroupId, _logGroupProps);
I also like the AWS Solutions Constructs approach (thanks @maria-jackson for the great summary!).
The company could have its own libraries with the various defaults. The security team can own company_cdk_security
, operations team (if the company has operations as a centralized function) can own company_cdk_operations
, and so on. It would be great if CDK could generate these libraries from cfn-guard
rules. Alternatively, the central teams can create simple collections, such as the example below (in Python). Then,cfn-guard
rules will validate the eventual result as part of the deployment process.
I think that another important capability is to layer multiple defaults, and allow the application owners to reset some properties before creating the construct. For example, the defaults for public website might include enforce_ssl = True
. For a specific use case, the application owner can get an exception to turn this off, while keeping the rest of the defaults.
Note: I added non-existing .update(other_props: object)
and .from_props(props: object)
methods to illustrate how the above might look like without using the hacky implementation that I experimented with :). The .from_props(props: object)
method is only needed in Python, since the current implementation uses keyword arguments to pass properties.
Edit: If you want to experiment with that approach, I created a simple example: https://github.com/alexpulver/company-guardrails
company_cdk_security/aws_s3.py
from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk
class BucketPropsCollection:
public_access = s3.BucketProps(enforce_ssl=True, public_read_access=True)
__expiration_lifecycle_rule = s3.LifecycleRule(expiration=cdk.Duration.days(30))
retention_policy = s3.BucketProps(lifecycle_rules=[__expiration_lifecycle_rule])
app.py
from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk
# import company_cdk_solutions
from company_cdk_security import aws_s3 as security_s3
# from company_cdk_operations import aws_s3 as operations_s3
# from company_cdk8s_security import deployment as security_deployment
app = cdk.App()
stack = cdk.Stack(app, "Website")
# I am using Python's dict.update([other]) semantics:
# https://docs.python.org/3/library/stdtypes.html#dict.update
bucket_props = security_s3.BucketPropsCollection.public_access
bucket_props.update(security_s3.BucketPropsCollection.retention_policy)
bucket_props.update(s3.BucketProps(versioned=True, enforce_ssl=False))
bucket = s3.Bucket.from_props(stack, "Bucket", bucket_props)
app.synth()
I really like this discussion, and I agree with the general direction where this is going!
I have a few comments on the details of each of the proposed solutions.
@maria-jackson I think, instead of having an overrideProps()
function, you can take advantage of the fact that DefaultLogGroupProps()
is also a function (BTW, I think calling it defaultLogGroupProps()
is more idiomatic in TypeScript/JavaScript). So, you can have defaultLogGroupProps()
take in an optional LogGroupProps
object. In case LogGroupProps
has any required properties, you can use the Partial
TypeScript type, as in function defaultLogGroupProps(props: Partial<LogGroupProps> = {}) { ...
.
So, the code becomes:
const logGroup = new logs.LogGroup(this, 'LogGroup', defaultLogGroupProps({
retention: logs.RetentionDays.TWO_MONTHS, // we can override any property of defaultLogGroupProps() this way
});
Which I think looks great.
@alexpulver I think you want to make these functions as well - making them class fields can have weird side-effects, as they are mutable.
You don't have the nice Partial
feature from TypeScript, but I believe you can emulate it using keyword arguments:
class BucketPropsCollection:
def public_access_props(**kwargs):
return s3.BucketProps(enforce_ssl=True, public_read_access=True, **kwargs)
And then, in your CDK app:
from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk
# import company_cdk_solutions
from company_cdk_security import aws_s3 as security_s3
app = cdk.App()
stack = cdk.Stack(app, "Website")
bucket = s3.Bucket(stack, "Bucket", security_s3.BucketPropsCollection.public_access_props(
versioned=True,
enforce_ssl=False,
))
app.synth()
I'm sure we can get some nice types here too, but I'm not that familiar with Python to add them 🙂.
Thanks for the feedback! I changed to functions; it makes sense :).
In Python, when specifying the same parameter twice, interpreter bombs. For example, it can be enforce_ssl
as part of **kwargs
besides explicit use by BucketPropsCollection.public_access()
. I couldn't think of an easy way for policy library vendors to work around that. Open to suggestions :).
As I mentioned above, I think we should support layered/chained construction of properties. For example, there could be methods to support compliance with FedRAMP (see example at the end from AWS Config conformance pack sample template), SOC, and other standards. If I need to comply with both FedRAMP and SOC, as a policy library consumer, I want to apply both, and trust the AWS CDK API to do the right/defined thing.
@skinny85 if I iterate a bit on the experience you suggested, it would be something like below - what do you think?
cdk.Props.merge
is an imaginary new class and a static method :). It gets an arbitrary number of props objects. All objects should be of the same type, otherwise it is an error. The method reads props in the first and next instances, then creates a new instance with all values merged. If there is a same prop in both instances, the latter takes precedence. The same then happens for next instance and so on. I implemented an example of this approach in the cdk_props.py module. The reason I suggested cdk.Props
namespace, is because I think that logic can apply to all props classes. That will also allow all props classes to apply whatever logic they have on the props arguments.
Another reason for having the merge
method as part of AWS CDK API is that multiple vendors could produce policy libraries, and might not agree on the same API for merging prop values. I am thinking of something like Managed rules for AWS Web Application Firewall.
from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk
from company_cdk_security import aws_s3 as security_s3
app = cdk.App()
stack = cdk.Stack(app, "Website")
bucket_props = cdk.Props.merge(
security_s3.BucketPropsCollection.public_access(),
security_s3.BucketPropsCollection.retention_policy(),
s3.BucketProps(versioned=True, enforce_ssl=False),
)
bucket = s3.Bucket.from_props(stack, "Bucket", bucket_props)
app.synth()
This way, policy library providers would just define collections of policies, and let the clients to combine them in the desired order.
FedRAMP compliance controls example:
@alexpulver I think a policy library is a great idea. One thing to consider is what happens if two libraries have conflicting guidance. There needs to be some mechanism to set priority and alert users that there is a potential conflict.
As I mentioned above, I think we should support layered/chained construction of properties. For example, there could be methods to support compliance with FedRAMP (see example at the end from AWS Config conformance pack sample template), SOC, and other standards. If I need to comply with both FedRAMP and SOC, as a policy library consumer, I want to apply both, and trust the AWS CDK API to do the right/defined thing.
While not a 100% solution, that's where my team is trying to go with cdk-nag. By using the Rule Packs based on the Config Conformance Packs, users can be alerted when their applications aren't following a specific compliance or set of compliances. Aspects could be used for auto remediation, but it might be better to let users decide what exactly they want to follow or ignore
One thing to consider is what happens if two libraries have conflicting guidance. There needs to be some mechanism to set priority and alert users that there is a potential conflict.
Yes, in my opinion, merge()
should fail if there's a conflict between the provided props instances (meaning, two or more of them provide the same property with values that are not equal).
In the above example, imagine security_s3.BucketPropsCollection.public_access()
had Versioned=True
, but security_s3.BucketPropsCollection.retention_policy()
had Versioned=False
(yes, I understand it's unlikely in this specific case, but this is just an example 😜).
but it might be better to let users decide what exactly they want to follow or ignore
I agree with @dontirun on this one. Allowing the user to resolve conflicts instead of failing is important. There are always exceptions...
@skinny85 I see your point about "surprising" results if we just resolve conflicts by default. The default can be to fail unless the user explicitly asks for overrides. @skinny85 @dontirun what do you think about the following approach?
from company_cdk_security import aws_s3 as security_s3
bucket_props = cdk.Props.merge(
props = [
security_s3.BucketPropsCollection.public_access(),
security_s3.BucketPropsCollection.fedramp_moderate(),
],
overrides = s3.BucketProps(versioned=True, public_read_access=True),
)
Here I am asking to merge two policies - public access and FedRAMP Moderate. Let's assume that public access sets publicReadAccess=True
and FedRAMP Moderate sets it to False
(per the s3-bucket-public-read-prohibited AWS Config managed rule).
Let's say the website is a landing page that can have public access. Using the overrides
parameter, I can explicitly specify what conflicts I want to have resolved and how. I can also specify additional props that are relevant for the website (e.g. versioned
), that are not part of any policy. Any other conflict that doesn't have an override will cause merge()
to fail, stating the open conflicts.
@alexpulver I like it!
@alexpulver In addition to what you have, I would also suggest adding explicit overrides to the synthed CloudFormation metadata/CDK metadata warning. When my customers do security reviews, assessors find that information very valuable.
I experimented with combining cdk-nag and the above approach. Would be glad to have your opinion.
The full implementation of the snippets I use below is in the example app I created for this thread: https://github.com/alexpulver/company-guardrails
Merge function signature:
cdk.Props.merge(props: Sequence[Any], *, overrides: Optional[Any]) -> Any:
app.py
import os
from aws_cdk import core as cdk
from cdk_nag import NIST80053Checks
from deployment import LandingPageFrontend
app = cdk.App()
LandingPageFrontend(
app,
"LandingPageFrontend",
env=cdk.Environment(
account=os.environ["CDK_DEFAULT_ACCOUNT"],
region=os.environ["CDK_DEFAULT_REGION"],
),
)
cdk.Aspects.of(app).add(NIST80053Checks())
app.synth()
website/infrastructure.py
from company_cdk import appsec
from company_cdk import fedramp
bucket_props = cdk.Props.merge(
[
appsec.PropsCollection.aws_s3_bucket_public_access(),
fedramp.PropsCollection.aws_s3_bucket(),
],
overrides=s3.BucketProps(versioned=True, public_read_access=True),
)
The intent in this case is to have my app NIST 800-53 compliant, while still ultimately let users decide what rules they want to implement or suppress. I will use cdk-nag
to run checks on my application, and make sure synth fails if there are non-compliant rules that were not implemented or suppressed. Here, running cdk synth
will lead to the following error:
$ npx cdk synth
[Error at /LandingPageFrontend/Website/Bucket/Resource] NIST.800.53-S3BucketLoggingEnabled: The S3 Bucket does not have server access logs enabled - (Control IDs: AC-2(g), AU-2(a)(d), AU-3, AU-12(a)(c)).
Found errors
The error provides the exact construct path, so I know what resource to look at. I decide to implement the rule guidance. First, I add a logs bucket. Assuming I secured it properly, I can suppress the same check for the logs bucket, because it is the one that keeps the logs for the website bucket:
logs_bucket = s3.Bucket(self, "LogsBucket")
cfn_logs_bucket = cast(s3.CfnBucket, logs_bucket.node.default_child)
cfn_logs_bucket.add_metadata(
"cdk_nag",
{
"rules_to_suppress": [
{
"id": "NIST.800.53-S3BucketLoggingEnabled",
"reason": "This is the logging bucket itself",
},
],
},
)
Next, I decide to apply the company policy to implement NIST 800-53 compliance on the website bucket:
from company_cdk import appsec
from company_cdk import fedramp
from company_cdk import nist80053
bucket_props = cdk.Props.merge(
[
appsec.PropsCollection.aws_s3_bucket_public_access(),
fedramp.PropsCollection.aws_s3_bucket(),
nist80053.PropsCollection.aws_s3_bucket(
server_access_logs_bucket=logs_bucket
),
],
overrides=s3.BucketProps(versioned=True, public_read_access=True),
)
No errors this time! :)
$ npx cdk synth
I like that process! Provides two sets of tools for developers on projects. A way to notify them about properties that aren't set for a standard and an easy way to remediate it.
As a side note you will likely get a cdk-nag
error on that setup once the last pull-request for the NIST 800-53 Pack is finished 😊
I like cdk-nag approach, should be pretty easy to automate enforcement in a pipeline.
Another reason for having the merge method as part of AWS CDK API is that multiple vendors could produce policy libraries
Vendors like AWS?
from aws_cdk_policy import appsec
from aws_cdk_policy import fedramp
from aws_cdk_policy import nist80053
It could be AWS, but I also thought of vendors and solutions like Check Point CloudGuard (former Dome9 is part of it now)
So, this solution
meets "Allow users to specify or install some code that controls the defaults of AWS resources," but not "or applies some policy
or validationat the stack or app level."
I think there might be a few separate concepts to solve this issue:
In previous attempt (https://github.com/aws/aws-cdk-rfcs/pull/163) there was a concept of "Creation Hooks." I'd like to slightly adjust the meaning
This combines with the "PropsCollections" either passing them by the "creation hook" or when creating a new instance.
The concept of creation hooks sounds pretty interesting to me as a generic enforcement mechanism. I think my hope is that CDK application developers mostly don't even realize that they are building a compliant application. It just all happens as automatically as possible.
I think one thing that was attractive with tools like cfn-guard
as that a security team could work on the policies outside of any one project and then we can enforce those policies in our pipelines. This way, individual applications do not need to upgrade/update their CDK applications just to ensure they have the latest policies. Maybe this there is a smart way to ensure an app has the latest policies?
from aws_cdk_policy import appsec from aws_cdk_policy import fedramp from aws_cdk_policy import nist80053
If these policies could also produce AWS Config deployments so you can verify that you are compliant with the actual running AWS Resources in your account. This would be really helpful for audits and getting security certifications.
A Security professional could
They all do different things.
I can tell what developers want:
Allow users to specify or install some code that controls the defaults of AWS resources or applies some policy or validation at the stack or app level.
Maybe it can be part of unit tests? Rather than merge props and overrides Give developers fast feedback, clean templates to use, and also provides evidence in a pipeline.
@OperationalFallacy, I know I love fast feedback! and I'm not sure what you mean by "templated configs."
I think you can combine unit tests, merge props, and overrides to give that fast feedback, if a user knew if they were trying or just accidentally overwriting a ex. NIST control.
Off the cuff, you could combine Fine Grained Assertions with an imported Object Mother that stores the info for your base company standards. Then you could you could write your tests in such a way that use the object mother, but indicate if your new properties overwrite the object mother's or just add the property. That way no surprises.
cfn-guard definitely is better, but not quite as fast of feedback.
By 'templated configs' I mean using higher-level constructs. I might be wrong, but if an org implements a strict compliance inside a construct - they probably wouldn't care about merging properties. It is a black and white choice from the security and architecture point of view, no?
Off the cuff, you could combine Fine Grained Assertions with an imported Object Mother that stores the info for your base company standards
Sure, the question is where effort should be: developing something that "mutates" cdk output so it is compliant. Or developing test library that validates resources are compliant. Developers still have a choice write cdk apps from scratch or use open-source constructs, they will just need to make sure compliance tests pass.
Imho, the second approach would win in enterprise.
@OperationalFallacy, if a higher-level construct solves your use case, then yeah, it's way more straight forward. That concept is known as an L2+ library.
But there are situations where it can get frustrating. @benbridts had a good talk on CDK Day for this concept. And there are some listed in aws-cdk's best practices.
It is a black and white choice from the security and architecture point of view, no?
No, it's not black and white unfortunately. There are policies which should apply to 99% of apps. When you have 100+ apps, there's going to be an app where the risk of not complying to a particular control is worth the reward and an exception gets put in place.
@shawnsparks-work I agree, and probably more than 1%, those should be ok with "choose your own adventure", no? Maybe you can give an example or two?
Ultimately, I think there are a couple of different components to this discussion, which might benefit from splitting it out into multiple topics.
There's a library-level component, like an Aspect-style enforcement or L2.5 constructs. There's also an organization-level component, which would allow an Org to enforce best practices uniformly. I'm working on a formal proposal for the second piece, but to give a high level summary:
This proposal attempts to address the Org component with a new L3 Construct and CLI command. It enables static analysis locally and as part of a CI/CD pipeline, which allow developers and organizations to enforce best-practices for their cloud infrastructure.
The CLI command (cdk check
) synthesizes templates, and POSTs them to an API endpoint (configurable in cdk.json). The API service validates the template, and returns results. The API service is generic, and can be built with any tool (e.g. cfn-nag, cfn-guard, in-house scanner, etc) as long as the API conforms to the specification. Users can also configure CDK to run scans on synth via cdk.json.
The L3 construct is a conforming implementation of the API, and uses cloudformation-guard for analysis.
Thanks @GriffinMB ! How would you suggest to handle exceptions from the enforcement?
Additionally, how can checks/enforcements from multiple domains/vendors be combined? For example, there are operational and security best practices (e.g. Well-Architected pillars) that can be checked for. Additionally, different ISVs might want to provide rule packs, and a project can use several of them.
Exceptions can be handled in a lot of ways ha. We're working on a solution to that and will include it in the full proposal. There's a kind of natural way to do it by encoding exception handling in guard rules, but that's not the solution we're looking at right now since it isn't necessarily as flexible.
Different domains can be managed by rulesets. When the API gets a scan request, it will check the template against all rulesets the user has permissions for. Different Orgs, teams, etc would only have permissions to the rulesets applicable to them. This is something else that'll have more detail in the full proposal, but it would look something like:
import * as check from '@aws-cdk/check';
// Principal with permissions to call the API
const account = new iam.AccountPrincipal(...);
// An API service using API Gateway and Lambda Functions
const validator = new check.Validator(this, 'myValidator');
// Grant the principal permissions to validate any ruleset.
validator.grantValidateAllRulesets(account);
Then, when a principal associated with the account hits the API, their template would be checked against all available rulesets. If you want to only grant permissions to specific rulesets, it would look like this:
// Grant a principal permissions to use the validator API, and check
// their templates against "security-ruleset"
validator.grantValidatorRuleset('security-ruleset', principal)
On the user side of things, if they want to kick off a local scan and only care about a subset of rulesets they have access to, they can use the --ruleset
flag (cdk check --ruleset security-ruleset
).
Also, the idea is that the deployment of this would ideally be in a separate account, where you have a single stack that creates and manages the Validator, Rulesets, and permissions for each.
Users can also add additional Validator endpoints if an org has multiple Validators with different Well Architected Rulesets.
When you say "Aspect-style enforcement" - do you mean checks and surfacing issues, like cdk-nag does, or mutating the application in place without developer control?
I mean something along the lines of this: https://github.com/aws/aws-cdk-rfcs/issues/25#issuecomment-741892198
More generally, I just mean any solution where Construct validation is done on the Construct objects directly, during synthesis or compilation, rather than on build artifacts. I'm trying to differentiate between the (as I see it) two different problem spaces that can be addressed:
Working that the props level might work, but I can't help but notice that what we're looking for is Dependency injection.
We might want to substitute an internal call to new kms.Key()
with something like new mycompany.SecureKmsKey()
(or whatever). For that we would need some IoC container-like mechanism, and never call new kms.Key()
ever again. Instead, we would call something like:
const key = Injector.of(this).obtain(kms.Key);
We now have the ability to
This is strictly more powerful, and convenient APIs that leave the class the same but just override some props defaults could still be built on top of this.
I looked deeper into cfn-guard
, and I think that having cfn-guard validate
command add support for something like a CDKTemplate
type would be sufficient (besides the existing CFNTemplate
type added in 2.0.3 version). The CDKTemplate
type can add the CDK path to the rules evaluation output, making it easier for developers to identify the relevant construct in the source code. I opened an enhancement issue for that (see a shorter summary of the issue below). Beyond that, cfn-guard
rules should probably serve as a source of truth. Given the CDKTemplate
(or similar capability) is available, I would retract my suggestion for a need in an AWS CDK-specific policy-as-code evaluation capability. It should also play nice with CDK Pipelines when used in validation steps.
For an example, look at the following CloudFormation logical ID: APILambdaFunctionServiceRole3284E859
. The AWS CDK generates it for a resource of user management backend (an example CDK app) production environment that is deployed by a CI/CD pipeline. CDK path for that logical ID is UserManagementBackend-Pipeline/UserManagementBackend-Prod/Stateless/API/LambdaFunction/ServiceRole/Resource
. Parts before /API/...
refer to CI/CD pipeline stack, deployment stage and stateless stack, respectively. As an AWS CDK developer, I can quickly understand what source code I should look at to make changes, because I provided these construct IDs when instantiating the construct classes (see deployment.py
snippet below).
I think that can help organizations working with both raw CloudFormation and AWS CDK. It can also be a basis for a service-based solution like @GriffinMB mentioned.
What do you think?
deployment.py
class UserManagementBackend(cdk.Stage):
def __init__(
self,
scope: cdk.Construct,
id_: str,
*,
database_dynamodb_billing_mode: dynamodb.BillingMode,
api_lambda_reserved_concurrency: int,
**kwargs: Any,
):
super().__init__(scope, id_, **kwargs)
stateful = cdk.Stack(self, "Stateful")
database = Database(
stateful, "Database", dynamodb_billing_mode=database_dynamodb_billing_mode
)
stateless = cdk.Stack(self, "Stateless")
api = API(
stateless,
"API",
database_dynamodb_table_name=database.dynamodb_table.table_name,
lambda_reserved_concurrency=api_lambda_reserved_concurrency,
)
@GriffinMB thanks for sharing valuable insights on how you are approaching the problem spaces. A lot of your comments are making sense to me.
What I am confused about is the enforcement part of things.
How I currently understand it:
cdk check
command will scan and return a report on compliance with rulesetsHow is the above workflow different than the CDK Aspects?
cdk synth / cdk deploy
command will allow the Aspects to run validationThe CDK Aspects library can be developed internally at its own pace and the developers are not blocked by it.
Probably I am missing out on something or misunderstanding the intent of the discussion
On the matter of providing defaults, one can provide a custom >=L2 construct with defaults of choice and it is the matter of exposing the resources encapsulated within a construct so that the CDK escape hatches can be leveraged.
On the construct provided on an organisation level
export class MyDummyConstruct extends Construct {
public readonly role: IRole;
public readonly bucket: IBucket;
constructor(scope: Construct, id:string, props: MyDummyProps) {
super(scope, id);
this.role =. new Role(this, "Role", {... can enter the sane defaults ...})
this.bucket = new Bucket(this, "Bucket", {... can enter the sane defaults ...})
}
}
On the consumption side, making use of the escape hatch
const myDummyResource = new MyDummyConstruct(this, "DummyConstruct", {... some props ...})
// The underlying resource is accessed and behavior other than the sane default is applied
myDummyResource.bucket.applyRemovalPolicy(RemovalPolicy.DESTROY)
Additionally, the props for a construct can also be used as a mechanism to override the sane defaults.
Would be helpful to clarify my thoughts if someone can point out the drawbacks and in blind spots in this approach.
My 2 cents on using escape hatches :). Each construct library has a public API. To me, using escape hatches is like using internal implementation details of a library. These are subject to change without notice, because they are not part of the API contract. And this can break my code unexpectedly, so I prefer to avoid this.
I'd rather prefer the library to add the required functionality (e.g. expose the underlying resources through an intentional public API). If the library generates many resources (like ApplicationLoadBalancedFargateService), it will take each consumer a significant effort to understand the underlying construct tree and find the relevant resource to patch. That breaks the intent of a higher-level construct library - encapsulate the details through abstraction to save library consumers time.
I think the organizational constructs should support at least the following properties to avoid being a blocker while putting guardrails in place:
There are additional topics being discussed in this thread, unless I missed something :):
There are additional topics being discussed in this thread, unless I missed something :):
I wonder how much of this could be implemented through the current and possibly extended functionality of the cdk watch
and the AWS Toolkit for VS Code for fast feedback.
If cdk watch
added functionality to read cfn-guard
policy, it could provide fast feedback with those organization specific rules, like it already does with aspects.
The AWS Toolkit for VS Code
could read cfn-guard
policy and autofill those defaults in IaC like cdk, cfn, Terraform, etc on instantiation of the specific resources.
Closing this ticket as it is not a current priority.
(See #163 for a previous attempt at an RFC.)
Description
Allow users to specify or install some code that controls the defaults of AWS resources or applies some policy or validation at the stack or app level.
Progress