aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

Asynchronous processing (async/await) in a construct #8273

Closed gabor-s closed 4 years ago

gabor-s commented 4 years ago

:question: General Issue

The Question

Is it possible to do asynchronous processing in a Construct? Is there anything in CDK that will wait for a promise to resolve? It seems that everything is synchronous.
I need to resolve a string value to a path, but it could take some time, so I implemented it asynchronously.

Options considered:

Aspects

The visit method is synchronous.

Tokens

The resolve method is synchronous.

Create the app asynchronously

async function createApp(): Promise<App> {
    const result = await asyncProcessing();
    const app: App = new App();
    // pass result to every stack, bit ugly
    return app;
}

Context

The same as before, but put result into the context, so don't have to pass it to every stack/construct. Honestly, I don't really like this solution, because the construct is reaching out to some known location to get a value. Like process.env calls that are scattered throughout the code.

Is there any support in CDK for asynchronous processing? Or is it an anti-pattern and I'm doing it the wrong way?

rrrix commented 4 years ago

I wrap my "main" (entrypoint) in an async function, since I have some similar needs (I pre-fetch some things from Secrets Manager).

I don't use the out of the box directory structure (what comes with cdk init).


async function main() {
  const res = await getMyAsyncThing();
  const app = new App();
  new FooStack(app, 'fooStack');
  ...
}

main();

It works great for me!

eladb commented 4 years ago

We currently do not support this and in general we consider this an anti-pattern. One of the tenets of CDK apps is that given the same source they will always produce the same output (same as a compiler). If you need to perform async operations, it means you are going to the network to consult with an external entity, which by definition means you lose determinism.

File system operations can be done synchronously in node.js, so consulting your local disk is "ok", but bear in mind that this still means that you may end up with non-deterministic outputs, which breaks some core assumptions of the framework and is considered a bad practice in the operational sense.

One direction we are considering is to open up the context provider framework (see https://github.com/aws/aws-cdk-rfcs/pull/167).

Closing for now.

sblackstone commented 3 years ago

@eladb I mostly agree with you here. But one important corner case is if you want to use node's crypto library. Those functions all use promises / callbacks.

eladb commented 3 years ago

Which functions?

sblackstone commented 3 years ago

@eladb nevermind, I erred. There are sync versions of what I thought was purely async.

Thanks!
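For reference, a minimal sketch of the synchronous variants in Node's built-in crypto module, which can be called from a constructor without await. The helper names (fingerprint, deriveKey, randomToken) and the input values are illustrative assumptions, not something from this thread.

```typescript
import { createHash, randomBytes, scryptSync } from 'crypto';

// Hashing is synchronous: the same input gives the same digest on every run.
function fingerprint(input: string): string {
  return createHash('sha256').update(input).digest('hex');
}

// scryptSync is the blocking counterpart of the callback-based scrypt.
function deriveKey(password: string, salt: string): Buffer {
  return scryptSync(password, salt, 32);
}

// randomBytes without a callback also returns synchronously, but its output
// is random, so calling it during synthesis would change the result per run.
function randomToken(): string {
  return randomBytes(16).toString('hex');
}
```

The last helper illustrates that a synchronous call is not automatically a deterministic one.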

twooster commented 3 years ago

Strong disagree here, and a bit annoyed. Sorry.

I need/want a custom asset bundler for Cloudwatch Synthetic event lambdas (due to the directory layout requirements if you have more than a single .js file among other reasons). I could do all of this with rollup in an async function inside of the context of my CDK code, but I will instead be forced to run child_process.spawnSync (as per @aws-cdk/aws-lambda-nodejs's LocalBuilder) or similar such inanity.

It's disingenuous to claim to be protecting users from themselves maybe possibly doing something that could lead to non-deterministic behavior as a means of justifying only allowing sync-flavored methods. The architecture of aws-cdk doesn't permit async, but that's a limitation of its constructor-based design, not an imperative.

eladb commented 3 years ago

Is it possible to use the new asset bundling API to run rollup within a docker container that has all the needed dependencies?

twooster commented 3 years ago

Unfortunately not, tryBundle is synchronous and expects a true/false response, not a Promise. :/ Hence the frustration -- everything is sync, even things that are practically speaking async. I think the only feasible path is to avoid the asset bundling api (and sadly miss out on the approved output directory and hashing/caching) and run async code ahead of the CDK calls and then pass in the resolved build output as a constructor parameter.

(Thank you for taking the time to respond. Sorry for the tone, just hit this wall after hours of digging into the codebase.)

eladb commented 3 years ago

Unfortunately not, tryBundle is synchronous and expects a true/false response, not a Promise. :/

You don't have to implement tryBundle. It can simply return false and then fall back to docker.

twooster commented 3 years ago

Is there any way to do this without falling back to Docker (which is a child_process.spawnSync call, IIRC)? This isn't appropriate to every build situation. I have all the tools to perform the build I desire inside of JS, as a local build, but cannot because the result of tryBundle is not awaited. I'm left with a manual synchronous spawn of node as my only option as far as I can tell.

eladb commented 3 years ago

Is there any way to do this without falling back to Docker (which is a child_process.spawnSync call, IIRC)? This isn't appropriate to every build situation. I have all the tools to perform the build I desire inside of JS, as a local build, but cannot because the result of tryBundle is not awaited. I'm left with a manual synchronous spawn of node as my only option as far as I can tell.

At the moment the only way would be to simply use spawnSync within tryBundle and invoke rollup as a child process.
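A rough sketch of that workaround, assuming the bundler is available as a command-line tool. The helper runBundlerSync, the localBundler object, and the rollup arguments are hypothetical placeholders; only the boolean return of tryBundle is taken from the discussion above.

```typescript
import { spawnSync } from 'child_process';

// Hypothetical helper: runs a command and blocks until it exits.
// Returns true on success, so the result can be returned directly from a
// synchronous tryBundle-style method that expects a boolean.
function runBundlerSync(command: string, args: string[], cwd: string = process.cwd()): boolean {
  const result = spawnSync(command, args, { cwd, stdio: 'inherit' });
  if (result.error) {
    // The command could not be started (for example, it is not installed).
    return false;
  }
  return result.status === 0;
}

// Hypothetical local bundler; the rollup arguments are placeholders.
const localBundler = {
  tryBundle(outputDir: string): boolean {
    return runBundlerSync('npx', ['rollup', '-c', '--dir', outputDir]);
  },
};
```

Returning false when the command is missing or fails leaves room for a fallback, which matches the true/false contract described earlier in the thread.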

jncornett commented 3 years ago

Hmm, what about context providers? Are you telling me that *.fromLookup(...) will also not go over the network?

njsnx commented 3 years ago

Hmm, what about context providers? Are you telling me that *.fromLookup(...) will also not go over the network?

this was my thought - I think limiting CDK to sync activities is an odd thing to impose. I have a few use cases where I want to use CDK to generate CloudFormation/Terraform based on data in an API or from another source.

For me, the main benefit of CDK is to be able to use "real code" to create infra and creating constructs that can do some of the heavy lifting for me when it comes to configuration - I can do this with Python because most of it is synchronous but was disappointed to find this being a limitation when using CDK with Typescript.

Feels like something CDK should support overall and let users decide if it's an anti-pattern or not based on their requirements, workflows and use cases.

skyrpex commented 3 years ago

I've been using async code for the CDK in typescript for a while (because I wanted to use rollup to bundle my lambdas), but in order to do so I had to stay away from the CDK-way which is defining your code using the class constructs. I used async factories instead. Just a simple example:

// Instead of the following
export class MyConstruct extends cdk.Construct {
    public readonly lambda: lambda.Function;

    constructor(scope: cdk.Construct, id: string, props: MyConstructProps) {
        // cdk.Construct takes only (scope, id); props are consumed below.
        super(scope, id);
        this.lambda = new lambda.Function(this, "MyLambda", {
            code: ...
        });
    }
}

// I just did
export async function createMyConstruct(scope: cdk.Construct, id: string, props: MyConstructProps) {
    const construct = new cdk.Construct(scope, id);
    // Named `fn` so it does not shadow the imported `lambda` module.
    const fn = new lambda.Function(construct, "MyLambda", {
        code: await bundleCodeWithRollup(...),
    });
    return {
        construct,
        lambda: fn,
    };
}

AFAIK, there shouldn't be any technical issues ever if you do it like this.

girotomas commented 2 years ago

One of the big problems in my opinion is that the AWS SDK is asynchronous, so it's not compatible with the CDK.

mikestopcontinues commented 2 years ago

@girotomas This is my use-case exactly.

Because of AWS account quotas, I need to use sub-accounts to scale. So when I want to update my stacks, I need to check current resource usage (dynamo), perhaps create a new account (sdk), and create/update stacks across the fleet.

It's still deterministic. Given a certain state of the database, a certain output is achieved.

eladb commented 2 years ago

When we say "deterministic" in this context we mean that a commit in your repository will always produce the same CDK output. This is a common invariant for compilers and build systems and this is where the CDK tenet comes from. If you consult an external database during synthesis, this invariant may break, depending on the contents of your database.

What I would recommend to do is to write a little program/script that queries your database, creates any accounts needed and then writes some JSON file with a data model that can be read by your CDK app during synthesis. This file will be committed to source control, which means that if I clone the repo and run cdk synth I will always get the same output.

To keep this file up-to-date you can create a simple scheduled task that simply runs this script and commits the change to your repo. Very easy to do with something like GitHub workflows. This commit will trigger your CI/CD, and your CDK app will be resynthed accordingly.
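A minimal sketch of such a refresh script, under stated assumptions: the AccountModel shape, the refreshModel name, and the fetchModel callback are all hypothetical and stand in for whatever queries the script actually needs. The CDK app would then read the file with a plain synchronous read.

```typescript
import * as fs from 'fs';

// Hypothetical data model; the fields are placeholders.
interface AccountModel {
  accounts: { id: string; region: string }[];
}

// Runs outside synthesis (for example in a scheduled job). `fetchModel`
// stands in for the async database or SDK queries.
async function refreshModel(file: string, fetchModel: () => Promise<AccountModel>): Promise<void> {
  const model = await fetchModel();
  // Sort with a locale-independent comparison so the committed file only
  // changes when the underlying data changes.
  const accounts = [...model.accounts].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  fs.writeFileSync(file, JSON.stringify({ accounts }, null, 2) + '\n');
}
```

Sorting before writing keeps the diffs in source control small, so a commit appears only when the external state really changed.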

mikestopcontinues commented 2 years ago

@eladb Thanks. I'll head in that direction. I found that querying and caching the data seems to be the best way to build CDK apps anyway. As I code my way through the different triggers (user signup, user domain registration/transfer, bulk stack update, etc), I'm finding the natural boundaries seem to line up nicely with the cache method. If a nightly job checks for aws account creation, it can easily also commit the metadata to the repo.

eladb commented 2 years ago

100% agree. Our experience shows that this pattern works pretty well and helps maintaining healthy architectural boundaries.

mrgrain commented 2 years ago

spawnSync doesn't easily allow for more complex invocation scenarios, like using esbuild's plugin API. We effectively force users to deal with two different entry points to their cdk app.

Aside from that, not supporting async locks out other patterns like worker-threads or wasm.

I think we are thinking too small here. There are many valid asynchronous use cases, that are still deterministic.

hariseldon78 commented 2 years ago

I am thinking about storing the git commit signature in a system parameter at every deploy, which is an async command. Do you think this would also break the CDK assumptions? I could do that with the AWS CLI and a post-deploy hook, but that could break if some colleague doesn't have the CLI installed, or is using some strange operating system that has no bash (cough... win cough.. dows).

rnag commented 2 years ago

I just wanted to add that my use case is very similar to what @girotomas mentioned. As a specific example, in my stack I set up an API gateway and secure it behind an API key, however I wanted to include the API key value in the stack outputs so that it is easier for developers to use. I realized this was not possible, so the simplest (and most cost-effective) solution was to use the AWS SDK to auto-generate an API key and store it in Parameter Store. In my CDK script, I essentially have logic to either retrieve the API key value from this parameter, or else auto-generate a value and create the parameter if it doesn't exist. This allows me to retain the same API key value for a stack, and also populate a stack output with the value for the API key.

The one downside is as mentioned, the AWS SDKs all seem to be asynchronous so I'd need to use the await keyword and call an async method after creating the Stack construct. For this use case the solution as suggested by @rrrix worked out great for me.

Negan1911 commented 2 years ago

@eladb I don't think that we should keep this sync, it's just nonsense. You're internally using spawnSync to execute arbitrary commands and even to do HTTP requests as the async PR says, so you've already broken the "deterministic" rule.

Being deterministic or not is not a matter of being async or not, and by not being async you've already broken a bunch of valid use cases with bundlers like rollup or webpack. Even if node has sync fs operations, node itself is not sync, and most of those bundlers do not export sync APIs for a reason.

As I said, it's just nonsense, because route53.fromLookup already requires doing some requests, which means it's not deterministic, in the same way as the HTTP requests with spawnSync you mention. So if you are circumventing your own rules about "deterministic" builds, I don't think that keeping constructs sync is a good argument.

revmischa commented 2 years ago

I really would love to build my lambda functions in parallel because my deployments are getting very time consuming. I have an AppSync API with many resolver lambda functions.

Why can't builds be deterministic with async asset bundling? For example could not a Construct say, override some base function like override prepareAsset(): Promise<Code> => {...}, then CDK waits until all promises have succeeded before completing the build. What's not deterministic about that?
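To illustrate the point, a small sketch of waiting for several builds at once. The BuildTask type and buildAll function are hypothetical and not part of any CDK API; the build callbacks stand in for a bundler's async interface.

```typescript
// Hypothetical build step; in practice `build` would call a bundler's async API
// and resolve to the output location.
type BuildTask = { name: string; build: () => Promise<string> };

// Starts every build at once and waits for all of them. Promise.all returns
// results in input order, regardless of which build finishes first.
async function buildAll(tasks: BuildTask[]): Promise<Record<string, string>> {
  const outputs = await Promise.all(tasks.map((task) => task.build()));
  const byName: Record<string, string> = {};
  tasks.forEach((task, i) => {
    byName[task.name] = outputs[i];
  });
  return byName;
}
```

Because the results are keyed by task rather than by completion order, the timing of individual builds does not leak into the output.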

shishkin commented 2 years ago

I used async factories instead.

@skyrpex Does your approach imply that you have to stick to it consistently throughout all your CDK code? Or at least from the top entry point down to where you need to use await? And what would you suggest for when the construct class used to have a getter property exposed?

skyrpex commented 2 years ago

I used async factories instead.

@skyrpex Does your approach imply that you have to stick to it consistently throughout all your CDK code? Or at least from the top entry point down to where you need to use await? And what would you suggest for when the construct class used to have a getter property exposed?

You can use CDK classes within your async code, but can't use async code within CDK classes (ie, within class constructors). Regarding exposing things, you can choose what to return from your async factories: objects, getters, methods...
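One way to keep a class with readable properties while still awaiting something first is a static async factory. The sketch below uses a stand-in class, BundledFunction, which is an assumption for illustration and not a CDK type, so that it runs without dependencies.

```typescript
// Stand-in for a construct-like class; not a CDK type.
class BundledFunction {
  // The constructor stays synchronous and only receives resolved values.
  private constructor(
    public readonly id: string,
    public readonly codePath: string,
  ) {}

  // Async factory: does the awaiting, then calls the synchronous constructor.
  static async create(id: string, bundle: () => Promise<string>): Promise<BundledFunction> {
    const codePath = await bundle();
    return new BundledFunction(id, codePath);
  }
}

// Callers must themselves be async, all the way up to the entry point.
async function main(): Promise<BundledFunction> {
  return BundledFunction.create('MyLambda', async () => 'dist/my-lambda.js');
}
```

The trade-off is the one described above: everything between the entry point and the factory call has to be async as well.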

moali87 commented 2 years ago

@eladb I have to agree with @Negan1911 I'm currently looking for a way to use http requests in CDK. In my use-case, I'm using AWS network firewall, and dynamically populating the rules (one time action). The rules are IP's gathered from github meta API which returns public IP's.

eladb commented 2 years ago

@eladb I have to agree with @Negan1911 I'm currently looking for a way to use http requests in CDK. In my use-case, I'm using AWS network firewall, and dynamically populating the rules (one time action). The rules are IP's gathered from github meta API which returns public IP's.

We've actually had a similar use case in construct hub. The way we solved this is by creating a GitHub workflow which performs the HTTP request and commits a set of JSON files into our repo with the result. Then, our CDK app simply reads the files during construction (fs.readFileSync()).

This would be my recommended pattern for considering external inputs during synthesis. It maintains the invariant that says that every change in the app is always a result of a commit to the repository. There's a clear trace line between every commit and the state of the system, you can always go back to a previous version, there's an audit trail, etc.

Hope this helps.
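A minimal sketch of the reading side, assuming a committed JSON file with a "cidrs" array. The file layout and the loadAllowList name are hypothetical; only the use of fs.readFileSync comes from the comment above.

```typescript
import * as fs from 'fs';

// Hypothetical file layout: { "cidrs": ["192.0.2.0/24", ...] }
function loadAllowList(file: string): string[] {
  const parsed: unknown = JSON.parse(fs.readFileSync(file, 'utf8'));
  const cidrs = (parsed as { cidrs?: unknown } | null)?.cidrs;
  if (!Array.isArray(cidrs) || !cidrs.every((c) => typeof c === 'string')) {
    throw new Error(`${file}: expected a "cidrs" array of strings`);
  }
  // Sort and de-duplicate so rule order does not depend on the order in
  // which the upstream API happened to return the entries.
  return Array.from(new Set(cidrs as string[])).sort();
}
```

Validating the file at synthesis time makes a bad commit fail early instead of producing a broken template.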

shishkin commented 2 years ago

@eladb Thanks for posting the links to construct hub workflows. I'm not sure that level of accidental complexity and moving parts is justified to something that otherwise could be as easy as calling an async function. I would definitely disagree it being a recommended pattern, maybe a workaround at best.

Overall, the CDK's noble ideal of guaranteeing reproducibility and an audit trail is not achievable in the general case. With the existing sync escape hatches, users are free to shoot themselves in the foot. Except they can't do it in a clean and idiomatic way. Why be so patronizing about async?

Negan1911 commented 2 years ago

@eladb I don't see why then async or not async would make the difference... having the ability to use async functions does not mean that something will be invariant or not regarding changes, it's just a different way of pulling that data.

I have another use case for you, I need to compile a lambda with webpack, and that compilation happens in an async manner, so I'm respecting the rule to be invariant about changes (anyway, that's your rule and don't see why you should enforce it to other users), but even if I'm respecting that I'm not allowed to work around it.

Or, in any other case, as with your spawnSync recommendation, you're basically allowed to break said rules without any consequence, and CDK will support doing so.

What I'm trying to say is that having deterministic builds is not tied to support sync or async operations. And this library job is to let us code our infrastructure, not to take technical decisions for us. You're just making everyone jobs more difficult for no particular reason at all.

eladb commented 2 years ago

@Negan1911 I agree. The async constraint in JavaScript is not a fool proof way to encourage deterministic builds, and as you said, if you really need to perform network calls or other async activities in JavaScript, you can always spawn a child process synchronously to work around this limitation.

Having said that, I think this thread is a testament that this constraint is actually a good way to signal to developers that maybe they are trying to do something that contradicts some of the core assumptions of the framework.

The IP allow list above is a good example - performing an HTTP request during synthesis means that build output can change without any change to the source, but committing a JSON file to your source control is a reasonable way to avoid this and get all the benefits of a deterministic build. The async constraint was the reason we even had that conversation :-)

Negan1911 commented 2 years ago

if you really need to perform network calls or other async activities in JavaScript, you can always spawn a child process synchronously to work around this limitation.

Why would we then have to do that complex thing only to do something that can be resolved easily by using async?

Having said that, I think this thread is a testament that this constraint is actually a good way to signal to developers that maybe they are trying to do something that contradicts some of the core assumptions of the framework.

I don't think so, this issue is the living proof of valid use cases where we don't break those rules but we need async functionality, and the answer is "screw you, implement your own child process because no particular valid reason".

The IP allow list above is a good example - performing an HTTP request during synthesis means that build output can change without any change to the source, but committing a JSON file to your source control is a reasonable way to avoid this and get all the benefits of a deterministic build.

Yeah, only one good example. I repeated twice a good example where you need async (webpack bundling) and the builds are deterministic, and there's no way of achieving that with this CDK constraint.

The async constraint was the reason we even had that conversation :-)

Eh, yes, because there are multiple ways of circumventing it and it doesn't give any particular value besides being a PITA to work with.

moali87 commented 2 years ago

We've actually had a similar use case in construct hub. The way we solved this is by creating a GitHub workflow which performs the HTTP request and commits a set of JSON files into our repo with the result. Then, our CDK app simply reads the files during construction (fs.readFileSync()).

This would be my recommended pattern for considering external inputs during synthesis. It maintains the invariant that says that every change in the app is always a result of a commit to the repository. There's a clear trace line between every commit and the state of the system, you can always go back to a previous version, there's an audit trail, etc.

Hope this helps.

To give a bit more context, I was looking to create a pattern where the user can request a domain within a given selection, and the request would pull up the public IP's. Those public IP's can then be used in a rule group(s) for a network firewall.

However, since most/all request type libraries are async, I would need to handle it in async fashion or use promise returns. Promise returns are a bit too confining since I would also need to create the resources within the promise return method (.then). This would mean I cannot have one rule group to allow multiple public endpoints.

RichiCoder1 commented 2 years ago

It's sort of worth noting that this thread has actually called out the escape hatch multiple times, it's just not exposed. Almost all the fromLookup and other context-based calls essentially shell out to some other mechanism and then store the result in context, which then can be committed to allow only the first run to be non-deterministic.

However, this is currently an internal-only mechanism. I think the ability to create user-defined lookups would scratch an itch needed by many in this thread, while also meeting the spirit the CDK team is going for here. It's also arguably more accessible and grokable than having a separate script or job that generates this contextual information.
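The caching idea can be sketched in a few lines. This is an illustration only and not the CDK's internal mechanism: the CDK stores its own lookup results in cdk.context.json, whereas lookupCached, its cache file, and the provider callback here are hypothetical.

```typescript
import * as fs from 'fs';

// Hypothetical helper: returns the cached value for `key` if the cache file
// has one; otherwise runs `provider` once and records the result.
function lookupCached(cacheFile: string, key: string, provider: () => string): string {
  const cache: Record<string, string> = fs.existsSync(cacheFile)
    ? JSON.parse(fs.readFileSync(cacheFile, 'utf8'))
    : {};
  if (Object.prototype.hasOwnProperty.call(cache, key)) {
    return cache[key];
  }
  // First run only: this is the non-deterministic step.
  const value = provider();
  cache[key] = value;
  fs.writeFileSync(cacheFile, JSON.stringify(cache, null, 2) + '\n');
  return value;
}
```

Once the cache file is committed, later runs return the recorded value and never call the provider, so only the first run depends on the outside world.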

revmischa commented 2 years ago

I just would like to build my large collection of lambda functions in parallel. However we get there is an implementation detail, I suggested one possible implementation above:

Why can't builds be deterministic with async asset bundling? For example could not a Construct say, override some base function like override prepareAsset(): Promise<Code> => {...}, then CDK waits until all promises have succeeded before completing the build. What's not deterministic about that?

DJAlPee commented 2 years ago

I ran into this issue, when using CDK for Terraform, so I assume there is some kind of relation...

In Terraform there is the concept of "data sources", which fetch data that is used during the rollout phase of the infrastructure. It is roughly comparable to using a reference to SSM/SecretsManager in CloudFormation. Even when the synthesized template (CloudFormation / Terraform) is created deterministically, the rolled out infrastructure could look different!

In our company we have a similar discussion in a different area: build our frontends before each deployment, or build the frontends once and (re-)use the generated artifact for the deployment. Because of lock files and our private NPM cache/repository, our builds are so "deterministic" that there is no difference compared to using a pre-built artifact instead.

For CDK, I would assume the synthesized templates are the "artifacts". When storing these artifacts in some store, you will also get reproducible deployments. In the very end, the deployment itself has to be deterministic and there are multiple ways to achieve this. A deterministic build/synth is one possible solution, it doesn't have to be the only one...

Just my 2 cents... sorry for that 😄

sam-goodwin commented 2 years ago

Deterministic asynchronous code is possible. Non-deterministic synchronous code is possible. This argument doesn't make sense imo.

biinniit commented 2 years ago

Math.random()

delprofundo commented 1 year ago

I think we can all agree we are all worse off for this discussion. No one more than the cdk itself.

revmischa commented 1 year ago

I'd like to point out that Serverless Stack has supported CDK parallel builds and async constructs (well, stacks) for a few versions now: https://github.com/serverless-stack/sst/releases/tag/v1.11.1

steffenstolze commented 1 year ago

I was facing a problem where I needed to retain an API Gateway custom domain, since I didn't want to reconfigure the DNS records of the domain (which is not managed by AWS) whenever I redeploy the CDK app.

Since CDK gave me a "Domain name already exists" error when I wanted to redeploy the stack I had the choice

import { Stack, StackProps, Duration, CfnOutput, RemovalPolicy } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { Certificate } from 'aws-cdk-lib/aws-certificatemanager';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';
import { Runtime } from 'aws-cdk-lib/aws-lambda';
import * as path from 'path';
import { EndpointType } from 'aws-cdk-lib/aws-apigateway';
import * as sdk from 'aws-sdk';

export class PlatformApiStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
        super(scope, id, props);

        // Wrapping the whole stack in an async function is considered anti-pattern.
        // The alternative would be a Lambda backed CustomResource where you'd have to deal with resource creation
        // using AWS SDK functions. Since this solution here works fine, why bother with the overhead.
        (async () => {
            // API Gateway
            const api = new apigateway.RestApi(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_api`, {
                description: 'Customer API',
                deployOptions: {
                    stageName: `${process.env.STACK_ENV}`
                },
                endpointConfiguration: { types: [EndpointType.REGIONAL] },
                // enable CORS (TODO: harden it later)
                defaultCorsPreflightOptions: {
                    allowHeaders: ['Content-Type', 'X-Amz-Date', 'Authorization', 'X-Api-Key'],
                    allowMethods: ['OPTIONS', 'GET', 'POST', 'PUT', 'PATCH', 'DELETE'],
                    allowCredentials: true,
                    allowOrigins: ['*']
                }
            });

            let apiGatewayDomainExists = false;
            const sdk_apigw = new sdk.APIGateway();
            try {
                await sdk_apigw.getDomainName({ domainName: `${process.env.DOMAIN}` }).promise();
                apiGatewayDomainExists = true;
                console.log(`API Gateway custom domain "${process.env.DOMAIN}" does exist and will NOT be created.`);
            } catch (error) {
                console.log(`API Gateway custom domain "${process.env.DOMAIN}" does not exist and will be created.`);
            }

            if (!apiGatewayDomainExists) {
                const domainName = new apigateway.DomainName(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_domain`, {
                    domainName: `${process.env.DOMAIN}`,
                    certificate: Certificate.fromCertificateArn(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_cert`, `${process.env.AWS_ACM_CERT_ARN}`),
                    endpointType: EndpointType.REGIONAL,
                    mapping: api
                });
                domainName.applyRemovalPolicy(RemovalPolicy.RETAIN);
                new CfnOutput(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_api_gateway_domain_name`, { value: domainName.domainNameAliasDomainName });
            }

            //.... all the other stuff
        })();
    }
}

This works as expected and only creates the DomainName resource if it doesn't exist already.

Why would I bother using a CustomResource for this and are there other ways to achieve this goal?

Thanks 🙂

adam-nielsen commented 1 year ago

I just ran into an issue where I was using Fn.importValue() to import a value from another stack's output, but I didn't realise this meant the source stack would be frozen and could no longer be updated if that output ever needs to change. I don't really understand the point of this (why bother exporting the value if you can't ever change it) but anyway one workaround from #17475 apparently is to call the AWS SDK to load the stack's output value, then pass that through as a string. That way you get the output value without freezing the source stack.

Of course that involves awaiting on an SDK call while the stack is being constructed, which is not possible and led me here. In the end I followed the above example's recommendation and put all the values I needed into the stack's props, requiring the caller to look up the right values and pass them in to the Stack instance constructor. Since I could do async calls in CDK's bin/*.ts file, I did the lookups there and passed the values into the Stack through the props. Definitely less than ideal but it does work for simple use cases like mine, and I only mention it as one further reason for needing to call async functions in a CDK stack - to work around CloudFormation limitations.
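The shape of that workaround, sketched with stand-in classes so it runs without the CDK installed. AppLike, StackLike, ServiceStackProps, and lookupEndpoint are all hypothetical names; in a real project the first two would be the CDK's App and a Stack subclass, and the lookup would be the SDK call.

```typescript
// Stand-ins for the CDK's App and Stack so the sketch has no dependencies.
class AppLike {
  readonly stacks: StackLike[] = [];
}

interface ServiceStackProps {
  readonly upstreamEndpoint: string;
}

class StackLike {
  readonly endpoint: string;

  // Synchronous constructor: it only consumes values resolved beforehand.
  constructor(app: AppLike, readonly id: string, props: ServiceStackProps) {
    this.endpoint = props.upstreamEndpoint;
    app.stacks.push(this);
  }
}

// Hypothetical async lookup, e.g. an SDK call reading another stack's output.
async function lookupEndpoint(): Promise<string> {
  return 'https://example.invalid/api';
}

// Entry point (the bin/*.ts file in a CDK project): await first, construct second.
async function main(): Promise<AppLike> {
  const upstreamEndpoint = await lookupEndpoint();
  const app = new AppLike();
  new StackLike(app, 'ServiceStack', { upstreamEndpoint });
  return app;
}
```

All awaiting happens before the first constructor call, so the constructors themselves stay synchronous.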

erhhung commented 1 year ago

Having read all the use cases mentioned in this thread, I have another use case that required me to want to await an async operation: dynamically importing arbitrary TypeScript modules containing Construct classes based on some context value passed to cdk synth, with which the main stack will instantiate those dynamically imported Construct classes to augment the stack.

Those dynamically imported modules are committed to Git but in separate projects: in fact, I have a generic, reusable CDK project that creates a sizable "standard" stack, and gets included as a Git submodule in other projects, some of which need to add minor extensions to that standard stack, hence needing to add some unique CDK code that have no business being included in the generic CDK project just so it can be conditionally invoked based on a config value.

I think my use case is "deterministic" in every sense as all code is committed in Git. I could publish the standard stack as an NPM package, but then I'd have to have a full CDK project with all its extra "boilerplate" in all my concrete projects instead of just adding a submodule and occasionally providing a single .ts file to be dynamically imported.

Kilowhisky commented 1 year ago

One could argue that defining an entire stack inside a constructor is an anti-pattern. Constructors are for building bare resources or dependency injection, not complex if/else logic or other things we are forced to do.

Anyways, something as simple as async construct() would make so many lives easier and moreover be less of an anti-pattern than it is currently.

If we had async ability we could do cool things like building libs/deps or projects on demand, as opposed to right now where I have to remember to build my project (.NET) first before I can deploy it.

For example

import { exec } from 'child_process';

function buildDotNetRelease(csproj: string): Promise<void> {
    return new Promise((resolve, reject) => {
        const publish = exec(`dotnet publish ${csproj} -c Release`);

        // 'close' fires once the process has ended and its stdio streams are closed.
        publish.on('close', code =>
            code === 0 ? resolve() : reject(new Error(`dotnet publish exited with code ${code}`)));
        publish.on('error', err => reject(err));
    })
}

adam-nielsen commented 1 year ago

@Kilowhisky CDK can do what you want, but you approach it differently. You need to use something like DockerImageAsset to have your project built inside Docker, and then that is what gets deployed.

Async would definitely be good but I can see their point. They are forced to work within the limitations of CloudFormation. It would be better in many ways to scrap CloudFormation and have CDK issue all the API calls directly, as that would allow working around many of CloudFormation's limitations.

At the moment a suitable workaround is putting your async code in the bin/ file, and then passing the result down to the stack via the props. But for actually building projects, there are already constructs within CDK that can achieve it without async as it is a common requirement for most projects.

DJAlPee commented 1 year ago

[...] It would be better in many ways to scrap CloudFormation and have CDK issue all the API calls directly, as that would allow working around many of CloudFormation's limitations.

Have a look at CDK for Terraform 😉 Unfortunately, it has the same limitation regarding (a)sync...

At the time CDK was "unique" (CloudFormation only), this architectural decision did not have as big an impact as it does nowadays with all these new "targets" (Terraform, Kubernetes and even repositories with Projen).

Samrose-Ahmed commented 1 year ago

You can't even generate a zip file in NodeJS synchronously, this is a ridiculous limitation!

rrrix commented 1 year ago

I've been reading emails for this thread for nearly 3 years. So before I unsubscribe, I thought I'd share my thoughts and research on the topic, given that this issue was closed with failed reasoning (logical fallacy):

We currently do not support this and in general we consider this an anti-pattern. One of the tenets of CDK apps is that given the same source they will always produce the same output (same as a compiler). If you need to perform async operations, it means you are going to the network to consult with an external entity, which by definition means you lose determinism.

(Emphasis mine).

Unfortunately this logic fails even the most trivial of thought experiments. The failure here is assuming asynchronous software is by definition non-deterministic, synchronous software is by definition deterministic, and that the only use case for asynchronous software is networking or database queries or other non-deterministic tasks. Software concurrency is unrelated to the determinism of the resulting output of an application. If you concurrently add up a sum of N numbers, the result is the same as if you did it synchronously. If you synchronously add up the sum of R random numbers (generated synchronously), the output is still random.

If deterministic concurrency was impossible, languages such as Go (with goroutines) and the async/await programming paradigm would not exist. Modern software can have both determinism and concurrency! 🍰
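The thought experiment can be written down directly. A small sketch, with the function names concurrentSum and randomSum chosen here for illustration: the first is asynchronous but always gives the same answer, the second is synchronous but does not.

```typescript
// Sums each chunk in its own async task, then combines the partial sums.
async function concurrentSum(values: number[], chunkSize: number): Promise<number> {
  if (chunkSize < 1) {
    throw new RangeError('chunkSize must be at least 1');
  }
  const chunks: number[][] = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    chunks.push(values.slice(i, i + chunkSize));
  }
  const partials = await Promise.all(
    chunks.map(async (chunk) => chunk.reduce((acc, v) => acc + v, 0)),
  );
  return partials.reduce((acc, v) => acc + v, 0);
}

// Synchronous, yet the result differs from run to run.
function randomSum(count: number): number {
  let total = 0;
  for (let i = 0; i < count; i++) {
    total += Math.random();
  }
  return total;
}
```

For integer inputs the chunked sum is identical to a plain loop over the same values, however the chunks are scheduled.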

Trying to protect your users from themselves and forcing unpopular arbitrary rules that fit an outdated/misinformed perspective will always result in your users finding a different (and often better) way to do what they want, how they want to do it - usually via your competitors.

I was an early adopter and promoter of the AWS CDK. Sadly, this is one of several unfortunate architectural design decisions which has resulted in my company (and myself) abandoning the AWS CDK entirely. I'm sorry if it sounds harsh, but I believe honest feedback is important.

steffenstolze commented 1 year ago

@rrrix I also thought AWS CDK would be the tool. More and more I discovered that there are so many limitations - often due to CloudFormation and I didn't want to always create custom resources and write SDK code. Recent problem has been creating multiple GSIs on a DynamoDB table. Not possible with CDK.

In the end I've found that Terraform, due to its under-the-hood usage of the AWS SDK, gets things done easier and quicker for my use cases.

DJAlPee commented 1 year ago

I was an early adopter and promoter of the AWS CDK. Sadly, this is one of several unfortunate architectural design decisions which has resulted in my company (and myself) abandoning the AWS CDK entirely. I'm sorry if it sounds harsh, but I believe honest feedback is important.

@rrrix Thanks for your clear and on-point statement! In which directions are you (and/or your company) looking now?