sam-goodwin / punchcard

Type-safe AWS infrastructure.
Apache License 2.0

build-time transient modules #113

Closed sam-goodwin closed 4 years ago

sam-goodwin commented 4 years ago

Hooray! This change effectively removes the entire CDK framework from the bundle after webpack. I also removed our dependency on moment-js as it was contributing 500KB (WTF) to the bundle. The impact is drastic: a punchcard bundle is down from 2MB to 100KB (32KB compressed) when using production mode, and ~600KB (175KB compressed) in development mode. This should make Punchcard's impact on cold start negligible :) @birowsky, I hope this is looking better! It should also support configuration of webpack on a per-function basis.

TypeScript 3.8 introduced a new feature, type-only imports, which lets developers explicitly declare an import as types only, meaning it is entirely erased after compilation. This was the final piece of the puzzle required to remove CDK dependencies from the global import scope entirely, allowing webpack to erase them during tree-shaking.

CDK dependencies are still dependencies (not devDependencies), but now they are only allowed to be imported as types. To actually use the CDK, you must "map into" the global CDK Build context. It is forbidden to import these libraries as values outside of a Build scope. A custom linter, @punchcard/linter (with rule name: punchcard-transient-imports), detects and auto-fixes this!

rulesDirectory:
  - '@punchcard/linter'
rules:
  punchcard-transient-imports: true

Below is a copy of CDK which lazily exposes the CDK libraries encapsulated in a Build context.

import { Build } from './build';

/**
 * Encapsulate the entire AWS CDK in a `Build` context so that it can be detached from
 * the runtime bundle.
 *
 * Users of this class should ALWAYS import CDK types as type-only or else module load errors
 * will be thrown at runtime.
 *
 * E.g.
 * ```ts
 * import type * as cdk from '@aws-cdk/core';
 * import type * as lambda from '@aws-cdk/aws-lambda';
 *
 * // instead of: import * as cdk from '@aws-cdk/core';
 *
 * // then, access the CDK via the global Build<CloudDevelopmentKit>
 * CDK.chain(({core, lambda}) => app.map(app => {
 *   const stack: cdk.Stack = new core.Stack(app, 'my-stack');
 *
 *   const fn: lambda.Function = new lambda.Function(stack, 'MyFunc', { .. });
 * }));
 * ```
 *
 * This is so the CDK infrastructure code can be erased from the runtime bundle with webpack,
 * drastically reducing the impact of the Punchcard framework on the cold-start.
 */
export const CDK: Build<CDK> = Build.lazy(() => new (CloudDevelopmentKit as any)() as CDK);

export interface CDK extends CloudDevelopmentKit {}

/**
 * Use of this class should be restricted to within a `Build` context, by mapping into the global CDK context.
 *
 * ```ts
 * CDK.map(({lambda, core}) => {
 *   // write CDK code
 * })
 * ```
 */
export class CloudDevelopmentKit {
  private constructor() {}

  public get apigateway(): typeof import('@aws-cdk/aws-apigateway') { return require('@aws-cdk/aws-apigateway'); }
  public get core(): typeof import('@aws-cdk/core') { return require('@aws-cdk/core'); }
  public get dynamodb(): typeof import('@aws-cdk/aws-dynamodb') { return require('@aws-cdk/aws-dynamodb'); }
  public get events(): typeof import('@aws-cdk/aws-events') { return require('@aws-cdk/aws-events'); }
  public get eventsTargets(): typeof import('@aws-cdk/aws-events-targets') { return require('@aws-cdk/aws-events-targets'); }
  public get glue(): typeof import('@aws-cdk/aws-glue') { return require('@aws-cdk/aws-glue'); }
  public get iam(): typeof import('@aws-cdk/aws-iam') { return require('@aws-cdk/aws-iam'); }
  public get kinesis(): typeof import('@aws-cdk/aws-kinesis') { return require('@aws-cdk/aws-kinesis'); }
  public get kms(): typeof import('@aws-cdk/aws-kms') { return require('@aws-cdk/aws-kms'); }
  public get lambda(): typeof import('@aws-cdk/aws-lambda') { return require('@aws-cdk/aws-lambda'); }
  public get lambdaEventSources(): typeof import('@aws-cdk/aws-lambda-event-sources') { return require('@aws-cdk/aws-lambda-event-sources'); }
  public get logs(): typeof import('@aws-cdk/aws-logs') { return require('@aws-cdk/aws-logs'); }
  public get s3(): typeof import('@aws-cdk/aws-s3') { return require('@aws-cdk/aws-s3'); }
  public get sns(): typeof import('@aws-cdk/aws-sns') { return require('@aws-cdk/aws-sns'); }
  public get sqs(): typeof import('@aws-cdk/aws-sqs') { return require('@aws-cdk/aws-sqs'); }
}
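
The getter-per-module layout above defers each `require` until the property is first read, so constructing the class loads nothing. A minimal sketch of that pattern follows. The `LazyModules` class and the injected `load` function (a stand-in for `require` that records what was loaded) are hypothetical and not part of Punchcard.

```typescript
// Hypothetical stand-in for `require`: any function that resolves a module id.
type Loader = (id: string) => unknown;

// Hypothetical class mirroring the getter layout above; not Punchcard's code.
class LazyModules {
  constructor(private readonly load: Loader) {}

  // Nothing is loaded at construction time; each getter loads on access.
  public get core(): unknown { return this.load('@aws-cdk/core'); }
  public get lambda(): unknown { return this.load('@aws-cdk/aws-lambda'); }
}

// Record every module id the loader is asked for.
const loaded: string[] = [];
const modules = new LazyModules(id => { loaded.push(id); return { id }; });

const loadedBeforeAccess = loaded.length; // construction alone loads nothing
void modules.lambda;                      // first access triggers the load
const loadedAfterAccess = loaded.slice(); // only the accessed module is present
```

Because only the accessed getter calls the loader, a bundle that never touches these properties at runtime never needs the underlying modules to be present.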

I have also introduced a new, very simple module: @punchcard/erasure. It simply stands as a global hook for these new "transient" modules to declare themselves when imported. E.g. @punchcard/constructs contains CDK code, so its index.ts marks itself for erasure.

export * from './delivery-stream';

import erasure = require('@punchcard/erasure');

// tell Punchcard to erase this module from the runtime bundle - it is only needed at build time.
erasure.erasePattern(/^@punchcard\/constructs$/);

Punchcard will then configure a webpack.IgnorePlugin for each globally registered regex, effectively erasing it from the runtime bundle. This self-declaration should create a hands-off experience for developers depending on "transient" packages.
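
As a rough sketch of how such a registry could feed webpack: each registered regex becomes one options object of the `{ resourceRegExp }` shape that `webpack.IgnorePlugin` accepts. The registry internals below are assumptions, since the real @punchcard/erasure implementation is not shown in this thread, and the second pattern is only there for illustration.

```typescript
// Hypothetical global registry of patterns to erase from the runtime bundle.
const erasedPatterns: RegExp[] = [];

// Called by a transient package's index.ts to mark itself for erasure.
function erasePattern(pattern: RegExp): void {
  erasedPatterns.push(pattern);
}

// One options object per pattern; each could be passed to
// `new webpack.IgnorePlugin(options)` when building the function bundle.
function ignorePluginOptions(): Array<{ resourceRegExp: RegExp }> {
  return erasedPatterns.map(resourceRegExp => ({ resourceRegExp }));
}

erasePattern(/^@punchcard\/constructs$/);
erasePattern(/^@aws-cdk\//); // illustrative pattern, not taken from the PR

const options = ignorePluginOptions();

// Helper showing which import requests the patterns would match.
const isIgnored = (request: string): boolean =>
  options.some(o => o.resourceRegExp.test(request));
```

Keeping the registry as plain data means the bundler configuration stays in one place and transient packages never need to know about webpack themselves.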

sam-goodwin commented 4 years ago

In hindsight, the erasure library is wrong. I should instead add a flag to package.json so that the linter can also detect which packages to error on. It achieves the same result. Duh.
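
A sketch of what the package.json approach might look like from the tooling side. The `punchcard.transient` field name is purely an assumption for illustration; the thread does not say what the flag would be called.

```typescript
// Hypothetical manifest shape: `punchcard.transient` is an assumed field name.
interface PackageManifest {
  name: string;
  punchcard?: { transient?: boolean };
}

// A package counts as transient only if it opts in explicitly.
function isTransient(manifest: PackageManifest): boolean {
  return manifest.punchcard !== undefined && manifest.punchcard.transient === true;
}

// Both the linter and the bundler could share this list of package names.
function transientPackageNames(manifests: PackageManifest[]): string[] {
  return manifests.filter(isTransient).map(m => m.name);
}

const transientNames = transientPackageNames([
  { name: '@punchcard/constructs', punchcard: { transient: true } },
  { name: 'punchcard' },
]);
```

The appeal over a runtime hook is that the manifest can be read statically, without importing the package.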

eladb commented 4 years ago

Wow! Love it!

Birowsky commented 4 years ago

Awesome job! Congrats on that win!

So, if I understood correctly, the CDK is gone from the lambda bundles. How about Punchcard? Could you explain in short what sort of overhead is being done by punchcard on a cold start?

Of course, I only see six extra calls here before we get to calling the lambda itself:

CDK.chain(({core, lambda}) => app.map(app => {
  const stack: cdk.Stack = new core.Stack(app, 'my-stack');
  const fn: lambda.Function = new lambda.Function(stack, 'MyFunc', { .. });
}));

But, is there anything significant behind them? (chain, map, Function) Or are they stubbed away when compiled? Or are they being called at all?

Thanx! Rock on!

sam-goodwin commented 4 years ago

Punchcard is still in there, along with any code that manipulates or creates constructs, but it's pretty insignificant and doesn't actually run (it just contributes to the code size). E.g. this code below is still in the webpacked bundle but it never runs:

scope.map(scope => new dynamodb.Table(scope, 'id', { .. }))

The CDK dependencies (@aws-cdk/aws-dynamodb and @aws-cdk/core) are entirely removed from the bundle and don't contribute to the size or cold start. I'm not sure I can remove more without getting deeper into the TSC compiler or Webpack code. I could do something really nasty like how Pulumi serializes a closure, but that has other downsides like destroying stack traces and limiting syntax.

Punchcard's impact on cold start is limited to just requiring your module, so any work that is done by requiring your index.js will be run on cold start: instantiating some classes and creating some shapes, for example. Those classes don't do anything expensive though, since most of the heavy lifting is suspended within a lazy Build context and erased. It does add some size, but I think a 32KB-160KB zip file is manageable for most cases. Whenever it is unacceptable, Punchcard does not stop you from "dropping down" and using the vanilla CDK. I expect the cost will be around the same but it'd be fun to do some testing and see just how far away from ideal we are.
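
To illustrate why mapped construct code stays in the bundle without running, here is a minimal sketch of a lazy container. `LazyBuild` is a hypothetical stand-in written for this comment, not Punchcard's actual `Build` class; it only shows that `map` and `chain` compose thunks and that nothing executes until `resolve` is called.

```typescript
// Hypothetical lazy container; an illustration, not Punchcard's Build class.
class LazyBuild<A> {
  private cached?: { value: A };

  private constructor(private readonly thunk: () => A) {}

  public static lazy<A>(thunk: () => A): LazyBuild<A> {
    return new LazyBuild(thunk);
  }

  // Compose a transformation without running anything.
  public map<B>(f: (a: A) => B): LazyBuild<B> {
    return LazyBuild.lazy(() => f(this.resolve()));
  }

  // Compose with a function that itself returns a lazy value.
  public chain<B>(f: (a: A) => LazyBuild<B>): LazyBuild<B> {
    return LazyBuild.lazy(() => f(this.resolve()).resolve());
  }

  // Runs the thunk at most once; only build-time code would call this.
  public resolve(): A {
    if (this.cached === undefined) {
      this.cached = { value: this.thunk() };
    }
    return this.cached.value;
  }
}

let constructCalls = 0;
const scope = LazyBuild.lazy(() => ({ name: 'my-stack' }));
const table = scope.map(s => { constructCalls += 1; return `${s.name}/table`; });

const callsAtModuleLoad = constructCalls; // what a cold start would observe
const builtTable = table.resolve();       // what a build-time run would trigger
```

In this sketch, a cold start pays only for allocating the closures; the bodies passed to `map` run solely when something resolves the value.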