aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.71k stars 3.94k forks source link

L1s: properties are never typed anymore after being emitted as `any` once #32285

Open rix0rrr opened 4 days ago

rix0rrr commented 4 days ago

The problem

Casing

Before we get into the issue, you first need to know the main difference between any-typed JSON data, and property typed data: casing!

All properties in CloudFormation are PascalCase, but in most programming languages identifiers have different casing: camelCasing in TypeScript, snake_casing in Python, etc. To make identifiers look natural, CDK translates the casing of identifiers when generating classes:

CloudFormation:

Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: 'my-bucket'  # <-- Notice the capital
new Bucket(this, 'MyBucket', {
  bucketName: 'my-bucket'   // <-- Notice the lowercase
});

Recasing rules are complex enough that we can't consistently recase without prior knowledge: we need to know about exceptions like MBs, IP, some services actually use camelCase properties in CloudFormation as well, etc. That means that recasing rules are computed at L1 code generation time; they cannot be reliably done at runtime.

This is all great, and we can do this if we know about the CloudFormation properties. However, sometimes properties in CloudFormation are typed as "arbitrary data" (colloquially called "JSON", even though it isn't really JSON because JSON is a string encoding 🤓). If the type in the schema is JSON, then we can't know the properties inside, and so we can't recase them.

So if we have:

interface CfnLongJobProps {
  /**
   * The type of this property was: 
   *
   *    { TimeoutConfig: { Type: 'JSON' } }
   *
   * In the schema.
   */
  timeoutConfig?: any;
}

new CfnLongJob(this, 'LongJob', {
  timeoutConfig: {
    minutes: 15,      // Should this be emitted as `Minutes: 15` or `minutes: 15` ? We can't know!
  }
});

The upshot is that if a property is typed as any, we just have to emit it as-is, and we have to leave the casing up to the customer.

So they have to write:

new CfnLongJob(this, 'LongJob', {
  timeoutConfig: {
    Minutes: 15,      // Bit jarring dissonance between camelCase and PascalCase, but that's what it is

    // Also we have the opportunity to make typos, but that's what you get with a JSON type
  }
});

Late typing additions

In the examples that follow I will be using a pseudo-version of CloudFormation Specification instead of the newer CloudFormation Registry Schema, simply because it's easier to write and read. The Specification is deprecated and we will be using the Schema going forward, but the same discussion applies in the exact same way, just in a different schema language.

Everything works, but now comes a twist: a service team notices that they are typing a property as "JSON", and decide they can do better: they know what properties that field accepts, and they decide to add a type for it to the schema to help customers catch typing mistakes. So they change the schema:

// Old
{
  Resources: {
    LongJob: {
      Properties: {
        TimeoutConfig: { Type: 'Json' }
      }
    }
  }
}

// New
{
  PropertyTypes: {
    TimeoutConfig: {
      Properties: {
        Minutes: { Type: 'Number' }
      }
    }
  },
  Resources: {
    LongJob: {
      Properties: {
        TimeoutConfig: { Type: 'TimeoutConfig' }  // <-- this changed
      }
    }
  }
}

This should be strictly better for customers: data that used to be valid is still valid, and data that used to be incorrect is now caught.

Except the first part of that (data that used to be valid is still valid) is true for CFN templates but not for CDK Code, and the reason is recasing.

Recall, a customer had written the following to correctly emit the Minutes field:

new CfnLongJob(this, 'LongJob', {
  timeoutConfig: {
    Minutes: 15,
  }
});

But if we added the field typing, the underlying class definition would now look like this:

interface TimeoutConfigProperty {
  minutes?: number;
}
interface CfnLongJobProps {
  timeoutConfig?: TimeoutConfigProperty;
}

new CfnLongJob(this, 'LongJob', {
  timeoutConfig: {
    Minutes: 15,  // <-- No such property! Did you mean 'minutes' ?
  }
});

Adding types after the fact would break existing user code!

The current state

Which is why we don't do it. If a type has ever been typed as JSON, and therefore been emitted as any, it must forever be emitted as any, and CDK customers cannot get the benefit of the added types!

That's obviously not ideal, and regularly causes customer confusion.

The (proposed) solution

Even though we can't change the types of fields we've emitted in a backwards breaking way, there is another avenue available to us: adding new fields and deprecating old ones.

Once the additional types of LongJob become available to us, we could start emitting its properties like this:

interface CfnLongJobProps {

  /** 
   * @deprecated Use `timeoutConfigV2` instead.
   * @default - At most one of timeoutConfig and timeoutConfigV2 may be used
   */
  timeoutConfig?: any;

  /**
   * @default - At most one of timeoutConfig and timeoutConfigV2 may be used
   */
  timeoutConfigV2?: TimeoutConfigProperty;
}

That way, we both don't break existing code, as well as give new users the opportunity to take advantage of the better types.

The mechanism to get here

The awscdk-service-spec has a PreviousTypes field that can be used for this. For properties whose types have gone through an evolution, it will have a history. The service spec data will look roughly like this:

// Not exact data model, but to give you an idea
{
  Resources: {
    LongJob: {
      Properties: {
        TimeoutConfig: { 
          Type: 'TimeoutConfig',
          PreviousTypes: ['Json'],
        } 
      }
    }
  }
}

What the L1 generator currently does is always use the very oldest PreviousType to generate a field's property... but we could also be generating a field for every type.

Those fields would obviously have to be mutually exclusive, and at rendering time we would pick the one that's supplied (if any), in code that looks somewhat like this:

// This code is copy/pasted from a different example that I had prepared. Sorry 😅 
function DashboardPropsToCloudFormation(props: DashboardProps) {
  if ((props.dashboardDefinition !== undefined) == (props.dashboardDefinitionV2 !== undefined)) {
    throw new Error('Supply exactly one of dashboardDefinition and dashboardDefinitionV2');
  }

  return {
    DashboardDefinition: props.dashboardDefinitionV2
      ? MyDashboardTypeToCloudFormation(props.dashboardDefinitionV2)
      : dashboardDefinition,
  };
}

There might be some subtle details to keep in mind while rolling this out. For example, a required field becomes optional, and we have to validate using runtime checks that exactly one (instead of at most one) of all the variants is supplied. Those issues should become self-evident once we start implementing this.

rix0rrr commented 4 days ago

Examples of problem reports caused by this issue:

mrgrain commented 4 days ago

There are also some options to make the original type easier to use for at least some of our users.

  1. Turn the emitted type into a union: any | TimeoutConfigProperty. This doesn't really help with types, but at least indicates to the user what type to use. We can do this, because in jsii languages a union type is converted to any equivalent anyway.
  2. Irregardless of the first point, some users are already using types like TimeoutConfigProperty to type these untyped properties. Users do this without being explicitly told, to improve their developer experience. Unfortunately the use of types causes a casing issues: CDK interfaces are generally camelCase'd, however CFN usually expects PascalCase'd property names. This is solved for correctly typed props as the CDK handles this conversion for the user. With untyped properties this currently doesn't happen - but it could! We already know what these props should look like for CFN. So we can opportunistically apply the case conversion for the user for all property names that the system is now aware of. This is maybe not strictly correct as we change the user's input despite the exposed type being any / JSON - but I'd wager that in 99% case it's what a user wants.
  3. An extension to the second point, we could just force PascalCase on every property name everywhere. CFN usually wants PascalCase anyway, but I don't know if there might be some resource that have different casing hidden in the depths of their types. This might be the better default though and we could than maintain an explicit list for all other cases.