tc39 / proposal-type-annotations

ECMAScript proposal for type syntax that is erased - Stage 1
https://tc39.es/proposal-type-annotations/

An invitation to reconsider runtime semantics #173

Open theScottyJam opened 1 year ago

theScottyJam commented 1 year ago

In a recent meeting, the TC39 delegates discussed whether the built-in type system should have runtime semantics. During that meeting a handful of arguments were brought up against runtime semantics; the delegates discussed them briefly, then seemed to reach general agreement (or, at least, no disagreement was raised) to focus on type erasure and not pursue runtime semantics.

So I'd like to bring forth my disagreement :).

Here are the arguments against runtime semantics:

  1. Performance.
  2. It wouldn't be practical to allow multiple type systems to use the type space; you would effectively have to only allow one.
  3. (A) They argue that a sound type system is impractical - there's just no great way to handle mutability (I agree; I previously took a stab at doing this, found that it's impractical to do a fully sound type system, and even a mostly sound one is still a lot of work). (B) And if we don't care about sound type-checking, there are better proposals to handle the runtime checks.
  4. Giving TypeScript-looking types some runtime semantics could be surprising to the end-user.

These points are all true if we look at this from an all-or-nothing approach, i.e. if we try to make all type annotations have runtime semantics, then yes, we'll certainly run into performance and other issues. But, I'm going to argue that there's a happy middle ground that we ought to consider.

We need to start out by understanding that, in a well-written library, some degree of runtime type checking does happen. It just happens by hand, very tediously. In general, any time an argument is passed into such a library, some hand-coded JavaScript will execute to ensure that everything looks correct. Say, for example, we have a library written in TypeScript that deals with graphing, and we have this function that adds two coordinates together:

interface Coordinate {
  x: number;
  y: number;
}

export function addCoords(coord1: Coordinate, coord2: Coordinate): Coordinate {
  if (coord1 !== Object(coord1) || typeof coord1.x !== 'number' || typeof coord1.y !== 'number') {
    throw new TypeError(...);
  }
  if (coord2 !== Object(coord2) || typeof coord2.x !== 'number' || typeof coord2.y !== 'number') {
    throw new TypeError(...);
  }

  return {
    x: coord1.x + coord2.x,
    y: coord1.y + coord2.y,
  }
}

If we convert the above example to follow this proposal's current form, then not much will change about this example. However, it's very unfortunate that we're having to manually handle type-checking of these coordinate objects when the JavaScript engine has type information right in front of it.

Because of this, I'd like to propose some ways to opt into runtime semantics. First, a conformsTo keyword, that can be used like this:

export function addCoords(coord1: Coordinate, coord2: Coordinate): Coordinate {
  if (!(coord1 conformsTo Coordinate)) {
    throw new TypeError(...);
  }
  if (!(coord2 conformsTo Coordinate)) {
    throw new TypeError(...);
  }

  return {
    x: coord1.x + coord2.x,
    y: coord1.y + coord2.y,
  }
}

This keyword grants the library author power to actually apply runtime semantics, greatly simplifying their code. What's more, it grants them power to do what they want if the passed-in values don't conform to the requested type. Often this would mean throwing a TypeError, but if you're, say, writing functions in a style that's similar to the standard library, then perhaps you'd just coerce the bad argument instead.
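For comparison, the closest thing available in TypeScript today is a hand-written type guard. The sketch below only approximates what `value conformsTo Coordinate` might do at runtime. The helper name `conformsToCoordinate` is hypothetical, and the structural semantics (extra properties allowed) are an assumption, since the idea above does not pin them down.

```typescript
interface Coordinate {
  x: number;
  y: number;
}

// Hypothetical stand-in for `value conformsTo Coordinate`.
// Assumes a structural check: an object with numeric `x` and `y`.
function conformsToCoordinate(value: unknown): value is Coordinate {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const candidate = value as { x?: unknown; y?: unknown };
  return typeof candidate.x === 'number' && typeof candidate.y === 'number';
}

function addCoords(coord1: unknown, coord2: unknown): Coordinate {
  if (!conformsToCoordinate(coord1) || !conformsToCoordinate(coord2)) {
    throw new TypeError('addCoords expects two Coordinate objects');
  }
  // Both arguments are narrowed to Coordinate from here on.
  return { x: coord1.x + coord2.x, y: coord1.y + coord2.y };
}
```

The point of the proposed keyword is that the body of `conformsToCoordinate` would not need to be written by hand; the engine would derive it from the interface.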

Next, I'd like to introduce an assert keyword shorthand. When you use this keyword, it means you'd like to automatically apply runtime type-checking to the argument, forcing it to conform to the indicated interface.

export function addCoords(coord1: assert Coordinate, coord2: assert Coordinate): Coordinate {
  return {
    x: coord1.x + coord2.x,
    y: coord1.y + coord2.y,
  }
}

addCoords(2, 3); // Build-time type error!
addCoords(2 as any, 3 as any); // Runtime type error!
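As a rough illustration of what the `assert` shorthand might expand to, here is a sketch using TypeScript's existing assertion functions. The helper `assertCoordinate` and its error message are hypothetical; the actual expansion, and the wording of any engine-generated error, would be up to the spec.

```typescript
interface Coordinate {
  x: number;
  y: number;
}

// Hypothetical expansion of a `name: assert Coordinate` parameter.
function assertCoordinate(value: unknown, name: string): asserts value is Coordinate {
  const isObject = typeof value === 'object' && value !== null;
  const candidate = (isObject ? value : {}) as { x?: unknown; y?: unknown };
  if (!isObject || typeof candidate.x !== 'number' || typeof candidate.y !== 'number') {
    throw new TypeError(`${name} does not conform to Coordinate`);
  }
}

function addCoords(coord1: Coordinate, coord2: Coordinate): Coordinate {
  // With the shorthand, these two calls would be implied by the parameter list.
  assertCoordinate(coord1, 'coord1');
  assertCoordinate(coord2, 'coord2');
  return { x: coord1.x + coord2.x, y: coord1.y + coord2.y };
}
```

Calling `addCoords(2 as any, 3 as any)` gets past the static checker but throws a `TypeError` at the first assertion.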

That's really it. Keep in mind that this is mostly just a proof-of-concept - there are certainly other ways we could look at introducing opt-in runtime semantics, but I believe something this minimal would already be extremely powerful, and much appreciated.

Let's see how it measures up to the original concerns.

  1. Performance.

There is zero additional performance overhead because the runtime semantics are opt-in. It'll only go picking through every property of your passed-in object if you explicitly ask JavaScript to do so. This is no less performant than manually doing it yourself. In fact, we might see a slight improvement in performance if the engine is able to do the type-checking faster than hand-written code can do it.

  2. It wouldn't be practical to allow multiple type systems to use the type space, you would effectively have to only allow one.

Correct! In fact I'm surprised this argument is even being brought up as something bad. If one of their goals really is to try and unify the community (which they often state it is), then it makes complete sense to consider unifying onto a single type system, instead of trying to accommodate all type systems.

The idea is that TC39 would have to work together to build a new standard document - one that standardizes how tooling should behave when it reads JS types. Then, anyone can build a build-time tool that conforms to this standard. In other words, we won't be bringing TypeScript or Flow or any other existing type engine into JavaScript; rather, we'd be inventing a new one, and then unifying the community on that.

  3. If we don't care about sound type-checking (which isn't very practical), there's better proposals to handle the runtime checks.

In the presentation, they proved that you could get more ergonomic runtime semantics by adding a couple of features to other proposals. But, I don't feel like they really showed why adding features to other proposals is better than adding features to this proposal. Why not just give this proposal some teeth, and allow it to handle the runtime checks? That would be the optimal solution, and would avoid the issue of having to basically duplicate half of your type definitions - once using the type semantics provided by this proposal, and another time via some other proposal's runtime syntax.

  4. Giving TypeScript-looking types some runtime semantics could be surprising to the end-user.

I believe the primary concern here was that having : number be a simple type hint in TypeScript, but something with runtime semantics in JavaScript, would be confusing to TypeScript users. Which, sure, maybe. But that's been avoided entirely with the opt-in syntax I showed above - the only time runtime semantics come into play is when you use the specific keywords (assert or conformsTo) that enable them. By default, : number behaves the same, both with JS types and TypeScript.

So, in summary, I agree with the presenters that going all out with runtime type-checks every time you have a type annotation would probably not work. But that doesn't mean we should ignore runtime checks altogether. There's still a place for them, as long as you're required to opt into having them.

ljharb commented 1 year ago

Conceptually I think you’re describing how pattern matching might work in function arguments.

theScottyJam commented 1 year ago

The convenience I would like to have is to be able to use type definitions I already have on hand to make runtime assertions happen. With pattern matching, I'd basically be duplicating the interface - one version would be for type-check time, and the other would be a pattern for runtime.

Let's assume pattern matching did provide this kind of feature, and let's assume the syntax was <paramName>: <type> match <pattern>. We'd have something like this:

interface Coordinate {
  x: number;
  y: number;
}

function addCoords(
  coord1: Coordinate match { x: ${Number}, y: ${Number} },
  coord2: Coordinate match { x: ${Number}, y: ${Number} },
) {
  ...
}

Here, I've got the interface duplicated three times effectively. Perhaps some shorthand could be provided for pattern-matching to allow me to easily extract a pattern and make it reusable, thus only requiring a single duplication. But, all I'm trying to do here is make the language enforce that the arguments line up to the declared type. The most direct way would be to use the types themselves.
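To make the remaining duplication concrete, here is a sketch in today's TypeScript of the "extract a reusable pattern" variant. The names `coordinatePattern` and `matchesPattern` are hypothetical, and the pattern format (property name mapped to an expected `typeof` result) is an assumption chosen only for illustration; it is not the pattern-matching proposal's syntax.

```typescript
interface Coordinate {
  x: number;
  y: number;
}

// Hypothetical reusable pattern. It still repeats the field list
// that the interface above already states.
const coordinatePattern = { x: 'number', y: 'number' } as const;

// Checks that every key in the pattern exists with the expected typeof result.
function matchesPattern(value: unknown, pattern: Record<string, string>): boolean {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return Object.entries(pattern).every(([key, expected]) => typeof record[key] === expected);
}

function addCoords(coord1: Coordinate, coord2: Coordinate): Coordinate {
  if (!matchesPattern(coord1, coordinatePattern) || !matchesPattern(coord2, coordinatePattern)) {
    throw new TypeError('arguments must match the Coordinate pattern');
  }
  return { x: coord1.x + coord2.x, y: coord1.y + coord2.y };
}
```

Even in this reduced form, the shape is declared twice: once for the type checker and once for the runtime.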

theScottyJam commented 1 year ago

Heh. It's unconventional, but you could theoretically unify the pattern-matching syntax and the type system syntax. A pattern would be the same as a type definition (and could even use values from the type namespace); the only difference is that, when you're defining a pattern, you're allowed to introduce new bindings and interpolate custom matchers.

interface Coordinate {
  x: number
  y: number
}

function fn(value: Coordinate | null) {
  match (value) {
    // Everything below, except the `const myY` and `${...}` would be a valid type definition.
    when ({ x: 0, y: const myY }): ...
    when ({ x: 1, y: ${MyCustomMatcher} }): ...
    when (Coordinate): ...
    when (null): ...
  }
}

I.e. pattern matching would be a superset of type syntax, instead of being a superset of destructuring syntax.

Thus, pattern matching could be a way of giving the type system runtime semantics, if we engineer it as such :).

There's a number of different angles we'd have to go over to figure out how something like this would work in detail, but still, it's an interesting thought. It would be pretty powerful to be able to just throw types into the middle of a pattern-match, and it just works.

Abdelaziz18003 commented 1 year ago

I totally agree with the idea of reconsidering runtime semantics. @theScottyJam's proposal makes a lot of sense and addresses the issues presented by the authors very well.

Another benefit of reconsidering runtime semantics is that it would allow JavaScript programmers to take advantage of the proposed type system without needing a build system or a sophisticated IDE. With no runtime semantics, there will always be a need for a build system or an IDE that understands the type syntax in order to take advantage of the type definitions. Designing a language feature to be used exclusively by third-party tools, without the language itself being able to take advantage of it, doesn't make a lot of sense to me.

I liked the idea of using a keyword such as conformsTo to check variable types whenever it makes sense, instead of doing a manual check and repeating the type definition each time. However, I think we can do better by introducing a special syntax that activates the runtime check in function definitions. That would remove the redundancy of having the type name in both the function definition and the body, once to satisfy the type checker and once to check the type at runtime. Here are some examples:

interface Coordinate {
  x: number
  y: number
}

// instead of
function fn(value: Coordinate) {
  match (value) {
    when (Coordinate): ...
    when (null): ...
  }
}

// or

function fn(value: Coordinate) {
  if (value conformsTo Coordinate) {
    // do something
  }
}

// in the above examples "Coordinate" is repeated twice
// maybe better to use "::" or alternative syntax to perform runtime check without
// the need to call `conformsTo` keyword or to use the pattern matching syntax

// fully static type definition
function fn(value: Coordinate) {
    // do something
}

// static + runtime type check (throws TypeError if the value type is wrong)
function fn(value:: Coordinate) {
    // do something
}

The choice of "::" is for demonstration purposes only; better operators might be proposed.

theScottyJam commented 1 year ago

Would the "::" be the same as the "assert" type-operator I presented? (I assume yes, and that you had probably just missed that in my long-winded explanation?)

Abdelaziz18003 commented 1 year ago

To some extent, yes, that's true. However, as far as I understood from your explanation above, "assert" and conformsTo are proposed to be used only in expressions, like the existing instanceof keyword (inside or outside functions, which is a really nice keyword to have). So they would return boolean values that can be checked with an if statement. I assumed they were not proposed for use in function definitions too, because I didn't see examples of that. If that's the case, we would need to repeat the type name twice: once in the definition for the static check and once in the function body to check it at runtime. The proposed double-colon syntax "::" is only valid in function definitions, and it is syntactic sugar roughly equivalent to fn(value conformsTo Coordinate), which enables the runtime check without repeating the type name in both the definition and the function body. However, please let me know if I am missing something.

theScottyJam commented 1 year ago

That was this example :)

export function addCoords(coord1: assert Coordinate, coord2: assert Coordinate): Coordinate {
  return {
    x: coord1.x + coord2.x,
    y: coord1.y + coord2.y,
  }
}

addCoords(2, 3); // Build-time type error!
addCoords(2 as any, 3 as any); // Runtime type error!

assert was intended to be used in a function parameter list as a shorthand to reduce boilerplate, and potentially allow engines to create detailed error messages about why the matching failed. So yeah, I believe it would be the same as ::.

Abdelaziz18003 commented 1 year ago

I can't believe I missed that section! That is the same problem the :: syntax was proposed to solve. I see that you already addressed it in your proposal. Thanks for the clarification.

GrantGryczan commented 1 year ago

Excellent idea! But I'm concerned that this isn't in this proposal's scope, and if it were then it might make this proposal more difficult to accept. Or if this were a separate proposal, then I'm not sure it'd be backwards-compatible with this one since it would require type annotation syntax to actually be correct, whereas this proposal allows for type syntax that might otherwise throw an error (e.g. a type with a reference error, or a type with excessive recursion).

Though I think matches would probably be more concise than conformsTo (especially since I've never seen a JS operator be more than one word). Or alternatively add a match statement as others here have suggested.

ljharb commented 1 year ago

@GrantGryczan instanceof, typeof?

GrantGryczan commented 1 year ago

Ah, sorry, I meant a JS operator that uses camel case. I subconsciously mixed those up. Either way, matches would likely be more concise. But no matter what it's called, I'd love to see this in the language!

nguyenhothanhtam0709 commented 10 months ago

Can we make runtime semantic optional?

egasimus commented 10 months ago

Can we make runtime semantic optional?

This is what the thread is all about :-)

egasimus commented 10 months ago

Good proposal, validation of arguments is a great use case!

Won't introducing a new keyword clash with existing code which is currently using it as an identifier? Is there an existing keyword which can be reused? Or is this not really a problem because the context of type annotation is distinct enough?

spenserblack commented 10 months ago

I'm curious how strict you're thinking conformsTo and assert should be? For example, would { x: 1, y: 2, color: 'green' } pass the conformsTo/assert for the Coordinate interface, or would it fail due to not being exactly { x: number, y: number }? Another scenario is a value that has the same properties as the required type, but a different class.

class Coordinate {
  x: number;
  y: number;
  // ...
}

class Point {
  x: number;
  y: number;
  // ...
}
// Does `new Point() conformsTo Coordinate` pass or fail?

I think it's implied in your example that the checks would be less strict, and that these would pass, but I just wanted to confirm.

theScottyJam commented 10 months ago

Won't introducing a new keyword clash with existing code which is currently using it as an identifier? Is there an existing keyword which can be reused? Or is this not really a problem because the context of type annotation is distinct enough?

As far as I can tell, it shouldn't be an issue due to the fact that this keyword will only be used inside of the type-annotation space, which is currently a syntax error. If it is an issue, we'll just need to come up with something else (a real keyword or special characters or something)

theScottyJam commented 10 months ago

I'm curious how strict you're thinking conformsTo and assert should be?

These are the kinds of debates we'd need to have as we spec out how the "official TC39 type system" would work.

If we wanted to follow in TypeScript's footsteps, then { x: 1, y: 2, color: 'green' } would indeed conform to a Coordinate interface and new Point() would be a type of Coordinate (as long as these classes don't have private fields).

I personally prefer Hegel's approach to this, where object types are "strict" by default, but you can opt into making them "soft". So, in Hegel, { x: 1, y: 2, color: 'green' } would not conform to Coordinate, unless Coordinate were written like this instead: type Coordinate = { x: number, y: number, ... } (the ... makes it a "soft" type). This, to me, is very useful, because if I have an endpoint controller that returns a specific object type, I really don't want extra properties accidentally slipping in there, and likewise I'd prefer not to accidentally send extra parameters when making a REST request. Hegel also does nominal typing, so a Point instance would not conform to Coordinate.
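The difference between the two behaviours can be sketched as two runtime checks. This is only an illustration of the "soft" versus "strict" distinction described above, not how TypeScript or Hegel implement it (both are static checkers); the function names `conformsSoft` and `conformsStrict` are hypothetical.

```typescript
interface Coordinate {
  x: number;
  y: number;
}

const coordinateKeys = ['x', 'y'] as const;

// "Soft" check: required fields must be present, extra properties are allowed.
function conformsSoft(value: unknown): value is Coordinate {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return coordinateKeys.every((key) => typeof record[key] === 'number');
}

// "Strict" check: same required fields, and no other own enumerable keys.
function conformsStrict(value: unknown): value is Coordinate {
  return conformsSoft(value) && Object.keys(value).length === coordinateKeys.length;
}

const green = { x: 1, y: 2, color: 'green' };
console.log(conformsSoft(green));   // true: the extra `color` property is ignored
console.log(conformsStrict(green)); // false: `color` is not part of Coordinate
```

Nominal typing is a separate axis: it would additionally compare the declared type or class of the value, which neither check above attempts.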

spenserblack commented 10 months ago

Makes sense!

Perhaps this should be saved for a future discussion, but I was asking because I wanted to raise a possible con: that introducing these could result in overuse and overly strict code. IMO this isn't a big problem, but I just wanted to mention because, anecdotally, I've had to deal with annoyingly strict code. Personally, when I was a beginner I would write some overly strict code, and I know that past me would have overused these keywords if they were available :laughing:

orta commented 10 months ago

FWIW, there is consensus from folks in TC39 to not do runtime semantics for this proposal from the March 22nd 2023 meeting. So it's likely if you want to do something like this you really need to start thinking about writing a new proposal which would be competing with this proposal in the space.

From the notes:

The three browsers expressed that runtime type checking in type annotations would be unacceptable.

No one advocated for semantics other than type erasure.

Personally, I'd say you have an uphill battle with this and #183. You probably need to test out your new type system in user-space, and persuade folks to adopt it before starting to approach standardization.

theScottyJam commented 10 months ago

Yeah, that was what the original post was about - the fact that they had this meeting, argued that doing automatic runtime semantics with all types was a bad idea, and then got general agreement not to pursue this path (though they did say "this isn’t to express endorsement or strong resolution" when asking for agreement). And I agree that if my only options were "runtime semantics everywhere" or "type erasure everywhere" (the two paths they presented), the second would be better as well.

I don't feel like it's too late to discuss the topic of opt-in runtime semantics though since, as explained in the original post, it (arguably) handles every single drawback runtime semantics had (at least the drawbacks that were presented), it wasn't a path that was considered during their meeting, and their consensus was not meant to be "endorsement or strong resolution".

I mean, yes, the fact that they already had this meeting and gathered feedback does make the task more difficult. But, I'm still hopeful they'd be willing to reconsider.

egasimus commented 10 months ago

I actually quite like the overall idea of strictly opt-in runtime semantics.

IMHO, if the proposal moves forward, it's quite likely to be in a form that breaks backwards compatibility. Might as well go the full 10 yards and add optional type checking of arguments.

Incidentally, more suggestions for keyword: matches, is.

assert is likely to be used as an identifier (e.g. import * as assert from 'node:assert'), and conformsTo/conformsto seems kinda awkward.

theScottyJam commented 10 months ago

Yeah, any of those would work for me, and that's a good point about the word assert already being commonly used as a variable name. I (currently) don't have strong opinions about which keyword gets used; I'm just hoping that something happens along these lines.