tc39 / proposal-type-annotations

ECMAScript proposal for type syntax that is erased - Stage 1
https://tc39.es/proposal-type-annotations/

Exposing types at-runtime #52

Closed roobscoob closed 2 years ago

roobscoob commented 2 years ago

At most, we could expose the types as strings, but it's not clear what anyone could do with those or how they should be exposed. This proposal does not try and expose the types as metadata, and only specifies that they are ignored by the JS runtime. Users who rely on decorator metadata could continue to leverage a build step as desired.

I just want to throw my hat into the ring here on this comment. I think exposing types could be very very beneficial.

Some of my major gripes in typescript at the moment are with some of the strict inabilities for typescript types to induce runtime behavior. I understand the scope of this proposal does not include this. However, this proposal allows for some amazing new possibilities.

There's already precedent for something like this: Function.prototype.toString() returns the comments included with the function. Exposing types would also allow transpilers to optionally include code with the built result for managing these types, allowing full runtime type introspection.
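A minimal sketch of the precedent mentioned above: since the ES2019 revision of Function.prototype.toString(), the returned string is the function's original source text, comments included. The function name `add` and the comment contents are only illustrative.

```javascript
// Comments inside the function's source text survive in toString().
function add(a /* number */, b /* number */) {
  // returns the sum
  return a + b;
}

// `source` holds the exact source text of `add`, including the comments.
const source = add.toString();
console.log(source.includes("/* number */")); // the inline comments are part of the string
```

A library could scrape such comments today, but it would have to parse the source text itself, which is the same tokenization cost discussed later in this thread.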

There are also no major downsides. Adding this wouldn't affect performance at runtime. Arguably, the hardest part of this is working out how exactly to let the programmer access these type strings.

junoatwork commented 2 years ago

I agree, although this proposal avoids specifying the semantics of what the type annotations contain. Function.prototype.toString() is a good example.

I propose exposing type annotations via the newer Reflect api:

Treat annotations just as plain strings (if present) or undefined:

class Obj { 
  foo: string;
}
let bar: number | undefined;

Reflect.getType(Obj, 'foo') // => 'string'
Reflect.getType(bar) // => 'number | undefined'

Classes or objects could expose an "interface" - a dictionary of their enumerable properties and their types:

class Person {
  name: string;
  age: number;
  parents: Person[]
}
Reflect.getInterface(Person) // => {name: 'string', age: 'number', parents: 'Person[]'}

ljharb commented 2 years ago

Reflect is exclusively for Proxy traps; it is not intended for other reflection (it’s poorly named), and would not be permitted to hold this method unless it was a proxy trap also.

tirithen commented 2 years ago

Everything with this proposal is way too TypeScript centric. There are lots and lots of modern projects that, for various reasons, do not use TypeScript and would still benefit greatly from standardized type support in ECMAScript.

Using comments for this does not make much sense for those cases. Basing it instead on the syntax of other languages like Rust, Dart, and Swift, as well as, of course, the great insights from TypeScript, makes so much more sense.

A year or so ago private class methods and fields landed. It was a breaking change, but it works really well in all modern browsers and can be transpiled down for legacy browsers.

A type system for JavaScript would have to use the same reasoning for all projects where native JavaScript is being successfully used, adding types via comments would just be a super clumsy way to declare types compared to Rust and other languages.

I can see that for TypeScript only developers this might look like a detail, but JavaScript is not a compilation target, it is a huge language successfully used as is.

When I started working professionally with web development, jQuery syntax was all the rage, then came CoffeeScript. There are probably thousands of projects where developers are perfectly satisfied with continuing working with that, but most of us moved on as JavaScript kept evolving (as it should).

Today TypeScript is fast becoming all the rage, which is perfectly fine, and will probably lead to many more new insights for web development for some years to come.

After that, the next shiny JavaScript flavor will be super popular. That might come with new insights, and because of that we cannot use the current TypeScript compilation needs to drive the JavaScript standard.

JavaScript though, needs to take the good parts from each of these flavors (as well as other non-browser programming languages) and standardize them.

Standardizing specific syntaxes for comments, meant for humans and documentation tools, is not the way to go here.

Instead, put effort into defining a proper type syntax, and then work on a proposal for this. The TypeScript community has already started this work by creating that language.

Probably it can be iterated on further until stable for some years, to create a more proper proposal not using comment hacks. :)

ljharb commented 2 years ago

@tirithen class fields were in no way a breaking change; here isn’t the place to debate that, but that is a misleading claim.

roobscoob commented 2 years ago

Agreed with @ljharb, this is an off-topic comment for this issue. Make a new one?

benjamingr commented 2 years ago

I vote "anything related to runtime" including an introspection API goes to a separate future proposal that can build on this.

justinfagnani commented 2 years ago

I agree there are some very good use cases for runtime type information. I think it may be untenable in combination with other goals and constraints, but it might be worth listing them.

One is enabling type conversions. If code can read the type of a field, say, it could perform type conversion to that type. Examples are serialization libraries, or decorators that convert between attributes and properties for custom elements.
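A minimal sketch of the type-conversion use case, assuming some runtime facility could report field types as plain strings. No such facility exists today, so `fieldTypes`, `converters`, and `fromAttributes` are hypothetical names, and the table is written by hand to stand in for reflected metadata.

```javascript
// Hypothetical: stands in for type strings a runtime might report for class fields.
const fieldTypes = { name: "string", age: "number", active: "boolean" };

// One converter per supported type string (an assumption of this sketch).
const converters = {
  string: (raw) => raw,
  number: (raw) => Number(raw),
  boolean: (raw) => raw === "true",
};

// Convert string-valued attributes into typed property values.
function fromAttributes(attributes, types) {
  const result = {};
  for (const [field, type] of Object.entries(types)) {
    const convert = converters[type];
    if (convert === undefined) {
      throw new TypeError(`no converter for type "${type}"`);
    }
    result[field] = convert(attributes[field]);
  }
  return result;
}

const props = fromAttributes({ name: "Rex", age: "7", active: "true" }, fieldTypes);
console.log(props); // { name: 'Rex', age: 7, active: true }
```

The lookup table only works for type strings it knows about. Anything richer, such as unions or generics, would need the string to be parsed first.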

Another is data validation. Libraries like zod exist for this, but they require you to write interfaces with their own runtime API:

import { z } from "zod";

// all properties are required by default
const Dog = z.object({
  name: z.string(),
  age: z.number(),
});

// extract the inferred type like this
type Dog = z.infer<typeof Dog>;

Another, more controversial one might be runtime declaration validation. The first-class protocols proposal proposes essentially an interface object where classes are checked against the interfaces they claim to implement. There are some reasons why this might not be desirable to VM implementors, who have stated they want class creation to be as fast as possible, but it's a potential use case.

There are at least two major problems with runtime type information though:

  1. It's important the vast majority of the time to be able to erase types as an optimization. There would need to be a way to opt-in to RTTI at the declaration and class member level.
  2. Some use cases (like validation) do call for type information from type declarations. That would have to diverge hugely from types-as-comments, specifying new interfaces to represent the types.

wparad commented 2 years ago

Of course, but surely we don't need to list out the benefits of types and their associated extensions, nor is this necessarily the place to discuss it. Runtime type usage can always be enabled optionally at a later point, or made configurable by additional packages hooking into extension points in the language.

justinfagnani commented 2 years ago

Runtime type usage can always be enabled optionally at a later point

I don't think this is so easy. I agree it could be added for classes and functions. I think it would be next to impossible for variables, and would only be possible for interfaces inasmuch as the interface body is specified. If interface bodies are comments, then all you would get is text and you would need a parser to do anything with it.

wparad commented 2 years ago

That's just the fundamental fallacy of scope creep. Just because it is hard later doesn't mean it should be done now. You would either have to prove it would be impossible, or that the work now is trivial to add in (and also that is unanimously agreed to). Otherwise we are unnecessarily bloating the proposal and preventing it from delivering value.

giltayar commented 2 years ago

Everything with this proposal is way too TypeScript centric. There are lots and lots of modern projects that, for various reasons, do not use TypeScript and would still benefit greatly from standardized type support in ECMAScript.

Just my two cents here: when we worked on the proposal, we looked at TypeScript, Flow, and Hegel, and tried to accommodate all of them in the proposal. Whether this survives to stage 4, I don't know, but I believe that the syntax should accommodate the leading JS type systems.

The "type space" (the area allowed for types) is VERY big, so you could theoretically build almost any type syntax you want in there, with minor compromise.

Jamesernator commented 2 years ago

The "type space" (the area allowed for types) is VERY big, so you could theoretically build almost any type syntax you want in there, with minor compromise.

Is having a large syntax space in itself actually largely beneficial to the feature as a whole though?

Although it is briefly mentioned in the README, I feel like Python's system does have a lot of advantages in that there is plenty of freedom in the semantics and interpretations of types, but the syntax for specifying them is clearly laid out.

Like I don't think it's a bad thing that there is some divergence from standard js syntax for some typing stuff, e.g. like we don't have to restrict ourselves to just what JS syntax already has available. And in fact we almost certainly should allow some things like union1 | union2 and Generic<TypeVariable>.

However it would be absolutely awesome if we could have a sufficient type syntax such that more complicated things could be represented while still allowing other things (such as runtime reflection) to be able to get benefits from the syntax as well.

As an example, when we look at how Python deals with this, they use their existing syntax to expose a few hooks, like implementing generics with subscript syntax (e.g. List[T] and such). I feel like we should learn from this, while not completely throwing away the popular work that already exists.

Like do we need languages to be able to define some new super ::--->>> operator? The answer to that feels fairly obviously no, but the argument here isn't that type systems should have no freedom, it's more that whatever they write should be translatable into a common language.

For example, we can do a lot just using Generic<T> syntax; many higher order operators or things could be implemented just with it. Like we could just use Union<T, S> in place of T | S. However I'm not advocating for that strong of a restriction, because I believe the syntactic benefit of allowing T | S is big enough to justify the flexibility. But we don't necessarily need to extend that flexibility to "arbitrary possible DSLs" at the cost of losing common processing of types.

My feeling on the matter is it would be a lot better to follow a few basic principles:

giltayar commented 2 years ago

The question of how to parse the "type space" is a big one. I for one am all for leaving it flexible to operators like ::--->>> so that future experimentation with type systems will not be shackled by what we think is a good type system today (e.g. TypeScript).

And I believe that even TypeScript would not want to be shackled by the existing grammar too much, as they would probably be bound to stick within this grammar forever. So flexibility is paramount IMHO.

Jamesernator commented 2 years ago

The question of how to parse the "type space" is a big one.

The idea would be to parse it as an AST of abstract "type objects" that have no inherent runtime meaning, however the actual trees that different type systems AND runtime could use would be identical.

And I believe that even TypeScript would not want to be shackled by the existing grammar too much, as they would probably be bound to stick within this grammar forever. So flexibility is paramount IMHO.

This is the thing though: most of the things in TypeScript are not hugely new things invented solely for TypeScript. Union types using |, record types using {}, and most of the usual features have been common in literature and implementations of type systems and programming languages for years and in many cases decades. A lot of this is not new territory.

Like we don't give arbitrary powers to define new syntax in the JS language itself. What sort of things are we expecting for which such syntactic power is actually necessary going forward, especially given that a fairly loose grammar would allow many patterns to be captured anyway?

e.g. Consider some of the more special syntax features that TypeScript has added (e.g. ones that aren't super common in literature/existing langs already like | is):

And I think that's all the major ones covered. The thing about these syntaxes is that they all still follow some internally logical structure. i.e. If we have { } we know that the inside is going to be a collection of key-like things with corresponding value-like things.

Another thing to note is that most of these features could be represented just as a token list. Consider mapped types specifically: [P in keyof T as Exclude<P, "foo">]. Adding something like this shouldn't require new grammar rules, as we could just allow parsing a sequence of "type-ish" things. e.g. As an AST for the whole mapped type we might present a node like:

{
    kind: "objectType",
    properties: [
        {
            // Corresponds to the fact we're using [computedKey]: value syntax
            kind: "computedKey",
            key: {
                // There's no way to know the relation between sequential symbols, but we
                // can at least provide a sequence of individual symbols and, better yet,
                // sometimes "type-ish" nodes as well
                kind: "typeTokenSequence",
                tokens: [
                    { kind: "name", name: "P" },
                    { kind: "name", name: "in" },
                    { kind: "name", name: "keyof" },
                    { kind: "name", name: "T" },
                    { kind: "name", name: "as" },
                    // Certain tokens might still be valid "type-ish" nodes, so a full
                    // parse of such objects isn't necessary
                    {
                        kind: "genericInstantiation",
                        genericType: { kind: "name", name: "Exclude" },
                        typeParameters: [
                            { kind: "name", name: "P" },
                            // By providing token types, people who want to interpret
                            // type annotations don't need to ship a full tokenizer
                            // and parser
                            { kind: "stringLiteral", value: "foo" }
                        ]
                    }
                ]
            },
            value: {
                kind: "indexedType",
                type: {
                    kind: "name",
                    name: "T"
                },
                index: {
                    kind: "name",
                    name: "P"
                }
            }
        }
    ]
}

This sort of thing would mean that type languages have a really good amount of freedom, but not absolutely unconstrained freedom.

And honestly this seems reasonable. The proposal already can't give type systems unconstrained freedom, because for syntactic structures such as interface or type Name = to be able to exist at all, it is necessary for the JS language to actually specify these nodes as well.

And let's be clear here, this idea is not proposing that every bit of grammar be specified. Rather, certain constructions, particularly object types, generics and stuff, have a concrete AST representation. However, in certain places basically a "well-bracketed" token list would be permitted, which gives sufficiently large flexibility but still allows runtime type systems to function effectively, as they do not need to ship a full tokenizer and parser to extract the contents.

In particular, notice that most of the parse information needed to interpret the above type in some type system (statically or at runtime) is already there. It just needs a small grammar to parse the token list, and this is especially easy given the tokens are already there. This isn't even a full parser; a simple recursive descent parser will more than suffice.
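A minimal sketch of that last step, assuming the engine hands over a token list shaped like the one in the AST above. The node kinds, the `parseMappedKey` function, and the tiny grammar it implements are all hypothetical, and the generic type is assumed to be a name node for Exclude.

```javascript
// Hypothetical pre-tokenized key of a mapped type: P in keyof T as Exclude<P, "foo">
const tokens = [
  { kind: "name", name: "P" },
  { kind: "name", name: "in" },
  { kind: "name", name: "keyof" },
  { kind: "name", name: "T" },
  { kind: "name", name: "as" },
  {
    kind: "genericInstantiation",
    genericType: { kind: "name", name: "Exclude" },
    typeParameters: [
      { kind: "name", name: "P" },
      { kind: "stringLiteral", value: "foo" },
    ],
  },
];

// Grammar handled by this sketch:
//   MappedKey := Name "in" Operand ("as" Operand)?
//   Operand   := "keyof" Operand | any single type-ish node
function parseMappedKey(tokens) {
  let pos = 0;
  const isKeyword = (word) =>
    pos < tokens.length && tokens[pos].kind === "name" && tokens[pos].name === word;

  function expectKeyword(word) {
    if (!isKeyword(word)) throw new SyntaxError(`expected "${word}" at token ${pos}`);
    pos += 1;
  }

  function parseOperand() {
    if (isKeyword("keyof")) {
      pos += 1;
      return { kind: "keyof", operand: parseOperand() };
    }
    if (pos >= tokens.length) throw new SyntaxError("unexpected end of tokens");
    return tokens[pos++];
  }

  const parameter = parseOperand();
  if (parameter.kind !== "name") throw new SyntaxError("expected a parameter name");
  expectKeyword("in");
  const constraint = parseOperand();
  let remap = null;
  if (isKeyword("as")) {
    pos += 1;
    remap = parseOperand();
  }
  if (pos !== tokens.length) throw new SyntaxError(`unexpected token at ${pos}`);
  return { kind: "mappedKey", parameter: parameter.name, constraint, remap };
}

const mappedKey = parseMappedKey(tokens);
console.log(mappedKey.constraint); // a keyof node wrapping the name node for T
```

No character-level tokenizer appears anywhere in the sketch. The only work left to the consumer is deciding what the sequence of nodes means.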

giltayar commented 2 years ago

@Jamesernator this is a good conversation to have! What do you say to opening an issue specifically about this: about the level of "specificity" of the grammar? It'd be better than discussing it under an issue named "exposing types at runtime". 😀

benjamingr commented 2 years ago

I see people opening a lot of issues about Python's types - it's worth mentioning I did check with the designers of Python's type system (namely: Guido) and basically:

He also raised very good points about why runtime type checks don't make a ton of sense - if I have an iterator/stream for strings there is really no way to verify that's really an iterator for strings without iterating (and destroying) it.

He also mentioned there are already places that do runtime checks in Python (which is strongly typed) just not for annotations - in JavaScript this is stuff like how 15n * 1.1 throws an error.
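A minimal sketch of the JavaScript runtime check mentioned above: mixing a BigInt with a Number in an arithmetic operator throws a TypeError, independent of any annotations. The variable name `caught` is only illustrative.

```javascript
let caught = null;
try {
  // BigInt and Number operands cannot be mixed in arithmetic.
  15n * 1.1;
} catch (error) {
  caught = error;
}

console.log(caught instanceof TypeError); // true
```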


I feel strongly we should:

justinfagnani commented 2 years ago

@benjamingr

Expose runtime types for introspection as part of a future proposal (the Python way) after this lands

It's important to note that this isn't backwards compatible without an opt-in, and probably isn't tractable without a more prescriptive type syntax like what @Jamesernator is talking about.

benjamingr commented 2 years ago

It's important to note that this isn't backwards compatible without an opt-in, and probably isn't tractable without a more prescriptive type syntax like what @Jamesernator is talking about.

The part that isn't backwards compatible without an opt-in is type checking at runtime. Providing introspection APIs is possible with the syntax.

justinfagnani commented 2 years ago

Providing introspection APIs is possible with the syntax.

Not if types are erased. Minifiers need to know if they'll break reflective code by removing types.

roobscoob commented 2 years ago

Providing introspection APIs is possible with the syntax.

Not if types are erased. Minifiers need to know if they'll break reflective code by removing types.

This could be treated in a similar fashion to Function.prototype.toString. Minifiers can expose configuration options to control type stripping.

Jamesernator commented 2 years ago

He also raised very good points about why runtime type checks don't make a ton of sense - if I have an iterator/stream for strings there is really no way to verify that's really an iterator for strings without iterating (and destroying) it.

Yes and no. While you can't check the iterator ahead of time, you can always rewrap, i.e.:

function* typeCheckedIterator(theRealIterator, type) {
    for (const item of theRealIterator) {
        if (!validateTypeSomehow(item, type)) {
            throw new TypeError("...");
        }
        yield item;
    }
}

The main caveat is that errors will be deferred, although I would still argue this is preferable in a lot of cases to a type-incorrect value getting through at all.
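A minimal runnable sketch of the rewrapping idea, showing how the error is deferred until the offending item is reached. The validator `validatePrimitive` is a hypothetical stand-in for validateTypeSomehow and only understands typeof names.

```javascript
// Hypothetical validator: compares typeof against a primitive type name.
function validatePrimitive(value, type) {
  return typeof value === type;
}

function* typeCheckedIterator(theRealIterator, type) {
  for (const item of theRealIterator) {
    if (!validatePrimitive(item, type)) {
      throw new TypeError(`expected ${type}, got ${typeof item}`);
    }
    yield item;
  }
}

const seen = [];
let deferredError = null;
try {
  // The third item is not a string, so the error surfaces only on the third step.
  for (const item of typeCheckedIterator(["a", "b", 3, "d"], "string")) {
    seen.push(item);
  }
} catch (error) {
  deferredError = error;
}

console.log(seen); // [ 'a', 'b' ]
console.log(deferredError instanceof TypeError); // true
```

The first two items are consumed before the failure is noticed, which is exactly the deferral described above.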

Jamesernator commented 2 years ago

So from the responses in #106, the type grammar is specified fairly tightly already. Exposing that AST to runtime (rather than strings) would be sufficient for most purposes that would want to generate runtime types.

i.e. If we got an AST for a node like Foo | Bar it would be easy to interpret the resulting AST as any type system we want at runtime. Really one of the harder parts of just having strings is tokenization which is comparatively nuanced to implement and excessive to ship (given the engine already has to tokenize these types anyway).
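A minimal sketch of interpreting such an AST at runtime, assuming the node for Foo | Bar looks like the hypothetical `unionAst` below. The node kinds, the `scope` lookup table, and the choice to treat names as classes checked with instanceof are assumptions of this sketch, one of many interpretations a library could pick.

```javascript
class Foo {}
class Bar {}

// Hypothetical AST an engine might expose for the annotation `Foo | Bar`.
const unionAst = {
  kind: "union",
  types: [
    { kind: "name", name: "Foo" },
    { kind: "name", name: "Bar" },
  ],
};

// Turn an AST node into a predicate; `scope` maps type names to classes.
function toValidator(node, scope) {
  switch (node.kind) {
    case "name": {
      const ctor = scope[node.name];
      if (ctor === undefined) throw new ReferenceError(`unknown type ${node.name}`);
      return (value) => value instanceof ctor;
    }
    case "union": {
      const members = node.types.map((member) => toValidator(member, scope));
      return (value) => members.some((check) => check(value));
    }
    default:
      throw new TypeError(`unsupported node kind: ${node.kind}`);
  }
}

const isFooOrBar = toValidator(unionAst, { Foo, Bar });
console.log(isFooOrBar(new Foo()), isFooOrBar(new Bar()), isFooOrBar({})); // true true false
```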

benjamingr commented 2 years ago

Can we close this as "Not in this proposal, perhaps in the future, important not to block the possibility to expose types at runtime eventually"?

benjamingr commented 2 years ago

Closing, this was already addressed by the README change on reflection: