tc39 / proposal-type-annotations

ECMAScript proposal for type syntax that is erased - Stage 1
https://tc39.es/proposal-type-annotations/

Carve less syntax out, with more generic, token-level comment syntax #80

Open theScottyJam opened 2 years ago

theScottyJam commented 2 years ago

I know a big push for this proposal is to try and put as much TypeScript syntax into JavaScript as possible. But, it could be worthwhile to explore what this proposal would look like if we didn't focus so heavily on this objective. Considering the fact that most users will have to run a codemod anyways to change their TypeScript code to be valid JavaScript, I don't think it's that bad of an idea to stray a bit further from current TypeScript syntax.

Let me propose a much simpler form of this proposal, that tries to carve out much less syntax, while still being ergonomic to use. All I'm going to do is introduce a simple token-level comment to the language, which works as follows:

We can also add a "@@" syntax, that will cause everything to the end of the line to be ignored, plus, if any opening brackets were found in that line (via "(", "[", "{", or "<"), further content will continue to be ignored until a closing bracket is found. Examples for this will be demonstrated below.

(The "@" character can of course be bikeshedded. I know it's currently being used by the decorator proposal, but we could have decorators use something else, like a two-character token).
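To make the "@@" rule concrete, here is a minimal sketch of a stripper for it. The function name and the exact stopping point are my assumptions, not part of the idea as written: this version keeps ignoring until it reaches a line break with no brackets left open, and it makes no attempt to handle brackets that appear inside strings or comments.

```typescript
// Sketch only: the names below are hypothetical, and strings/comments are not handled.
const OPENERS = "([{<";
const CLOSERS = ")]}>";

/** Removes every `@@` region from `source`, keeping the line break that ends each region. */
function stripDoubleAt(source: string): string {
  let out = "";
  let i = 0;
  while (i < source.length) {
    if (source.slice(i, i + 2) !== "@@") {
      // Ordinary code: copy it through unchanged.
      out += source.charAt(i);
      i += 1;
      continue;
    }
    // Found "@@": skip it, then ignore content while tracking bracket depth.
    i += 2;
    let depth = 0;
    while (i < source.length) {
      const ch = source.charAt(i);
      // Assumed stopping rule: a line break with no open brackets ends the region.
      if (ch === "\n" && depth === 0) break;
      if (OPENERS.indexOf(ch) !== -1) depth += 1;
      else if (CLOSERS.indexOf(ch) !== -1 && depth > 0) depth -= 1;
      i += 1;
    }
  }
  return out;
}
```

Fed an "@@interface" block followed by ordinary statements, this drops the whole interface (including its body) and leaves the statements untouched. The single "@" rule is not covered here.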

Here's what it looks like in practice:

// The "@" and "string" following it are both ignored
let x @string;

function equals(x @number, y @number) @boolean {
    return x === y;
}

// Everything after @@, until the end of the line is ignored.
// Also, since "{" was found, all content within the { ... } grouping is also ignored.
@@interface Person {
    name @string; // "@" doesn't have to be used here since this is all ignored, but for consistency, it is.
    age @number;
}

@@type CoolBool = boolean;

class MyClass {
  name @number;
}

// More complex types can be wrapped in parentheses
function fn(value @(number | string)) { ... }

// optional parameters
// Notice how after the "?" token, there's a ":", causing the colon and the token after it to be ignored as well?
function fn(value @?:number) { ... }

// import/export types
@@export interface Person { ... }

@@import type { Person } from "schema";

import { @type: Person, aValue } from "...";
// After the ignored content is removed, this line will look like this: import { , aValue } from "...";

// type assertions
// Notice, again, how the ":" after "as" causes further content to be ignored.
// Also note that the "@" before "number" isn't strictly necessary as everything within the { ... } is being ignored,
// but it's being used anyways for consistency.
const point = JSON.parse(serializedPoint) @as: { x @number, y @number };

// Non-nullable assertions
document.getElementById("entry")@!.innerText = "...";

// Generics
function foo@<T>(x @T) { ... }

// "this" param
@this: SomeType function sum(x @number, y @number) { ... }

// Ambient Declarations (currently being considered to not be included to keep the proposal smaller, but now it can be trivially added)
@@declare let x @string;

@@declare class Foo {
    bar(x @number) @void;
}

// Likewise, function overloading might not get added to this proposal as it currently stands, but it's trivial to add with this idea
@@function foo(x @number) @number
@@function foo(x @string) @string;
function foo(x @(string | number)) @(string | number) {
    ...
}

// Class and Field Modifiers (These are easy to add as well, if wanted)
class MyClass {
  @protected @readonly x @number = 2;
}

// Allowing someone to "implement" an interface
class MyClass @implements: MyInterface { ... }

So, yes, perhaps some of those examples aren't as nice-looking as the TypeScript variants, but they're not bad either, and most of them seem equivalent verbosity-wise. And, remember, this is just showing what's possible if we choose to only introduce these simple rules, we could still choose to go with a mix of the current proposal and this idea, where we use the "@" and "@@" syntax for most items, but we also add, for example, a no-op "as" operator so people can write x as yz instead of x @as: yz.

There are some other benefits if we go this route:

The downside here is that this looks very different from TypeScript syntax, which I know can be a bit off-putting, especially since it can be hard to call this sort of thing "TypeScript" when it looks so different from TypeScript. IMO, the upsides outweigh this downside, though I would be interested to hear other thoughts on this matter.

Update: It has been mentioned that we can't use the @ token - we won't be allowed to take that out of the hand of the decorator proposal. I've put together another iteration of this idea which uses different tokens, and incorporates other feedback that's floated around. I still feel there's more room for improvement (e.g. I'm not a giant fan of the backslash character, I'm still mulling over alternative ideas), but I think it takes a couple more steps in the right direction. You can see it presented in this comment.

simonbuchan commented 2 years ago

Teachability is absolutely a concern, I'm just not sure that the current situation is all that much better, that any harm would exceed the benefits, even for the exact same situation (this learner at the moment would get a syntax error, with the proposal could know they don't care about types... so they can ignore them)

I've seen nothing about how this proposal is, overall, a value-add for JavaScript itself.

I didn't want to get into the weeds in the rest of this reply, but it can absolutely be used for tooling other than type checkers. A lot of people currently write JSDocs just to put type annotations to make editing nicer, as a simple example.

danielearwicker commented 2 years ago

Right now, every syntactic feature that TypeScript wants needs to go through the TC39 proposal process and be approved

The proposal is that static type checker designers will restrict themselves in the future to only defining new type system syntax that will be skipped by whatever this proposal ends up defining as the rules for finding the end of a static type declaration comment. If it’s defined loosely enough, e.g. very naively, something like matching nested brackets (of several kinds) except within quotes (of several kinds), then that gives huge scope for future language designers to invent new syntax forms that JS will happily ignore. But the onus will be on those folks to stay inside the rules, not constantly pester JS to add new specific rules every time.
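As a rough illustration of how naive such a rule could be, here is a sketch of "match nested brackets of several kinds, except within quotes of several kinds". The function name and its return convention are hypothetical, and this is not what the proposal specifies. It treats "<" as a bracket everywhere, so a less-than comparison would confuse it.

```typescript
// Sketch only: hypothetical helper, not taken from the proposal text.
const PAIRS: Record<string, string> = { "(": ")", "[": "]", "{": "}", "<": ">" };
const QUOTES = "\"'`";

/**
 * Scans forward from `start` and returns the index just past the bracket that
 * closes the first bracket group found, or -1 if that group never closes.
 * Brackets inside quoted text are ignored.
 */
function skipBalanced(source: string, start: number): number {
  const stack: string[] = []; // closers we still expect, innermost last
  let quote = "";             // the quote character we are inside, if any
  for (let i = start; i < source.length; i += 1) {
    const ch = source.charAt(i);
    if (quote !== "") {
      if (ch === "\\") i += 1;          // skip the escaped character
      else if (ch === quote) quote = ""; // closing quote
      continue;
    }
    if (QUOTES.indexOf(ch) !== -1) {
      quote = ch;
      continue;
    }
    const closer = PAIRS[ch];
    if (closer !== undefined) {
      stack.push(closer);
    } else if (stack.length > 0 && ch === stack[stack.length - 1]) {
      stack.pop();
      if (stack.length === 0) return i + 1;
    }
    // Any other character (including a stray closer such as the ">" in "=>") is skipped.
  }
  return -1;
}
```

A skipper like this never needs to know what the text between the brackets means, which is the scope for future syntax described above.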

The only reason for the original proposal is that a specific syntax is now in widespread use, so it’s only valuable to make JS tolerate and ignore that specific style of syntax, only because it is already in use everywhere in the world now, in Typescript mostly, to a much lesser extent in Flow and presumably in hundreds of other incubating experimental type checkers with very similar syntax but altered semantics. Therefore the challenge is to come up with the simplest rule for skipping that broad family of syntax.

Inventing an alternative syntax that, in retrospect, would have been easier to skip, is a theoretically interesting project.

Finding a way to skip an already widely-used syntax, such that the rules are simple, general, easy to implement and explain, and leave lots of scope for creative extension without needing to change the rules, is a practically valuable project (and hopefully will be theoretically boring, if it has a chance of succeeding!)

A wider point raised here is why JS should add this, given that it’s valuable to TS (et al) but not to JS.

One of the ever circling dangers for JS has been that it might become a compiler target rather than something people actually code in. A lot of credible waves of that attack have happened: coffeescript, GWT, asm.js/emscripten. This has settled down into two extremes:

Note that this is still a work in progress. TS got async-await first, then it was absorbed into JS, but TS retains some other vestiges that have no JS runtime equivalent and so require codegen; these will have to die out over time. (TS gradually becoming erasable is an example of it adopting a neat feature of Flow, not for the first time.)

The trend is for TS syntax to show up everywhere that people use JS, in teaching, in blogs, in docs, because it is so damn useful for communicating facts about the code. If JS pretends like this isn’t happening, it gradually stops being the most widely used language in the world, because (almost) everyone will actually be using TS and then stripping it out to get browsers/node to run it.

If JS finds a way to accommodate this established extra syntax, then all such languages will not just be supersets of JS - they will be JS. It will make JS’s future more secure. By being flexible, it won’t ever be broken.

ckruppe commented 2 years ago

@simonbuchan thx for your explanation. I see your point now and this really helped me understand some reasons behind it. I think for me not using it is also more a personal preference that's also responsible for why i learnt the C family of languages in the first place and never got into python for example. For this reason i got into dart. On personal projects, if it's a non-react project, i write dart for types and compile it down to js for the time being. :) About the assignment focus, i think you're right, it's more important. But in the actual typescript syntax i see the same problem. const myVar : ThisIsAReallyLongTypeForThisThing = 'value' seems to have no advantages in this context over: const ThisIsAReallyLongTypeForThisThing myVar = 'value'

The union type problem and the typescript specific reasons, on the other hand are really viable reasons.

For cognitive complexity i can at least say many of our junior devs have real problems getting into typescript if they are lacking in vanilla js skills. But it looks to me like the only downside to the advantages of converting ts to js and the other way around.

theScottyJam commented 2 years ago

@danielearwicker

Finding a way to skip an already widely-used syntax, such that the rules are simple, general, easy to implement and explain, and leave lots of scope for creative extension without needing to change the rules, is a practically valuable project (and hopefully will be theoretically boring, if it has a chance of succeeding!)

I assume, when you say this, that you currently don't feel like the current proposal (as presented in the README) fits this bill, and needs some polishing up to become more general and flexible. It also sounds like, from your comment, that you're not a big fan of diverging that far from TypeScript, which is fine (and probably means you're not a big fan of the idea I presented either). I don't think it's possible to have both flexibility and backwards compatibility like you're requesting here.

I'm not sure if you've actually read through this whole thread or not (there's been a ton of discussion), but I'll point you to my most recent iteration of the idea being presented (found here), I decided to adopt the : syntax, and allow it to be used wherever a binding is happening, exactly how the main README explains it. I then proceed to try and find general ways to accommodate everything else. I still feel there's room for improvement on this idea, and I'm mulling over some other alternatives (in particular, I'm not the biggest fan of the backslash character), but, for me, it seems to strike a good balance between backwards compatibility and being a flexible solution.

Since this latest iteration adopts the root proposal's system for inline type annotation, the only difference this idea has with the root proposal is the fact that sometimes you need to put some sort of sigil (like \ or ::) next to a type-specific keyword. And, I think this will be the trade-off we have to accept. We must choose between:

  1. Creating a bunch of special-case type-specific operators/keywords/constructs (like as, implements, readonly, public, private, protected, interface, declare, etc, etc). This has the advantage that it's more backward compatible with TypeScript, and it looks nice, but it has the disadvantage that the set of keywords and operators will be fixed - a type system can never introduce a new one (without going through the TC39 proposal process).
  2. Allowing anyone to come up with whatever operators/keywords/constructs they want, as long as they put a sigil next to that word. (like \as\, \implements\, \readonly, \public, \private, \protected, ::interface, ::declare, etc, etc). This gives type-systems the flexibility to add new operators, keywords, and constructs whenever they want, but we lose some backwards compatibility, and it doesn't look quite as nice.

I think those are the two options we have. For option 2, we might be able to find alternative syntax and rules than what I proposed to make it a little nicer, but no matter what, if we want flexibility, there's going to need to be some sort of sigil attached to these keywords, which in turn breaks backwards compatibility.

Anyways, I'm open to hearing your thoughts to see if you can think of anything that can make the root proposal more flexible while preserving backwards compatibility.

simonbuchan commented 2 years ago
  1. it has the disadvantage that the set of keywords and operators will be fixed - a type system can never introduce a new one (without going through the TC39 proposal process).

But ... why is that a disadvantage? In fact, isn't it a huge advantage to both parties? To discourage type systems from adding syntax that might collide with ECMA work, and having to either get ECMA consent or clearly be "out of bounds" if there's a collision in the future protects the users of both typed and untyped JS, so it's not an each-side-blaming-the-other situation. If anything, this is arguing for more syntax to be made available.

For example making the access modifiers valid but ignored would probably be considered sufficiently confusing and dangerous that it would not be accepted (I believe it's not currently in this proposal?) - but typescript's use of them (from the messier pre-release version) presumably meant that ECMA essentially couldn't when they added privates because the semantics for what they wanted were different to typescript's.

That sort of situation is exactly what the proposal fixes (going forwards), and it's probably more useful than the "don't have to compile for dev" examples it gives. (Now, not having to compile for scripts, on the other hand, is so fantastic I will strangle a baby for it)

simonbuchan commented 2 years ago

@ckruppe

But in the actual typescript syntax i see the same problem. const myVar : ThisIsAReallyLongTypeForThisThing = 'value' seems to have no advantages in this context over: const ThisIsAReallyLongTypeForThisThing myVar = 'value'

It is more about what you see first, compare:

const myVar : ThisIsAReallyLongTypeForThisThing = 'value'
const otherVar : ShortType = 'value'

and

const ThisIsAReallyLongTypeForThisThing myVar = 'value'
const ShortType otherVar = 'value'

In the first case you can quickly see the two identifiers, in the second you can quickly see the two types. Generally, the former is more useful because you make more identifiers than types. The cost of being harder to scan for types is both less of an issue (eg, you generally care about the type for a particular variable, not the set of them used in a function), and less bad, because identifiers tend to be much shorter.

But yeah, overall it's pretty minor. It's just an example of C syntax not actually being better in isolation (not considering familiarity).

jethrolarson commented 2 years ago

What I hope is accomplished here is that this spec turns into an annotation system that can be used for a type system or to add other arbitrary metadata to js code. With the right syntax you could implement a type system like:

const ::ThisIsAReallyLongTypeForThisThing myVar = 'value'

or

const myVar::ThisIsAReallyLongTypeForThisThing = 'value'

This choice would be the domain of the annotation consumer and how the type language works can be outside the purview of TC39.

E.g. I could see using such an annotation system data to implement macros, run-time types, automated unit test generation, code coverage markers, and such

simonbuchan commented 2 years ago

@jethrolarson a good summary of this issue.

Here's my summary of my two problems with it:

simonbuchan commented 2 years ago

E.g. I could see using such an annotation system data to implement macros, run-time types, [...]

On the other hand, the runtime behavior here is not possible. You can't both ignore and add behavior.

jethrolarson commented 2 years ago

@simonbuchan That's where transpilers come in, though. The code would be ignored if just run as is, but if pre-processed the annotation consumer could re-emit the code changed in arbitrary ways. Maybe only for dev environments, for example.

theScottyJam commented 2 years ago

Actually, this is a pretty interesting thought. If we generalize it enough, then other JavaScript supersets could use the carved out syntax as a place to stick their extra features, without fear that future expansions to JavaScript will cause issues with the extra syntax this superset language is adding. So, yeah, perhaps this carved out syntax space could be a place for transpilers to add behaviors that affect runtime as well. Don't know if people would actually do this or not, nor do I think it needs to be an explicit goal, but that's an interesting thought.

@simonbuchan

it has the disadvantage that the set of keywords and operators will be fixed - a type system can never introduce a new one (without going through the TC39 proposal process).

But ... why is that a disadvantage?

This is basically just me restating the 80% coverage issue you gave earlier, where you felt that type systems (that are built to be similar to TypeScript) will only be able to get roughly 80% of their features in the carve-outs provided by the main proposal. This is a disadvantage (even if we feel the advantages outweigh this disadvantage, it's still a disadvantage).

In fact, isn't it a huge advantage to both parties? To discourage type systems from adding syntax that might collide with ECMA work, and having to either get ECMA consent or clearly be "out of bounds" if there's a collision in the future protects the users of both typed and untyped JS, so it's not an each-side-blaming-the-other situation. If anything, this is arguing for more syntax to be made available.

I'm not exactly sure what you're saying here. If you're required to use a sigil (like \ or ::) to mark where type-related syntax is, then I don't know how a collision would ever occur. Type-parsers use the sigils for all type-related features, and EcmaScript will not use the sigil for any features that provide runtime effects. Maybe you're talking more about a conceptual collision, where TypeScript has an \as\ operator, and EcmaScript wants to introduce an as operator? It would be a shame if this sort of thing happens, but I don't think it's a show stopper at all. After all, existing code might also have as variables defined within it (as is not a keyword), also causing these sorts of conceptual conflicts, but EcmaScript is ok with this sort of overlap anyways. Plus, it's crystal clear to know that \as\ has something to do with your chosen type system, while as has something to do with the native language.
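To illustrate the point that existing code can already use as as an ordinary name, here is a tiny sketch (the function name is made up for the example). It is valid in both JavaScript-with-types-erased and TypeScript, since as is only treated specially in certain positions.

```typescript
// `as` is not a reserved word, so it is still a legal binding name.
const as = "plain identifier";

// Hypothetical helper that simply reads the `as` binding like any other variable.
function label(value: number): string {
  return as + ": " + String(value);
}
```

So a bare as already has to coexist with user bindings today, whereas a sigil-marked \as\ could only ever belong to the type system.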

simonbuchan commented 2 years ago

I'm not exactly sure what you're saying here. If you're required to use a sigil (like \ or ::) to mark where type-related syntax is, then I don't know how a collision would ever occur.

You've lost the context. That was a reply to you arguing against using the existing syntax. I was saying that getting the existing Typescript syntax inside a "ECMA blessed" space means that TS and JS have specific agreement on what each other can use, meaning that further extensions having to go through ECMA is in fact a good thing, not a bad thing.

theScottyJam commented 2 years ago

Ah, I think I see what happened, I must not have clearly expressed myself. When I was talking about the disadvantages of the two options, I was explaining what their advantages/disadvantages were with respect to each other. i.e. The root proposal's advantage is that it's more compatible with TypeScript syntax, while this proposal's advantage is that it's more flexible (and avoids the 80% issue). So, when you said the root proposal's disadvantage wasn't actually a disadvantage, but was actually an advantage because it avoided conflicts, I got confused, because I understood that to mean "its advantage over what I proposed".

But, if I understand correctly, you were just talking about the root proposal's advantage over what we have today, i.e. The root proposal forces TypeScript to live within the constraints laid out by EcmaScript if they want JavaScript engines to be able to run their code, which in turn means they'll never introduce new syntax that would overlap with what EcmaScript wants to introduce. Which, I whole-heartedly agree, this would be an advantage that both the root proposal, and this token-level comment proposal have in common over our current situation.

j-f1 commented 2 years ago

There is a language that uses :: for type syntax: Pyret

fun square(n :: Number) -> Number:
  n * n
end

Having used it, it isn’t that annoying to have to type an extra : and space.

matthew-dean commented 2 years ago

@danielearwicker

The trend is for TS syntax to show up everywhere that people use JS, in teaching, in blogs, in docs, because it is so damn useful for communicating facts about the code.

I don't know that we can safely say that's the trend. Some JavaScript devs don't use TypeScript and never will. As to TypeScript syntax being useful for communicating facts about the code, I would say: to whom?

For example, I 100% agree that (a: string, b: string) is useful and self-documenting.

But, when it comes to something like this...

  component<VC extends VueConstructor>(id: string, constructor: VC): VC;
  component<Data, Methods, Computed, Props>(id: string, definition: AsyncComponent<Data, Methods, Computed, Props>): ExtendedVue<V, Data, Methods, Computed, Props>;
  component<Data, Methods, Computed, PropNames extends string = never>(id: string, definition?: ThisTypedComponentOptionsWithArrayProps<V, Data, Methods, Computed, PropNames>): ExtendedVue<V, Data, Methods, Computed, Record<PropNames, any>>;
  component<Data, Methods, Computed, Props>(id: string, definition?: ThisTypedComponentOptionsWithRecordProps<V, Data, Methods, Computed, Props>): ExtendedVue<V, Data, Methods, Computed, Props>;
  component<PropNames extends string>(id: string, definition: FunctionalComponentOptions<Record<PropNames, any>, PropNames[]>): ExtendedVue<V, {}, {}, {}, Record<PropNames, any>>;
  component<Props>(id: string, definition: FunctionalComponentOptions<Props, RecordPropsDefinition<Props>>): ExtendedVue<V, {}, {}, {}, Props>;
  component(id: string, definition?: ComponentOptions<V>): ExtendedVue<V, {}, {}, {}, {}>;

...the usefulness is no longer in the readability / self-documenting nature of the syntax. Yes, it's useful on the TypeScript side, for type-checking. But at a certain point, TypeScript becomes machine-readable more than person-readable. You can, of course, hide away some of the much more complex types. But we shouldn't pretend that anything about it communicates something to a JavaScript user. It's pure noise.

If JS pretends like this isn’t happening, it gradually stops being the most widely used language in the world, because (almost) everyone will actually be using TS and then stripping it out to get browsers/node to run it.

I feel like there is no danger of JavaScript dying out, regardless of the outcome here. It's easy if you're in a TypeScript ecosystem to assume that everyone is using TS or on their way to using it, but plain JavaScript usage still far exceeds TypeScript usage.

matthew-dean commented 2 years ago

@simonbuchan

Why would an existing Typescript developer want to use something that means they need to update all their code? You need to demonstrate more benefit the more change you're asking. This is more important than you might think, because the people who want to use types already have an option, and nobody wants to spend time working on tooling that nobody is using, so there's strong network effects. If nobody uses this, then you just made this syntax unusable for no benefit. The Typescript syntax has the obvious win here, and with some small care it can be made usable by other existing and future languages.

This is ultimately a proposal for JavaScript. I think it's the wrong goal to think about it in terms about what's easiest / of the most utility to TypeScript. I wouldn't frame it as: what would make a TypeScript author of .ts want to move to just .js because I don't feel the authors really address that (or even propose it)? (Because, with a fast enough compiler, it's not that complicated / hard to just use .ts and transpile? Plus this proposal would still only cover a subset of current TS, so it's not usable for everyone.) I think it's more like, what would make it easier not for the .ts user, but for the JSDoc-style type-checking .js user (or other type systems using comments)? That is, if you want to write JS that's type-checked by TS, but is instantly runnable without transpiling, for whatever reason, could there be a system that's better / more flexible than JSDoc. This proposal IMO shouldn't be about: could we make a system that makes zero reason to have .ts files, because I don't think it would ever get there.

jethrolarson commented 2 years ago
  • where does the :: end? Near as I can tell, it's hard to reserve syntax that's nice to use (so that anyone uses this rather than a transpiler) and possible to parse, which means (again, as far as I can tell) you kind of need to define some set of possible syntax probably as big as the current proposal. At which point, why not use the simpler, unprefixed, in use syntax?

I have a proposal for that here https://github.com/jethrolarson/proposal-types-as-comments/blob/generic_annotations/README.md

simonbuchan commented 2 years ago

This is ultimately a proposal for JavaScript.

And proximally a proposal for users of types-in-JS. If you're debating the value of a change to JS, don't you think it's appropriate to think of who's going to use it and why? A Typescript user is not the only person who might use this, but they are a primary use case. If you are going to drop them as users then you make this proposal far less valuable.

I wouldn't frame it as: what would make a TypeScript author of .ts want to move to just .js because I don't feel the authors really address that (or even propose it)? (Because, with a fast enough compiler, it's not that complicated / hard to just use .ts and transpile?

As someone who has lost multiple weeks of their life to trying to get various flavors of ts-node configuration, multiple tsconfigs with carefully tuned includes, editor confusion about that, the current completely broken state of typescript node ESM, dodgy source mapping, the voodoo rituals to try to get various flavors of caching working? Having to write a webpack or roll-up config just to strip types, given how good their defaults are now? Uh huh. How nice for you.

This would significantly improve my experience, and I'm in a better position than many to make this less painful than it could be. Even the trivial case of writing a loose script to automate something and not needing to pick between the poisons of the garbage jsdoc syntax, installing packages, or having my editor completely confused about what's going on would on its own be enough to justify this. And I sure don't want to have to use maybe less garbage completely novel syntax just because you have, as best as I can tell, a theoretical issue with the idea of the proposed syntax, not even any concrete specific issues.

I think it's more like, what would make it easier not for the .ts user, but for the JSDoc-style type-checking .js user (or other type systems using comments)? That is, if you want to write JS that's type-checked by TS, but is instantly runnable without transpiling, for whatever reason, could there be a system that's better / more flexible than JSDoc.

You're saying that only people writing .TS files are typescript users, and the people using typescript to check their .JS are not? But even that distinction disappears with the original proposal, because it's only a syntax change between the two. The only real reason to use checkJs in Typescript is because it's that important to not have to introduce a build step. Guess what this proposal addresses?

This proposal IMO shouldn't be about: could we make a system that makes zero reason to have .ts files, because I don't think it would ever get there.

I'm not sure what point this is trying to make? Nobody's suggesting that. If it's even 5% of TS users, that's still a win. If checkJs users' lives become far nicer, that's a win. If everyone who writes a script can avoid all the headaches with ts-node, that's a win. If it's some new dev copy pasting some typescript into a browser console, that's a win. Not every proposal has to solve everyone's problem maximally.

That said, I think you're overestimating the actual missing features from typescript. The big ticket missing items from memory are:

So from what I can tell, you could absolutely just use this "80%" happily. But even if you can't ... that can only really justify an argument to add whatever missing critical syntax in some form, not to introduce a completely new syntax to replace everything?

And nothing's stopping those missing pieces getting added, once they demonstrate value.

ljharb commented 2 years ago

@simonbuchan i use checkJs with tsc on babelified code - in no way to avoid a build step (which is unavoidable and imo is not a worthy goal to pursue), but because i don't want to author in typescript.

simonbuchan commented 2 years ago

@ljharb It's not at all unavoidable if you're not shipping to browsers. Or developing.

Also, I can't possibly imagine that preferring jsdoc syntax to Typescript is an at all common preference, it's obviously worse in every conceivable way other than it "being JS"? Is there some reason you have to believe that others would also "not want to author in typescript", assuming that this proposal landed?

I ask because every JavaScript user's complaints about Typescript I've heard boil down to either "I don't want to use types" or "the syntax is too confusing", but you're using types, and with a more confusing syntax.

ljharb commented 2 years ago

@simonbuchan not that it affects this proposal either way - but if typescript's JSDoc support was anywhere close to the capabilities of native TS syntax, I would likely use it on all of my open source libraries - that, and the JS semantics TS isn't capable of typing, are the only reasons I don't. I would not, however, switch to using "not JS" on any of those libraries.

If this proposal landed, and a type system existed that actually covered JS semantics, I would certainly prefer using the combo over similar syntax inside of comments.

kee-oth commented 2 years ago

@matthew-dean I absolutely agree with you. This proposal should be focused on what's best for JS. Which isn't necessarily what's best for TS, regardless of TS's user base size.

@simonbuchan I use JSDoc because of TypeScript's shortcomings in the way I program. I've tried TS out plenty of times but I inevitably get blocked. As @ljharb mentions, there are "JS semantics TS isn't capable of typing". Modeling this proposal after what TS needs just doesn't make sense, regardless of user base size. Our starting point is a system that doesn't work for all of JS which is a really shaky foundation. TS should be used as a reference but not as the One True Way as that doesn't currently exist.

(JSDoc is) obviously worse in every conceivable way other than it "being JS"

You've made a sweeping claim that I don't believe is true. I can conceive of an advantage for JSDoc: you can easily add descriptions to parameters and properties. In TS, you'd just have to add JSDoc anyway (or TSDoc) or some other comment system. JSDoc can also be used for at least some of those semantics that TS can't be used for.
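To make the point about descriptions concrete, here is a minimal sketch (not taken from anyone's real codebase) of JSDoc carrying both types and prose descriptions in one place. It reuses the equals function from the original post; the description wording is invented for illustration, and the `// @ts-check` line assumes the file is being checked with TypeScript's checkJs.

```javascript
// @ts-check

/**
 * Checks whether two numbers are strictly equal.
 * @param {number} x - The left-hand operand (description text is illustrative).
 * @param {number} y - The right-hand operand (description text is illustrative).
 * @returns {boolean} True when both operands are the same number.
 */
function equals(x, y) {
  return x === y;
}
```

The file stays plain JS and runs without a build step; the types and the descriptions both live in the comment.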

simonbuchan commented 2 years ago

@ljharb I'm a bit confused, does this describe your preferences accurately, best to worst?

If so, shouldn't the original proposal (as opposed to this issue's proposals in general) be workable for you, assuming your harbscript semantics don't need more annotation space than the current proposal? If it might, where do you feel it is likely to be missing something?

The existing syntax of the proposal is speculative, and up for discussion, from what I can tell. Hell, I would be perfectly fine with dropping parts of it, e.g. interface and !, and requiring e.g. type Foo = interface { } and as NotNull<_> or whatever, if someone had a grounded reason to object. I would also be fine with adding speculative space that seems like it would be generally useful for other types of languages and checkers (e.g. Haskell style leading types), if a convincing case would be made and a workable solution found. That's essentially why the proposal process exists, after all! (Of course, it's the committee that needs to be convinced, not me; I merely expect that they have very broadly similar expectations from their previous decisions)

To be clear, I have absolutely no issue with people not liking typescript semantics. It has plenty of things I hate (overloads are basically broken, lack of consistency about declaration merging, inability to declare intersecting environments (e.g. node and browser), and so on). It is sufficiently performant and existing for my purposes, so I use it, but I would drop it for a better option in a heartbeat. One of my hopes for this proposal is that it encourages new type checkers, after all! So don't conflate the fact that the syntax starts to fit a reasonable subset of typescript syntax with the use of typescript as the checker. From the proposal text:

Additionally, type checkers, such as Flow, Closure, and Hegel may wish to use this proposal to enable developers to use their type checkers to check their code. Making this proposal be only about TypeScript can hamper this effort. Amicable competition in this space would be beneficial to JavaScript as it can enable experimentation and new ideas.

@kee-oth In the specific use-case of JSDoc for using Typescript's checkJs, you have exactly the typescript semantics, just with far worse syntax. If you're talking about using Closure or something, sure, but that's not what I was replying to. JSDoc can still be used for, you know, documenting.

ljharb commented 2 years ago

@simonbuchan lol at calling "the way JS works" something arbitrary :-p but that sounds right? i'm not decided yet on what's workable for me. but it's not clear that it's possible to build a type checker that works with actual JS semantics, and if that's never going to be possible, then I suspect any syntax in this space is not a good idea whatsoever.

simonbuchan commented 2 years ago

I didn't mention arbitrary, let alone about the way JS works? Not sure what you're referring to there.

Of course, any static type checker is only going to accept a subset of valid semantics. That's kind of the point! Which it accepts is a matter of preference and programming style. As such, we currently have at least Typescript, Flow and (new to me, mentioned by the spec!) Hegel, all with reasonably different semantics but, notably, extremely similar syntax. Perhaps it's not the ideal syntax for some future super-type system, but it gets to choose between transpiling until it makes its case for the syntax to be added, no worse than today, or it is one of the many cases of something not being perfect because we can't see the future.

theScottyJam commented 2 years ago

just because you have, as best as I can tell, a theoretical issue with the idea of the proposed syntax, not even any concrete specific issues.

Fair point. I'll try to make the 80% issue a little more concrete with a bit of research.

Let's imagine this proposal got dropped into JavaScript a few years ago. This means, we would have great support for TypeScript syntax at that time, but poor to no support for future syntax they've added since then. What would we miss out on? I went through the release notes between now (version 4.6) and version 3.6, which was released back in August of 2019. I looked for any new syntax they added, that could potentially cause issues if a proposal like this already landed that brought type syntax to JavaScript.

Some of the features they've added in the past couple of years pertain to type-annotation-related syntax (i.e. the stuff that goes after a colon). I found out from #105 that they do plan on being pretty rigid about the syntax of the types you use outside of brackets/parentheses (which I'll call top-level type syntax, since I want a name for it). The exact details of their tentative plans can be found in this grammar document. While they leave you room to do whatever you want inside of brackets, you're basically required to follow TypeScript syntax outside of brackets. This means, for example, if you want to have a type prefix in top-level type syntax, you have to use one of the explicitly defined prefixes available to you: readonly, keyof, unique, infer, and not. There's also explicit support for TypeScript's union and intersection types (via | and &), literals, array types (via someArray[] syntax), void, conditional types (via the exact syntax of something extends somethingElse ? type1 : type2), type predicates via x is y, function types, constructor types via new (...) => ..., etc, etc.

In general, this means TypeScript syntax is being blessed a lot more than I realized (the README made these type annotations sound much more flexible than what they really are. They're only flexible if you're using brackets). As an aside, this also causes me to retract my "second iteration" of what I had proposed here, since I was relying on the type-annotation syntax they presented in the README without realizing this syntax was so tied down to a bunch of specific TypeScript features.

And, while it's technically possible to add whatever syntactic features you want by requiring you to use parentheses, this technique suffers from a couple of issues:

  1. There wouldn't be any consistency on when parentheses are required and when they're optional. It basically boils down to if the specific feature you want to use was blessed by TC39 and added natively to the language or not.
  2. Using parentheses isn't that much more verbose than using regular comments. i.e. compare x: (readonly values[]) to x /*@readonly values[]*/

Anyways, the point of explaining this, is to make sure we have an understanding of how they plan on implementing type annotations, so we know how this will cause issues when newer type-related features come along.

So, without further ado, let's time travel back a couple of years, get this proposal in using an earlier version of TypeScript, then see what features TypeScript will struggle to bring into JavaScript as time goes on.

Now for some stuff that's not directly related to type annotations. Here, things get even more interesting.

All of these features, except support for abstract constructor types, are explicitly present in the current proposal (as outlined in their grammar document I shared earlier), and yet didn't even exist in TypeScript 2½ years ago. Think of all of the follow-on proposals TypeScript would need to have done in those 2½ years to bring these all in, and think of all of the follow-on proposals they would continue to have to do to add more syntax in the coming years. How many of these proposals would get accepted? How many wouldn't? Are these features all general features that are meant for any type-interpreter, or are they just stuff that TypeScript wants? The other option is to just be comfortable with basically using an older version of TypeScript's syntax in JavaScript, and be ok with not having the new shiny features, like being able to specify that you just want to import types from somewhere, or using the override keyword on class members. But, if we're resorting to this, we'd also need to acknowledge the fact that this proposal is so closely tied to TypeScript's current syntax that it can't even support future versions of TypeScript, let alone provide good syntax support for other type-parsers like Flow, thus defeating one of its own goals of being flexible enough to be type-parser agnostic, at least in terms of syntax. (Sure, Flow can have input in the design of this current proposal, but if Flow does things one way and TypeScript does things another, I can't imagine TC39 providing explicit syntax to support both ways, so only one way will win. You can't make both parties happy with syntax that's this rigid).

simonbuchan commented 2 years ago

@theScottyJam these are excellent points, and I agree completely regarding the current rules (given I've only glanced at them, but haven't taken the time to really look into them yet)

I was expecting and hoping for something a lot closer to Rust's macro rules, which is roughly "arbitrary token sequences up to a set of stop": https://doc.rust-lang.org/reference/macros-by-example.html#follow-set-ambiguity-restrictions

You couldn't do that too naively, after all types should be able to contain {}, but also in function return type position they should stop at the body opening brace. But surely that sort of thing is resolvable.

In any case, that is presumably a deliberately conservative first attempt, and can and will be improved (and specifics should probably be in a different issue)

simonbuchan commented 2 years ago

Note that #106 is the counter-argument, and seems pretty well researched at least.

acutmore commented 2 years ago
  • Template literal types were introduced in version 4.1, and look like this: x: `a${number}b`;. It's possible a past version of this proposal wouldn't have carved out syntax to make this possible as a top-level type, but it's also possible this wouldn't have happened, and a new proposal would be required to support this as top-level type syntax.

Hi @theScottyJam!

I’m fairly confident that string templates wouldn’t have been missed even if this proposal had been accepted before they were in TypeScript as strings clearly* require dedicated handling in the tokeniser as they can contain arbitrary sequences of characters.

You are 100% right that a core challenge for the designers of this proposal, like many TC39 proposals, is to try and find the balance of solving the known problems of today while keeping enough space for the future unknowns.

* clear to people who are familiar with writing compilers that is.

theScottyJam commented 2 years ago

@simonbuchan

But surely that sort of thing is resolvable.

Unfortunately, I'm not so sure. #106 does seem to be a good attempt at trying to generalize the grammar for type annotations, but unfortunately it has a handful of issues, including the issue with the conflicting { token (is it the start of a function body, or part of the type annotation?), and I don't see a way of resolving these sorts of issues while keeping the grammar generic. I laid out some thoughts over there. We'll have to see how that thread evolves.

In the meantime, I've noticed a couple of posts mentioning that their current plan for flexibility is to just wrap the type annotation in parentheses (as mentioned here), which worries me. I would prefer that either you don't have to use parentheses at all (i.e. they fully spec how type syntax works, even within parentheses - leaving us with no flexibility in syntax which isn't a good option), or they greatly cut back on the number of special-cases they're making to the type-annotation syntax, making it simpler, and thus easier for the end-user to know when parentheses are required (which also makes parentheses required more often). Route 1 will cause them to completely get rid of the idea of this proposal being friendly to different type-parsers, which would probably kill the proposal. If they go the second route, then we've basically got the token-level comments proposed from this thread, perhaps with a couple more bells and whistles and some extra restrictions to where that comment is allowed to be placed, but nothing too complicated.

Anyways, all of this is contingent on the fact that I don't think they'll be able to find a more flexible syntax for type annotations. If they do, that'll be great, and I'll watch the #106 thread to see if people are able to come up with something.

dpchamps commented 2 years ago

@theScottyJam I'm inclined to agree with you. I'm still mulling over the impact of just wrapping everything in parens. It looks like you've been thinking about it much more thoroughly.

Like you, I would prefer to not have the parenthesis at all. The additional syntax "escape hatch" may be indicative of a design smell -- or perhaps an opportunity to refine and simplify. It's at the minimum mildly inconvenient to have to type these extra tokens in order to construct any kind of type.

OTOH the type annotation symbol : is so ubiquitous in mainstream languages as well as academia (happy to back up these claims, but I don't believe them to be controversial), that I think departing from it would be much, much worse.

I would very much not like to see JS depart from the status quo here.

If we were to assume that the grammar was something like

TypeDeclaration :
  type BindingIdentifier TypeParameters_opt = ( <anything-at-all-production> )

TypeAnnotation :
  : (<anything-at-all-production>)

It does somewhat beg the question why typescript / flow-specific productions would be entertained at all in the Type production. If this is going to exist, why even bother with any of the type-specific grammars at all?

ahejlsberg commented 2 years ago

If we were to assume that the grammar was something like

That is pretty much what is being proposed in the grammar. If you look at the PrimaryType production you see that three of the choices are ParenthesizedType, SquareBrackedType, and CurlyBracketedType, the contents of which can be any sequence of tokens as long as all bracketed constructs are balanced (see the BracketedTokens production for how that is accomplished). The key idea with this definition is that we only need to teach ECMAScript parsers about the top level syntax of types, i.e. the type syntax that isn't bracketed, and that any new syntax a type checker adopts can be accessed just by putting it in parentheses.
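As a rough illustration of the "any sequence of tokens as long as all bracketed constructs are balanced" rule, the sketch below checks bracket balance with a stack. It is not the proposal's BracketedTokens production: the function name isBalanced is hypothetical, each character is treated as a token, and string literals and comments are not handled.

```javascript
// Maps each opening bracket to the closer it requires.
const PAIRS = new Map([["(", ")"], ["[", "]"], ["{", "}"]]);
const CLOSERS = new Set(PAIRS.values());

// Returns true when every (), [] and {} in `text` is properly nested and closed.
// Assumption: characters stand in for tokens; strings and comments are ignored.
function isBalanced(text) {
  const expected = []; // stack of closers we are still waiting for
  for (const ch of text) {
    if (PAIRS.has(ch)) {
      expected.push(PAIRS.get(ch));
    } else if (CLOSERS.has(ch)) {
      // A closer must match the most recently opened bracket.
      if (expected.pop() !== ch) return false;
    }
  }
  return expected.length === 0;
}
```

Under this reading, a parser would accept (readonly values[]) as a bracketed type without understanding readonly, and would reject (a | { b: [c }) because the square bracket is never closed.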

We could indeed go as far as saying all type annotations must be parenthesized, but I suspect most users would find that rather annoying--just as they find putting all type annotations in comments annoying.

dpchamps commented 2 years ago

We could indeed go as far as saying all type annotations must be parenthesized, but I suspect most users would find that rather annoying--just as they find putting all type annotations in comments annoying.

Yes agreed that it would be inconvenient. But I think the differences between type annotations in comments and enclosing type annotations in brackets are separated by a greater degree than just minor annoyance. Do you agree?

Anyways -- just to be clear -- I'm not necessarily advocating for the reduction of the grammar to ParenthesizedType, SquareBrackedType, and CurlyBracketedType. But it does seem strange to me to support what I will call (perhaps a bit flippantly) a special case such as ConditionalType, and stop there.

I probably need to refine the problem statement of https://github.com/giltayar/proposal-types-as-comments/issues/103, but this is essentially what I wanted to discuss there:

it would be nice if we thought about accommodating a wider variety of existing types within the grammar right now. So as to support future implementations without the annoyance of needing to construct bracketed types.

Let's push this notion of "no type system specified." I'm optimistic we can arrive at the best of both worlds.

ahejlsberg commented 2 years ago

Do you agree?

I agree it would be less of an annoyance, but an annoyance nonetheless. function foo(s: (string), n: (number)): (boolean[]) just makes you wonder what's up with all the parentheses.

But it does seem strange to me to support what I will call (perhaps a bit flippantly) a special case such as ConditionalType, and stop there

The grammar doesn't stop there. In fact it includes the entire top-level TypeScript type grammar as it currently exists.

jethrolarson commented 2 years ago

One thing I don't like about identifier: type is that : already has meaning in js so it's visually confusing when using TS. Particularly in conjunction with destructuring which iirc wasn't in JS when typescript was defining its syntax.

e.g.

const foo = ({bar: {baz}}: {bar: {baz: string}}) => ...

theScottyJam commented 2 years ago

@ahejlsberg - You don't necessarily have to require parentheses around all types to make it more type-engine agnostic. We could, for example, make it so after the : exactly one token will be consumed and ignored. If that token is an opening brace, then it'll continue consuming tokens until the closing brace is found. Thus, your example would look like this:

function foo(s: string, n: number): (boolean[])
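Here is a minimal sketch of how that one-token rule could behave, under some loud assumptions: the names tokenize and eraseAnnotations are made up for illustration, the tokenizer is a toy (words, digits, single punctuation characters, whitespace), and every : in the input is assumed to start an annotation, so object literals, ternaries, labels and strings are out of scope.

```javascript
// Maps each opening bracket to its closer.
const OPEN = new Map([["(", ")"], ["[", "]"], ["{", "}"]]);

// Toy tokenizer: whitespace runs, identifier-like words, digit runs,
// and any other single character. Not a real ECMAScript lexer.
function tokenize(source) {
  return source.match(/\s+|[A-Za-z_$][\w$]*|\d+|[^\sA-Za-z_$\d]/g) ?? [];
}

// Drops each ":" plus exactly one following token; if that token opens a
// bracket, keeps dropping until the matching closer is consumed.
function eraseAnnotations(source) {
  const tokens = tokenize(source);
  const out = [];
  let i = 0;
  while (i < tokens.length) {
    if (tokens[i] !== ":") {
      out.push(tokens[i++]);
      continue;
    }
    i++; // drop the ":" itself
    while (i < tokens.length && /^\s+$/.test(tokens[i])) i++; // skip spaces before the type
    if (i >= tokens.length) break;
    const first = tokens[i++]; // the single ignored token
    if (OPEN.has(first)) {
      const stack = [OPEN.get(first)];
      while (i < tokens.length && stack.length > 0) {
        const t = tokens[i++];
        if (OPEN.has(t)) stack.push(OPEN.get(t));
        else if (t === stack[stack.length - 1]) stack.pop();
      }
    }
  }
  return out.join("");
}
```

With this sketch, the signature above erases to function foo(s, n), while an unparenthesized let xs: boolean[]; would leave a stray [] behind, which is why the return type needs its parentheses under this rule.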

Perhaps, this is where it might be good to also mix in a bit of the idea being proposed in #84, which is to allow people to place the type of a declaration before the declaration. i.e. allow users to avoid verbosity by writing their function types like this:

::(s: string, n: number) => boolean[]
function foo(s, n) { ... }

I think I'm going to take another stab at reformulating what I originally proposed in this thread, using these rules. I think, as this discussion has progressed, we're gradually converging on something that's more and more user-friendly while still being agnostic to a specific type-engine implementation.

@jethrolarson - I wholeheartedly agree here. Maybe at some point I might open up a ticket to discuss this point specifically - seeing if people would be willing to swap out the : token for another one. We'll see.

dpchamps commented 2 years ago

@ahejlsberg

The grammar doesn't stop there. In fact it includes the entire top-level TypeScript type grammar as it currently exists.

Sorry, wasn't clear. I meant, "it seems odd to stop at the TypeScript type grammar as it currently exists." I was using ConditionalType as an example of this. I understand it goes wider in some areas, for example not stands out to me... though I believe this is already in the works with TS.

What I mean is: throughout the spec and issues, one of the goals is clear: "to be type system agnostic". There are other interesting types that other type systems offer which TypeScript does not currently provide that may require additional syntax. I think it's fine TypeScript doesn't provide them. But I'd like to help push the specification towards a grammar that is permissive of a wider set of types conveniently.

As it stands, we can either write TypeScript (or Flow, or maybe Hegel) conveniently, or something else inconveniently -- or as we've both agreed, annoyingly.

Does that make sense?

I would liken this idea to the idea here: https://github.com/giltayar/proposal-types-as-comments/issues/80#issuecomment-1067735994, which I'd roughly summarize as "this proposal would have driven template literal types in the absence of their existence within the TypeScript type system."

jethrolarson commented 2 years ago

@theScottyJam the problem I see with this

::(s: string, n: number) => boolean[]
function foo(s, n) { ... }

Even though I proposed it I think it's forcing the annotation consumers into a certain way of specifying types. Generic rules around start-and-end of the annotations which can appear anywhere in the source just like comments give the consumers more flexibility, which should better stand the test of time.

::then
::(you can do) const ::whatever foo ::you = ::like 1 ::(to do)

theScottyJam commented 2 years ago

@jethrolarson - perhaps, let me briefly explain what I'm envisioning.

  1. We support token-level comments (the @ from my original post), via something like the backslash character.
  2. We support line/block-style comments (the @@ from my original post), via something like ::.
  3. We allow users to place a : after any binding. This acts as a token-level comment as well. This isn't really needed since we already have the \ character serving this purpose, but it does make the code look a little nicer, and less like back-slash soup.

This means, if TypeScript chooses to support putting type-declarations on the line before, they can do so like this:

::(s: string, n: number) => boolean[]
function foo(s, n) { ... }

And it might be worthwhile for them to think about adding such a feature because 1. I'm selfish and I like it better 😄️, and 2. it could help with some of the verbosity of sometimes being required to use parentheses after the colon. But, no one is going to require them to do this, and they can get by just fine if they choose to continue relying on inline type-annotations, it'll just sometimes be a bit more verbose to use them.

The other rules I presented would still enable this:

\then
\(you can do) const \whatever foo \you = \like 1 \(to do) 

And you can also do this:

function fn(x: number, y: (number | string)) { ... }

::interface {
  x: number
}

return x \as\ number

// etc...
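For the :: line/block rule, a rough sketch of one possible reading follows: everything from :: to the end of the line is dropped, and if a bracket is still open at the end of that line, dropping continues until the brackets close and that line ends. The name stripDoubleColon is hypothetical, only (), [] and {} are tracked (angle brackets are left out so that => doesn't confuse the count), and :: inside strings or ordinary comments is not handled.

```javascript
// Removes "::" annotations from `source` under the assumptions stated above.
function stripDoubleColon(source) {
  const openers = "([{";
  const closers = ")]}";
  let out = "";
  let i = 0;
  while (i < source.length) {
    if (!source.startsWith("::", i)) {
      out += source[i++];
      continue;
    }
    i += 2; // skip the "::" marker
    let depth = 0; // number of brackets opened but not yet closed
    // Keep skipping until we hit a newline with no brackets left open.
    while (i < source.length && !(source[i] === "\n" && depth === 0)) {
      if (openers.includes(source[i])) depth++;
      else if (closers.includes(source[i]) && depth > 0) depth--;
      i++;
    }
  }
  return out;
}

const sample = [
  "::(s: string, n: number) => boolean[]",
  "function foo(s, n) { return true; }",
  "::interface Person {",
  "  x: number",
  "}",
  "foo();",
].join("\n");

console.log(stripDoubleColon(sample));
```

The function declaration and the foo() call survive untouched, and the two annotated spans collapse to empty lines, so line numbers in the erased output still match the source.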

I'm just picking the double-colon and backslash to do those jobs, because that's what I had used in my earlier revision, and I think they look nice - relatively speaking. I'm also good with treating the double-colon as the token-level comment character instead, and finding something else to handle line/block-style comments. I know you mentioned earlier that this could already be done via ::(lots and \n lots of \n content) (assuming the :: is the token-level comment), which is certainly an option that I would be ok with. I just like the looks of these blocks of syntax without the extra nesting caused by the parentheses, which is why I'd prefer we additionally had a prefix to indicate this is a larger block and everything in it should all be ignored. But, either way - I think a proposal that only had :: in it as a token-level comment is already extremely powerful and may be able to get us very far.

jethrolarson commented 2 years ago

I'm wary of using backslash because of all the tooling around code and backslash is an escape character. Smells like weird bugs waiting to happen

matthew-dean commented 2 years ago

@theScottyJam So, I've been reading and thinking about the Stage 3 decorators proposal, and I came across the "annotation" syntax that TC39 themselves proposed for later exploration. I think with using the behavior and semantics of decorators, there's actually a path forward where annotations could be nearly anything, would not fundamentally change JavaScript syntax, and yet would / could provide any static-analysis type-checking that TypeScript / Flow / others might want.

Take a look! https://github.com/matthew-dean/proposal-annotations

matthew-dean commented 2 years ago

@theScottyJam Here is your original example, re-written with Stage 1 of the above proposal:

let @'string' x;

@'boolean'
function equals(@'number' x, @'number' y) {
    return x === y;
}

let @{
    name: 'string',
    age: 'number'
} Person 

let @'boolean' CoolBool;

class MyClass {
  @'number' name;
}

function fn(@'number' @'string' value) { ... }

// optional params
function fn(@'?number' value) { ... }

// import/export types
export let @{ ... } Person

import { Person } from "schema";

import { Person, aValue } from "..";

// type assertions
const point = @{x: 'number', y: 'number' } JSON.parse(serializedPoint)

// Non-nullable assertions
(@'!' document.getElementById("entry")).innerText = "...";

// Generics
@'<T>'
function foo(@'T' x) { ... }

// "this" param
@['this', SomeType]
function sum(@'number' x, @'number' y) { ... }

// Ambient Declarations 
// omitted - up to the underlying type system

// Function overloading
// omitted but a type system could define it using some structure of annotation

// Class and Field Modifiers (These are easy to add as well, if wanted)
class MyClass {
  @'protected' @'readonly' @'number' x = 2;
}

// Allowing someone to "implement" an interface - just spitballing
@['implements', MyInterface]
class MyClass  { ... }