Carve less syntax out, with more generic, token-level comment syntax

tc39 / proposal-type-annotations

ECMAScript proposal for type syntax that is erased - Stage 1

https://tc39.es/proposal-type-annotations/

Carve less syntax out, with more generic, token-level comment syntax #80

Open theScottyJam opened 2 years ago

theScottyJam commented 2 years ago

I know a big push for this proposal is to try and put as much TypeScript syntax into JavaScript as possible. But, it could be worthwhile to explore what this proposal would look like if we didn't focus so heavily on this objective. Considering the fact that most users will have to run a codemod anyways to change their TypeScript code to be valid JavaScript, I don't think it's that bad of an idea to stray a bit further from current TypeScript syntax.

Let me propose a much simpler form of this proposal, that tries to carve out much less syntax, while still being ergonomic to use. All I'm going to do is introduce a simple token-level comment to the language, which works as follows:

Simply place a "@" character before a token to cause the language to ignore the following token. i.e. in const x @number = 2, the @ will cause "number" to be ignored.
If the ignored token is a "(", "[", "{", or "<", then the language will ignore content until a closing bracket is found. i.e. in const x @(number | string) = 2, everything within the parentheses is ignored. These groupings can be nested.
If the ignored token is followed by a colon, then the colon and the token after the colon is ignored as well. i.e. in class MyClass @implements: MyInterface { ... }, the "implements" token will be ignored, and since there's a colon that follows it, the colon, and the word MyInterface will be ignored as well.

We can also add a "@@" syntax, that will cause everything to the end of the line to be ignored, plus, if any opening brackets were found in that line (via "(", "[", "{", or "<"), further content will continue to be ignored until a closing bracket is found. Examples for this will be demonstrated below.

(The "@" character can of course be bikeshedded. I know it's currently being used by the decorator proposal, but we could have decorators use something else, like a two-character token).
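To make the rules above concrete, here is a rough sketch of a stripper for them, written in TypeScript. It is only an illustration under some loud assumptions: the function names (`stripTokenComments` and friends) are made up for this example, a "token" is approximated as either a run of identifier characters or a single punctuation character, the colon in the third rule must directly follow the ignored token, and strings, regular comments and template literals are not recognised at all (so an "@" inside a string would be mangled).

```typescript
// Hypothetical sketch of the "@" / "@@" rules. Not a real tokenizer.

// Returns the closing bracket for an opening bracket, or undefined.
function closerOf(ch: string): string | undefined {
  switch (ch) {
    case "(": return ")";
    case "[": return "]";
    case "{": return "}";
    case "<": return ">";
    default: return undefined;
  }
}

// Skips a (possibly nested) bracket group that starts at `start`.
function skipGroup(src: string, start: number): number {
  const stack: string[] = [];
  for (let i = start; i < src.length; i++) {
    const ch = src.charAt(i);
    const closer = closerOf(ch);
    if (closer !== undefined) {
      stack.push(closer);
    } else if (stack.length > 0 && ch === stack[stack.length - 1]) {
      stack.pop();
      if (stack.length === 0) return i + 1;
    }
  }
  return src.length; // unbalanced input: ignore the rest
}

// Skips one "token": a bracket group, an identifier-like word, or one character.
function skipToken(src: string, start: number): number {
  if (closerOf(src.charAt(start)) !== undefined) return skipGroup(src, start);
  const word = /^[A-Za-z0-9_$]+/.exec(src.slice(start));
  if (word) return start + word[0].length;
  return Math.min(start + 1, src.length);
}

function skipSpaces(src: string, start: number): number {
  let i = start;
  while (src.charAt(i) === " " || src.charAt(i) === "\t") i++;
  return i;
}

// Rules 1-3: "@" ignores the next token; a directly following ":" extends
// the ignored region over the colon and one more token.
function skipAtComment(src: string, at: number): number {
  let i = skipToken(src, at + 1);
  if (src.charAt(i) === ":") i = skipToken(src, skipSpaces(src, i + 1));
  return i;
}

// "@@": ignore to the end of the line, continuing past line ends while
// brackets opened inside the ignored region are still unclosed.
function skipAtAtComment(src: string, at: number): number {
  const stack: string[] = [];
  for (let i = at + 2; i < src.length; i++) {
    const ch = src.charAt(i);
    if (ch === "\n" && stack.length === 0) return i;
    const closer = closerOf(ch);
    if (closer !== undefined) stack.push(closer);
    else if (stack.length > 0 && ch === stack[stack.length - 1]) stack.pop();
  }
  return src.length;
}

function stripTokenComments(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    if (src.charAt(i) !== "@") {
      out += src.charAt(i);
      i++;
    } else if (src.charAt(i + 1) === "@") {
      i = skipAtAtComment(src, i);
    } else {
      i = skipAtComment(src, i);
    }
  }
  return out;
}

const stripped = stripTokenComments("let x @string;"); // "let x ;"
```

With this sketch, `function fn(value @?:number) {}` comes out as `function fn(value ) {}`, and a multi-line `@@interface Person { ... }` disappears entirely, leaving only the line break after the closing brace.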

Here's what it looks like in practice:

// The "@" and "string" following it are both ignored
let x @string;

function equals(x @number, y @number) @boolean {
    return x === y;
}

// Everything after @@, until the end of the line is ignored.
// Also, since "{" was found, all content within the { ... } grouping is also ignored.
@@interface Person {
    name @string; // "@" doesn't have to be used here since this is all ignored, but for consistency, it is.
    age @number;
}

@@type CoolBool = boolean;

class MyClass {
  name @number;
}

// More complex types can be wrapped in parentheses
function fn(value @(number | string)) { ... }

// optional parameters
// Notice how after the "?" token, there's a ":", causing the colon and the token after it to be ignored as well?
function fn(value @?:number) { ... }

// import/export types
@@export interface Person { ... }

@@import type { Person } from "schema";

import { @type: Person, aValue } from "...";
// After the ignored content is removed, this line will look like this: import { , aValue } from "...";

// type assertions
// Notice, again, how the ":" after "as" causes further content to be ignored.
// Also note that the "@" before "number" isn't strictly necessary as everything within the { ... } is being ignored,
// but it's being used anyways for consistency.
const point = JSON.parse(serializedPoint) @as: { x @number, y @number };

// Non-nullable assertions
document.getElementById("entry")@!.innerText = "...";

// Generics
function foo@<T>(x @T) { ... }

// "this" param
@this: SomeType function sum(x @number, y @number) { ... }

// Ambient Declarations (currently being considered to not be included to keep the proposal smaller, but now it can be trivially added)
@@declare let x @string;

@@declare class Foo {
    bar(x @number) @void;
}

// Likewise, function overloading might not get added to this proposal as it currently stands, but it's trivial to add with this idea
@@function foo(x @number) @number
@@function foo(x @string) @string;
function foo(x @(string | number)) @(string | number) {
    ...
}

// Class and Field Modifiers (These are easy to add as well, if wanted)
class MyClass {
  @protected @readonly x @number = 2;
}

// Allowing someone to "implement" an interface
class MyClass @implements: MyInterface { ... }

So, yes, perhaps some of those examples aren't as nice-looking as the TypeScript variants, but they're not bad either, and most of them seem equivalent verbosity-wise. And, remember, this is just showing what's possible if we choose to only introduce these simple rules, we could still choose to go with a mix of the current proposal and this idea, where we use the "@" and "@@" syntax for most items, but we also add, for example, a no-op "as" operator so people can write x as yz instead of x @as: yz.

There's some other benefits if we go this route:

This is probably the biggest pro: The proposal is much more flexible for new innovations. Right now, every syntactic feature that TypeScript wants needs to go through the TC39 proposal process and be approved, and for this to happen it needs to be considered important for all static type checkers. It's generally not easy to convince TC39 to add new syntax to the language, so this will be a giant bottleneck for innovation. We can already see how much of a bottleneck this will be, by seeing all of the features listed in the README that TypeScript currently supports, but they're considering not adding to this proposal in an effort to keep this proposal from growing too big. All of these features are trivially supported with the simple syntactic carve-out I'm proposing here.
It's trivial to learn what code has real runtime effects and what code does not. If there's a "@" involved, then it's a comment. Simple.
This is a minor point, but it's something that irks me about TypeScript, which this proposed syntax helps solve: code that represents types looks very similar to code that represents actual JavaScript values. Take this for example:
```
function fn({ x, y }: { x: Thing1, y: Thing2 } = { x: new Thing1(), y: new Thing2() })
```
The "type" part of that line of code, { x: Thing1, y: Thing2 }, is valid JavaScript syntax by itself. If you're quickly scanning the code, the only way to realize that this is a type, not a runtime value, is to notice the tiny ":" in front of it. Now compare this with what's being proposed here:
```
function fn({ x, y } @{ x @Thing1, y @Thing2 } = { x: new Thing1(), y: new Thing2() })
```
Ah, much better. The "type" part of that statement now actually looks different from everything else, it's much easier to scan.
Continuing with the previous point, TypeScript's choice of the ":" token has caused a lot of issues for them, because it's a token that already has so many meanings. Because this idea is not using the ":" token, we can rewrite the previous example in an even simpler way that TypeScript syntax can't support:
```
function fn({ x @Thing1 = new Thing1(), y @Thing2 = new Thing2() } = {})
```
If we ever want to add runtime behaviors to a type feature after-the-fact, we are now able to do so. For example, TypeScript can immediately support "x@!.y" syntax, and at a later point, JavaScript could add a "x!.y" feature that has runtime meaning. We don't have to figure out up-front which type-features should have runtime meaning with this proposal. (In a similar vein, if we ever decide to give JavaScript an actual, official type system, this proposal leaves enough syntax space to do so, the "official" type system can just use different syntax from the "@" syntax being proposed here.)

The downside here is that this looks very different from TypeScript syntax, which I know can be a bit off-putting, especially since it can be really hard to call this sort of thing "TypeScript" when it looks so different from TypeScript. IMO, the up-sides outweigh this downside, though I would be interested to hear other thoughts on this matter.

Update: It has been mentioned that we can't use the @ token - we won't be allowed to take that out of the hands of the decorator proposal. I've put together another iteration of this idea which uses different tokens, and incorporates other feedback that's floated around. I still feel there's more room for improvement (e.g. I'm not a giant fan of the back-slash character, I'm still mulling over alternative ideas), but I think it takes a couple more steps in the right direction. You can see it presented in this comment.

wparad commented 2 years ago

I want to :clap: so much, but that isn't a valid reaction. We should start with the optimal solution and then figure out which compromises we can/should make. Trying to put TS and other type systems into JS begs the question "Is TS and others actually the right solution?". Personally, I hate them, don't use them, and it wouldn't be hard to define new ones that I think are better. Now, that alone isn't a reason to do it, but starting with the ideal and working back to reality is a much better strategy than starting with a flawed premise. TS and others can easily change to support whatever we want, so let's start with the aspirational goal.

theScottyJam commented 2 years ago

That's certainly an option.

I like "@", because it's a single token that's not too noisy, and it's going to be used very often (far more often than decorators). I believe it's the only ASCII character that's completely unused in the language right now, which makes it the only single-character token that can be used for this purpose. The decorator proposal could switch to using something like "%%" or whatever instead, they don't have to be as terse as this proposal.

Though, either way works.

ljharb commented 2 years ago

Decorators can't switch, for the same reason that decorators and private fields didn't swap sigils years ago - there's too many tutorials and blogs and example code out there using @ that it would be too difficult to update if they switched. Decorators, and nothing else, must use @.

theScottyJam commented 2 years ago

Oh, if anything I would think it's an advantage for decorators to switch, considering how many renditions of the decorator proposal there's been, and how many of those blogs and tutorials would be related to the older versions of the proposal.

Though, if this is really the case, we can find a different syntax to use here.

matthew-dean commented 2 years ago

Yeah, other than the @, which we can't use because of decorators, I'd been hoping for some push in this direction. I agree that something that's like a tag or note is ideal vs. a whole smorgasbord of TypeScript syntax and features which a JS parser / interpreter is just supposed to treat "as comments". There's a lot of cognitive overhead with that path.

I saw in the docs the % character. Obviously that's used as well, and it's hard to find an unused character (which is why JS proposals keep reaching into the grab bag for @ and #), so any character would have to have clear rules. You could also change placement to keep semantics clear. There's lots of syntax ways to do the same thing, like for example (spitballing on the OP code)

let x::string;

function equals(x::number , y::number)::boolean {
    return x === y;
}

::interface Person {
    name::string;
    age::number;
}

::type CoolBool = boolean;

class MyClass {
  name::number;
}

So, that is to say, despite liking TypeScript, I'm hoping for a refinement of the proposal to a single character sequence to define these "ignore this" pieces, akin to a 3rd type of comment (annotation) with different comment-ending rules. It keeps the whole proposal simpler and can still, I think, address the verbosity of JSDoc-style annotation.

lmcarreiro commented 2 years ago

What about the backward slash \?


let x \string;

function equals(x \number , y \number) \boolean {
    return x === y;
}

theScottyJam commented 2 years ago

Oh, yeah, I guess the back-slash isn't a used token yet either, so that's certainly an option. It looks a bit funky, but I think it's fine.

I am starting to like the "::" syntax more as well, after seeing it used in examples.

So, when it comes to syntax-bikeshedding this idea, there's three main things we'll need to be aware of.

A way to ignore a single token (or grouping) (originally I used the @ delimiter for this)
A way to ignore an entire statement/structure like an interface (originally I had @@ for this)
The ability to make type-related operators. Originally I added the colon rule for this purpose. The "as" operator could be written as @as:, and because a colon followed @as, the colon along with the subsequent token after it would be ignored as well. This felt nicer to me than prefixing each piece with an @ character, but the value of having specific support for this third point can certainly be debated.

So, if we decide to use the :: syntax to fulfill the first part, and try to do a straight translation of the other two rules, we'd be left with something like this:

// 1. Normal usage
let x::number

// 2. There's a third colon here, to make it ignore this whole statement
:::interface MyInterface { ... }

// 3. Note the single colon after "implements" to make the ignoring continue on to MyThing
class MyClass ::implements: MyThing { ... }
x ::as: y

The syntax for point 2 and 3 isn't ideal here with this "straight translation" version, but perhaps there's another way to go.

// 1. Normal usage (unchanged)
let x::number

// 2. Something like ":|" can be used to delimit a statement. Dunno.
:| interface MyInterface { ... }

// 3. We don't have to do anything special for this. Just use a bit of spacing tricks to squish
// the next "::" into the "implements" pseudo-operator.
class MyClass ::implements:: MyThing { ... }
x ::as:: y

simonbuchan commented 2 years ago

Seems like there's a lot of "I don't use typescript but I want to make things awful for people who do" people in here.

Let me argue the counter-position: no. The entire point is to carve out a "nice" space for type syntax. If you're being at all compassionate, this would ideally include the huge quantity of existing "javascript with type syntax" out there - overwhelmingly typescript by most measures; also obviously this should not be narrowly tied to typescript's specific syntax.

Granted, I'm a little dubious about the technical details of how to carve out some cases, e.g. type ... in particular, but they are technical details and deliberately not addressed in detail yet, and the goal of explicitly reserving space for types that wouldn't be used anyway because it would collide with typescript is good. Especially as it lowers the barrier to entry for new type systems! They don't need to start with a transformer and only need to parse their specific type syntax.

theScottyJam commented 2 years ago

Seems like there's a lot of "I don't use typescript but I want to make things awful for people who do" people in here.

Not at all. I know at least one other person here expressed a dislike towards TypeScript, but I personally use TypeScript and love it. I also don't feel like the syntax carve-out being proposed in this thread is "not nice", it's just different than the syntax TypeScript is currently using, with a different set of pros and cons. If you don't particularly like the way this proposed syntax looks, that's fine - this is where, perhaps, we can also discuss a middle-ground, where a generic, flexible syntax is provided to handle arbitrary current and future needs, but more specific syntax is also provided to help with some of the uglier parts, whatever we feel those uglier parts may be.

But, overall, I don't feel like we need to hold tight to the syntax that's already being used in TypeScript, if we can find an alternative syntax that's much simpler and more powerful. Sure, it would make the transition to the JavaScript syntax a bit more bumpy, and I am apologetic to this, but I think in the long run it could be a good thing. (Plus, large complicated proposals with lots of syntax changes typically have a really difficult time getting through).

simonbuchan commented 2 years ago

I would consider more than 1% of typescript (by lines) having to be updated a very sad outcome. Obviously, foo<T>(bar) is a case where it must be adjusted, handling declare is a sandpit that is being punted for now, etc., but so far all the suggestions here are "re-write (automated or not) pretty much literally everything to an alien syntax"... for very dubious benefit?

This proposal is obviously most immediately useful for typescript users, but even putting on the "I'm inventing a new type system" hat I would far prefer to have more semantic, designed carve-outs than just another syntax for comments. As a note, I expect if that were useful, we would have seen much more uptake on type-systems in comments than pretty much just typescript's JSDoc checking support.

Putting on my "I'm a typed Javascript programmer" hat, I don't want syntax that, bluntly, sucks. Putting @ or :: or whatever in front of everything would suck. It's worth noting here that ActionScript, Typescript and Flow, with similar constraints, came to pretty much identical syntax, despite having quite different actual checking logic and not having all that much deliberate attempt to converge, AFAIK. The existing general syntax seems to be just the natural option for adding types to all the places you want to put types given Javascript's existing syntax.

theScottyJam commented 2 years ago

for very dubious benefit?

Perhaps, let me try and expound more on the benefits I see here.

Carving out syntax is expensive.

Carving out as much syntax as this proposal is hoping to do isn't an easy task. There's only limited syntax space available to JavaScript, so new syntax has to pass a very high standard before it gets added in. Each proposal that creates syntactic changes tends to undergo a lot of bike-shedding in order to find a new syntax that doesn't conflict with any existing JavaScript (it's not like JavaScript can just reserve new keywords and what-not, so sometimes the new syntax ends up a little ugly or quirky just to preserve backwards compatibility). Each new bit of syntax we reserve for a feature makes that sort of syntax forever out of our reach in future proposals.

You can see this effect happening even in this thread. I tried to use the @ token, because I thought it would work best, but was told the decorator proposal was far too late in the proposal process for me to use it. Since most other ASCII characters were already being used for other purposes, this leaves us having to choose something that's arguably less-than-ideal to make this idea work.

JavaScript has already used up a fair amount of syntax space, which has created a lot of difficulties (and ugliness) when it comes to adding new syntax. Now, what happens when we bring in a proposal this big, with all of the syntax changes it wants to do? That's a lot of syntax space that'll forever be out of our reach for the future. And, what's worse, people will keep requesting new syntax carve-outs for these type comments as type-safe languages continue to come out with new innovations that need new syntax.

The EcmaScript proposal process is slow

This point is also argued in the README, and it's the reason why they're avoiding spec-ing the specific details of what the type syntax would look like. They want to be able to rapidly innovate and come up with new type-related syntax, without having to go through a proposal process to implement these syntax ideas. What's being proposed here will let TypeScript rapidly innovate on a much wider range of syntax ideas without having to go through TC39.

EcmaScript proposals won't implement all of the syntax that TypeScript wants/needs.

Because of point 1, and the higher learning curve associated with new syntax, EcmaScript sets a really high bar for new syntax. This means there's simply going to be a fair amount of syntax from TypeScript that will never be able to make it into the language. For example, I would be surprised if a proposal to add access-modifier type-comments into the language ever makes it through (considering the language already has "private", and decorators might be able to help implement some of the other access modifiers). Syntax features that are specific to a single type-safe language would also have a difficult time getting through the proposal process, because this syntax is meant to be usable by any type-interpreter. This means, if the proposal continues in its current direction, people will simply have to choose between using a less-powerful version of TypeScript (TypeScript-in-JavaScript), or using the whole thing and having a compile step. What's being proposed in this thread would instead make the choice be between having a fully-capable TypeScript-in-JavaScript but at the cost of using syntax that's a bit uglier.

To address some of your other points:

but so far all the suggestions here are "re-write (automated or not) pretty much literally everything to an alien syntax"

Yes, and you're correct that this is certainly a downside that we need to weigh in, so I hope I don't downplay it too much. But, if in the long-term this means TypeScript is able to innovate faster and provide new syntax whenever they want, I'm ok with having to do a migration. And, it's also good to realize that migration is not forced - some people might just choose to stick with the current build tooling they have, because "why fix what ain't broken", especially if TypeScript plans to continue supporting the fully-compiled version of itself. Even still, yes, I do recognize that this is going to be a pain point, probably the biggest one when it comes to this idea.

As a note, I expect if that were useful, we would have seen much more uptake on type-systems in comments than pretty much just typescript's JSDoc checking support.

I'm not sure this is really a fair comparison. I mean, let's compare them.

// js-docs
/**
 * @param {string}  p1
 * @param {string} [p2]
 * @param {string} [p3]
 * @param {string} [p4="test"]
 * @return {string}
 */
function stringsStringStrings(p1, p2, p3, p4 = "test") {
    // TODO
}

// TypeScript
function stringsStringStrings(p1: string, p2?: string, p3?: string, p4 = "test"): string {
    // TODO
}

// This thread's original idea
function stringsStringStrings(p1 @string, p2 @?:string, p3 @?:string, p4 = "test") @string {
    // TODO
}

Sure, the TypeScript syntax looks the cleanest here, but this thread's version isn't that much worse. I'm not a huge fan of how this idea deals with the optional parameters, so perhaps that's one of the middle-ground areas that we add extra, explicit syntax to help out. If we do that, then this thread's syntax would be just as nice as TypeScript's current syntax. Either way, both of these are much, much better than the js-doc version.

It's worth noting here that ActionScript, Typescript and Flow, with similar constraints, came to pretty much identical syntax, despite having quite different actual checking logic and not having all that much deliberate attempt to converge, AFAIK.

I sort of doubt that this wasn't deliberate. Languages often copy syntax from other languages to give themselves a more familiar feel. That's why JavaScript's syntax looks so much like Java's, despite having some radically different ideas under the hood. From what I understand, its syntax was originally supposed to be very different, but was changed so developers would feel more at home when using it.

So, likewise, it's certainly a downside that the syntax proposed here will be pretty different from what anyone is used to, which would in turn increase the learning curve for these features. But, it's also just a different (and slightly uglier) skin for the same feature set. The specific details of how the syntax works/looks are often considered to be one of the least important things about a feature, yet one of the most discussed items (hence, the origin of the term "bike-shedding" - everyone focuses on the bikeshed for a new building). My hope is that the benefits of a flexible syntax will outweigh the ugliness of this syntax. Perhaps they don't, and you're right, and it would be better to carve out individual features than to provide a general-purpose, flexible syntax. I don't know. What I do know is that the amount of syntax being proposed by this proposal is a bit scary, and I'm not confident that a proposal of this size will ever be able to reach the end of the proposal process unless it "loses some weight" somehow.

Nixinova commented 2 years ago

I agree with using something else instead of :; I think a tilde ~ would be best. Porting TypeScript's syntax verbatim would cause more issues than it solves - most programmers reading something like let foo:number; would assume you would not be able to put a number in it and it seems very easy to forget that it's actually just a suggestion. Something like let foo ~ number; would be more obvious in what's actually happening with tilde implying "this should maybe be a number". Different syntax would also allow actual enforced type checking to be added to JS in the future if that is wanted.

// e.g.
function foo(bar ~ number, baz ~ string?) ~ void {}

simonbuchan commented 2 years ago

@theScottyJam Ok, I can't respond to all this on a phone at midnight, so please forgive my selective, abbreviated reply!

The keyword situation is not actually that bad: there are quite a few unused reserved words already. Further, you don't actually need to have a unique keyword, just syntax that wasn't valid beforehand. Conveniently, "identifier identifier" is never valid JS, so you have free rein to introduce any keyword always immediately followed by an identifier, like type Foo. There's similar cases for most everything added by typescript, and the other JS extending languages: otherwise they wouldn't be able to add it because it already meant something!
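As a quick way to poke at the "identifier identifier" claim, here is a small sketch that asks the engine itself whether a snippet parses. The helper name `parsesAsFunctionBody` is made up for this example, and it assumes a runtime where `new Function` is permitted (it only parses the source, it never runs it). The source is parsed as a sloppy-mode function body.

```typescript
// Hypothetical helper: true if `source` parses as a (sloppy-mode) function body.
function parsesAsFunctionBody(source: string): boolean {
  try {
    new Function(source); // compiles the body without executing it
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Two identifiers on the same line are a syntax error today...
const sameLine = parsesAsFunctionBody("type Foo"); // false
const declaration = parsesAsFunctionBody("interface Foo {}"); // false

// ...but with a line break in between, automatic semicolon insertion turns
// them into two expression statements, so this already parses.
const splitLines = parsesAsFunctionBody("type\nFoo"); // true
```

The last case is why a keyword-style carve-out would need a "no line break between the keyword and the identifier" restriction in its grammar.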

Yes, technically these carve-outs would mean it couldn't be used for something else. But seriously, what is interface Foo {} ever going to mean if it's not a type declaration? It sucks when a bad early decision means you can't add a feature, but that doesn't mean never add anything! Further, ECMA would already very strongly avoid colliding with existing syntax in Typescript at minimum, even if it technically wouldn't break deployed content it would be mean to make all the Typescript users churn for no reason (to a lesser extent, Flow as well)

You have a point with what I'll call the "80% coverage" issue, your example being access modifiers. I actually think this specific case is pretty unimportant: they are actually already reserved, so it would not be a big deal to simply spec them as ignored in some context if needed, and also they are not really that critical given we have real privates now. But the general issue is still relevant, and I think it will mostly be that this proposal will just have to add the difficult to avoid cases somehow: I'm guessing some version of declare myself.

Could you give an example of the proposed syntax with some of typescript's more exciting syntax? Such as type ternary expressions using extends tests? I feel at some point you have to give up and just bracket the whole thing... which might be what this ends up proposing for weird cases like that.

You say you have an issue with the size of this feature: but really it's likely to be tiny in terms of spec impact from what I can tell; add some BNF rules, done. Plenty of tiny features to a user can easily end up with dozens of pages of spec for the abstract machine semantics!

@Nixinova "someone might not understand this" and its close friend "someone might misuse this" are both incredibly lame objections: they can be leveled at literally everything. You need to show that it is sufficiently likely to cause an actual problem, and that there is a solution that doesn't make things worse.

Someone thinking that type annotations will be checked is a one time, minimal cost problem that can't really be fixed by different syntax.

ljharb commented 2 years ago

@simonbuchan interface Foo would mean https://github.com/tc39/proposal-first-class-protocols, if it hasn’t already changed to “protocol” to avoid conceptually colliding with TS. It’s best not to underestimate the potential future uses of syntax.

matthew-dean commented 2 years ago

@theScottyJam So, it occurred to me early this morning that I should probably explain why a simple character sequence like :: would actually work, universally, in these cases, and why you don't need to pair @ with @@ in your example, or :: with :| or :::.

You, in fact, nearly got there with this statement:

We can also add a "@@" syntax, that will cause everything to the end of the line to be ignored, plus, if any opening brackets were found in that line (via "(", "[", "{", or "<"), further content will continue to be ignored until a closing bracket is found.

You don't actually need @@ though to apply special rules.

Say you have this annotation format.

Say an annotation starts with ::. It ends by:

encountering a line ending \n
encountering a statement ending ;
encountering a list separator ,
encountering a JavaScript block end ) or }

The last line is very much like CSS custom properties. Those properties can end automatically with a ; or }, like regular CSS properties, but they don't necessarily end just because ; or } was encountered, because they are block-aware.

Annotations, like CSS custom properties, would have the concept of blocks, which carves out an exception to the above rules. Blocks in this case would be < >, ( ), { } (I'm not sure [ ] should apply). When an annotation would encounter a top-level block start, it would continue until it has closed all matching blocks.

So, take a multi-line interface:

::interface Person {
    name::string;
    age::number;
}

You don't need anything special here. The annotation contains a top level block, starting with { and it cannot close until it encounters }.

Similarly, you can have multi-line type assignments by wrapping them in parentheses. So, if you have this in TypeScript:

type Animal = Cat
                 | Dog

You could easily manage this in an annotation format like:

::type Animal = (Cat
                 | Dog)

You would have to get clever about some cases in TypeScript, such as return types on functions, because this logic would fail here:

function addChild::<T>(a::Node)::T { /* */ }

The reason it fails is because of ::T at the end of the function, since it's followed by {, which would be considered part of the annotation by the above rules. (@theScottyJam Which is maybe why you leaned towards a different sequence? There are trade-offs to each!)

You would instead need something like:

function addChild::<T>(a::Node)::(T) { /* */ }

The point is you can get very clever with parsing rules, as other languages have demonstrated.
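The ending rules above can be sketched in TypeScript as follows. This is a rough illustration, not a real parser: the names `annotationEnd` and `stripAnnotations` are invented here, an annotation is assumed to end immediately after its first top-level block closes (which is what makes the `::(T) { ... }` form work), `[ ]` is not treated as a block, and strings and ordinary comments are not recognised.

```typescript
// Hypothetical sketch of block-aware "::" annotations.

function closingFor(ch: string): string | undefined {
  switch (ch) {
    case "(": return ")";
    case "{": return "}";
    case "<": return ">";
    default: return undefined;
  }
}

// Characters that end an annotation when no block is open.
const TERMINATORS = "\n;,)}";

// `start` points at the "::". Returns the index just past the annotation.
function annotationEnd(src: string, start: number): number {
  const stack: string[] = [];
  for (let i = start + 2; i < src.length; i++) {
    const ch = src.charAt(i);
    const closer = closingFor(ch);
    if (closer !== undefined) {
      stack.push(closer); // block start, top-level or nested
    } else if (stack.length > 0) {
      if (ch === stack[stack.length - 1]) {
        stack.pop();
        if (stack.length === 0) return i + 1; // top-level block closed
      }
    } else if (TERMINATORS.indexOf(ch) !== -1) {
      return i; // the terminator itself stays in the output
    }
  }
  return src.length;
}

function stripAnnotations(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    if (src.slice(i, i + 2) === "::") {
      i = annotationEnd(src, i);
    } else {
      out += src.charAt(i);
      i++;
    }
  }
  return out;
}

// The return type without parentheses swallows the function body:
const broken = stripAnnotations("function addChild::<T>(a::Node)::T { /* */ }");
// broken === "function addChild(a)"

// Wrapping the return type in parentheses keeps the body:
const fixed = stripAnnotations("function addChild::<T>(a::Node)::(T) { /* */ }");
// fixed === "function addChild(a) { /* */ }"
```

Under the same assumptions, the multi-line `::interface Person { ... }` example is removed as a single annotation, because its top-level `{` keeps the annotation open across the line breaks and the `;` characters inside it.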

Few more points here:

Obviously :: is just an example. You just need some unique sequence that wouldn't be recognized as valid JavaScript. (@simonbuchan I understand identifier identifier is not valid and therefore could also be used, but then when does an IDE flag an error? It's not that developer friendly to start throwing a lot of invalid-to-JS syntax. IMO, the proposal should pick one new construct.)
There are probably some edge cases I haven't thought of. It's before 6 am and this isn't an actual proposal.
One downside to this is that certain parsing strategies simply won't be able to handle this. So you could have a perfectly working JavaScript parser today that could not be (easily?) adapted to be block-aware. For example, in CSS custom properties, you can have open blocks within your top-level block which never get closed, and that's considered okay. I'm not sure you'd want to be that permissive here (and it's a reason why many, many CSS parsers cannot parse all valid custom property values). Another way to say it: you can write a parser that detects JS comments via regex. But you can't regex block-aware annotations.
IMO, even with #3 being true, it's still easier / more straightforward to write clever parsing rules for essentially one piece of syntax, then what the "types as comments" proposal is doing which is adding special parsing rules to many pieces of syntax to flag/parse/interpret them as annotations. To me, that's a non-starter for the JavaScript language. Parsers should not be treating special word cases like type and interface as comments (or any generic identifier identifier), along with JS-like constructs like :string. I feel this proposal is just far, far too broad in its scope, and "special cases" a lot of TypeScript syntax, without any clear rules / internal logic about why other than "because it's in TypeScript". To me, that's not a sign of a solid language proposal. It makes sense to TypeScript, sure, but it doesn't make sense in a JavaScript language spec.
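
To make the regex-versus-block-aware contrast concrete, here is a minimal TypeScript sketch of the bracket matching that a block-aware annotation would need. The helper name skipBalancedGroup is hypothetical, and the sketch deliberately ignores string literals, template literals, and operators such as => or a comparison <, all of which a real parser would have to handle.

```typescript
// Hypothetical helper, not part of any proposal text.
// Given the index of an opening bracket in `src`, return the index just past
// its matching closer, or -1 if the group is unbalanced or mismatched.
function skipBalancedGroup(src: string, start: number): number {
  const openers = "([{<";
  const closers = ")]}>";
  const first = src.charAt(start);
  if (first === "" || openers.indexOf(first) === -1) return -1;

  // A stack of expected closers: this is exactly the unbounded nesting state
  // that a single regular expression cannot track.
  const expected: string[] = [];
  for (let i = start; i < src.length; i++) {
    const ch = src.charAt(i);
    const openIdx = openers.indexOf(ch);
    if (openIdx !== -1) {
      expected.push(closers.charAt(openIdx));
    } else if (closers.indexOf(ch) !== -1) {
      if (expected.length === 0 || expected[expected.length - 1] !== ch) {
        return -1; // closer does not match the most recent opener
      }
      expected.pop();
      if (expected.length === 0) return i + 1; // outermost group closed
    }
  }
  return -1; // ran out of input with groups still open
}
```

A plain line or block comment needs no such stack, which is why it can be found with a regex, while anything that allows nested groupings cannot.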

Another point. Someone could look at:

function myFunction::<T>(a::?string)::(T) { /* */ }

and say, "but it's too noisy". I'd point out that it's still way less verbose than JSDoc, and at least it has clear, simple rules, and if you want noise-less TypeScript, you can use TypeScript! 😄

matthew-dean commented 2 years ago

@theScottyJam

Just to spitball with your original code, using a single character sequence to see if this logic checks out:

let x::string;

function equals(x::number, y::number)::(boolean) {
    return x === y;
}

::interface Person {
    name::string;
    age::number;
}

::type CoolBool = boolean;

class MyClass {
  name::number;
}

// More complex types DON'T need parens because of annotation closing rules
function fn(value::number | string) { ... }

// optional parameters
function fn(value::?number) { ... }

// import/export types
::export type Person { ... }

::import type { Person } from "schema";

// hmm....
import { ::type: Person, aValue } from "...";

// type assertions
const point = JSON.parse(serializedPoint) ::as { x::number, y::number };

// Non-nullable assertions -- oooo this is a hard one
document.getElementById("entry")::(!).innerText = "...";

// Generics
function foo::<T>(x::T) { ... }

// "this" param -- so.... this is somewhat ambiguous... like the import example, the comma ends up ending the annotation and would then be an invalid parameter block, so that might need some special definition of rules?
function sum(::this: SomeType, x::number, y::number) { ... }

// Ambient Declarations
::declare let x::string;

::declare class Foo {
    bar(x::number)::void;
}

// Function overloading... err, also tricky? Block rules would get messy here, unless we do:
::(function foo(x::number)::number)
::(function foo(x::string)::string)
function foo(x::string | number)::(string | number) {
    // ...
}

// Class and Field Modifiers
// So... a note here, assignment requires you to wrap the type, according to the above block-level rules, so hmm.... again, trade-offs
class MyClass {
  ::(protected readonly) x::(number) = 2;
}

// Allowing someone to "implement" an interface
class MyClass ::(implements MyInterface) { ... }

matthew-dean commented 2 years ago

@theScottyJam I guess another way to do this is similar to what I did in another CSS pre-processing language, which is where it looks like you were leaning with single identifiers, but apply block level rules, like the following, using a # (again, not a serious proposal, just an example):

// I don't have to wrap the type this time, because it only does single identifiers or single blocks
let x#string = 'foo';

function equals(x#number, y#number)#boolean {
    return x === y;
}

// Just wrap the thing longer than a single identifier / block, but fugly?
#(interface Person {
    name#string;
    age#number;
})

simonbuchan commented 2 years ago

@ljharb a good example, but I think it's illustrative that I sincerely think protocol is a better name and syntax for that than interface, even in the universe where no JS+types language exists: JavaScript already has the convention of referring to "protocols" not "interfaces", it still has the ecosystem conflict with webidl interfaces (which are very close to Typescript interfaces), and the existing uses of "interface" in other languages don't behave at all like protocols (I suspect the equivalent features are called typeclass, trait etc, in part specifically to avoid intuitions about interfaces). That said, I expect you were at the TC39 presentation of that proposal - was the keyword issue raised at all?

I do think JavaScript could add a more "interface-ey" use of interface. But I can't really see that not being basically #45 - which is a completely legitimate issue!

@matthew-dean I think you misunderstood what I meant with "identifier identifier"? Or at least the implication. I was saying it's already disallowed by the grammar, so the language can safely (if not trivially) add any meaning it wants to anything that matches that. In particular: "my_hot_new_not_quite_keyword identifier". Editors would be in the same boat they always are with any syntax extension: they can't parse it until they can.

theScottyJam commented 2 years ago

@simonbuchan

By the way, thanks for taking time to engage with me on this, and to help discuss the pros and cons. I think we're both starting to understand each other's points of view here, and what the pros/cons are; we are just giving different importance to these different pieces, which is bringing us to different conclusions.

There's similar cases for most everything added by typescript, and the other JS extending languages: otherwise they wouldn't be able to add it because it already meant something!

This is a fair point. I see some discussions about how X can't be done, because it would conflict with TypeScript syntax. So I think the core features of TypeScript will always be carved out and unavailable for use. Though, I also see the occasional comment about how "TypeScript shouldn't have used the X keyword reserved by JavaScript, we're going to use that keyword and let TypeScript deal with the consequences".

Conveniently, "identifier identifier" is never valid JS, so you have free reign to introduce any keyword always immediately followed by an identifier, like type Foo.

This is true, but can also make syntax extra annoying. Just like you don't particularly like the @@ or :: in front of the interface keyword, I can also see people not particularly keen on the idea of using two-word keywords, especially with syntax that has a strong need to be concise. But, yes, this syntax space will always be available.

Could you give an example of the proposed syntax with some of typescript's more exciting syntax? Such as type ternary expressions using extends tests? I feel at some point you have to give up and just bracket the whole thing... which might be what this ends up proposing for weird cases like that.

When it comes to the syntax of the types themselves, what's being proposed here doesn't offer any more flexibility than the current proposal. Eventually, yeah, you would have to just bracket the more complicated types (though I think @matthew-dean is onto an idea that can help avoid that). What I'm thinking about more is the non-type related syntax. You already mentioned the "declare" syntax which they're currently not planning on including in this proposal, and it might not ever get included - you may be forced to use js-docs for those.

Let me also mention some possible future expansions that a type-checker could choose to take in at any point.

// A new @frozen "keyword" that indicates this class's instances
// are meant to be frozen.
@frozen class MyClass {
  ...
}

// C++ has a concept of choosing to inherit, but making all inherited
// properties protected or private. I've never seen another language
// take up this idea, but hey, if someone ever wanted to, they can.
class MyClass @privateExtends: BaseClass { .. }

// Maybe one particular type-checker decides to make it possible to
// put a function's type on the line before where the function is
// defined. I've seen some languages do this, and I love it,
// as it really declutters the declaration.
@@function(@string, @{ x @int, y @int }) @void
function myFunction(name, opts) { ... }

These are the sorts of innovations a type-checker could choose to try out and implement at any arbitrary point in time. TC39 might not be so keen on adding syntax support for functionality like this, as it might still be too controversial for TC39 to bring it into the language as a permanent thing, even though it's considered stable enough for whatever type-checker is wanting to bring it in.

Though, I think you already understand this idea well, with how you dubbed this sort of thing "80% coverage".

theScottyJam commented 2 years ago

@matthew-dean, I think you're onto an interesting idea here. I like how your syntax idea cleanly helps so you don't have to use parentheses in complex types (like x | y), and how it helps with optional parameters.

It does, unfortunately, make a handful of syntax choices a bit grosser. So perhaps we ought to bring back a second token with different rules to help with those scenarios (though, at the same time, it does look much nicer when it's only a single token).

It does also suffer from these issues:

// These are pretty sweet
function fn(value::number | string) { ... }
function fn(value::?number) { ... }

// But it doesn't work as nicely here:
let x::(number | string) = fn()
let x::(?number) = fn()

// This works ok
const point = JSON.parse(serializedPoint) ::as { x::number, y::number };
// But this isn't quite as nice
const point = JSON.parse(serializedPoint) ::as(number) + getNumb();

// In this example, wouldn't everything after the closing "}" be considered not-a-comment?
::import type { Person } from "schema";
// If not, then I'm not sure I understand how the return-type syntax works:
function fn(x::string)::(string) { ... }

simonbuchan commented 2 years ago

I can also see people not particularly keen on the idea of using two-word keywords,

Not what I meant, see my ninja reply above (it seems I was not very clear!)

In short, only one keyword, but only if it has to be followed by an identifier.

matthew-dean commented 2 years ago

@theScottyJam

In this example, wouldn't everything after the closing "}" be considered not-a-comment?

Yep! That's a mistake. Which is why I gave the caveat that I was writing it very early before being caffeinated lol. You're exactly right that there are trade-offs with trying to do block-level parsing, and even issues with terminating the annotation immediately after a block end character, such as the } in the ::import.

It's kind of a tricky problem, so it does need some thought. What I was hoping to illustrate / say is I generally agree with you that this is the right type of thinking e.g. to create a brief comment/annotation system & syntax rather than a broad set of syntax cues for turning things into comments / annotations.

@simonbuchan

I think you misunderstood what I meant with "identifier identifier"? Or at least the implication. I was saying it's already disallowed by the grammar, so the language can safely (if not trivially) add any meaning it wants to anything that matches that.

I still don't really follow. Yes, the language can safely add meaning but... (and maybe this is the part I'm misunderstanding), I wouldn't want a language that just treats any arbitrary sequence of identifiers as valid? I just feel like that becomes really hard to flag what was intentional and what's an input error. Maybe you could put in more examples to illustrate that it's not arbitrary?

@theScottyJam Maybe there's something to Microsoft combining token-level annotation syntax with meaningful comment syntax, like:

Say we adopted your rules where you can have an identifier or block (and a few other characters like ? and !?) following :: (I'm still assuming @ is a non-starter because decorators.), but it doesn't "continue". So then you get syntax sorta like:

let x::string = 'foo';

function equals(x::number, y::number)::boolean {
    return x === y;
}

// Just adopt a particular prefix in a regular comment? TypeScript could do this today
/*@
  interface Person {
    name::string;
    age::number;
  }
*/

let x::(number | string) = fn()
let x::?number = fn()

const point = JSON.parse(serializedPoint)::(as { x::number, y::number });
// How I would treat this:
const point = JSON.parse(serializedPoint)::(as number) + getNumb();

/*@ import type { Person } from "schema"; */
function fn(x::string)::string { ... }

// Non-nullable assertions (if you disallowed the period at the "root-level", it would work?)
document.getElementById("entry")::!.innerText = "...";

// Function overloading -- more straightforward
/*@
type StringOrNumber = string | number

function foo(x::number)::number
function foo(x::string)::string
*/
function foo(x::StringOrNumber)::StringOrNumber {
    // ...
}

So, this way, you're still adding a micro-annotation syntax, but it's much more conservative. And TypeScript already does to some degree with "special comments" like // @ts-check.

So then you could put your large TypeScript-y "definition blocks" in special comment syntax that only that parser would interpret, which would be valid JS comments today. And then you are simply adding a micro-annotation syntax with some rules for auto-ending the annotation. That should cover most, if not all TS-y things in this proposal?
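
As a rough illustration of why the /*@ ... */ part of this idea needs no language change, here is a minimal TypeScript sketch of how a tool could pull those definition blocks out of ordinary source text. The function name extractTypeComments is hypothetical, and the sketch assumes the /*@ prefix never appears inside a string or template literal.

```typescript
// Hypothetical tool-side helper: collect the bodies of every /*@ ... */ comment.
// To a JavaScript engine these are already plain block comments.
function extractTypeComments(source: string): string[] {
  // Lazy match so each comment stops at its own closing */.
  const pattern = /\/\*@([\s\S]*?)\*\//g;
  const bodies: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(source);
  while (match !== null) {
    bodies.push((match[1] || "").trim());
    match = pattern.exec(source);
  }
  return bodies;
}

const sampleSource = [
  "/*@",
  "  interface Person { name: string; age: number; }",
  "*/",
  "/*@ import type { Person } from \"schema\"; */",
  "const p = { name: \"Ann\", age: 3 };",
].join("\n");

console.log(extractTypeComments(sampleSource));
```

Only the inline :: annotations would then need new syntax; the multi-line definitions ride on comment syntax that already exists.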

simonbuchan commented 2 years ago

@matthew-dean ok, so there's no currently valid JS that contains ::, so you can add a meaning for that. Likewise, there's no valid JS that has type Foo, so you can add a meaning to that. And the same for other sequence that would currently parse as two identifiers. The point is the keyword issue is really not that big a problem.

matthew-dean commented 2 years ago

@simonbuchan I get that, but a sequence like :: is very different from a sequence like type Foo, both semantically and pragmatically. You're comparing two specific points in Unicode, representing a generic start of a sequence vs a sequence of tokens of indeterminate length. In parsing terms, if you said, "any two sequential identifiers makes the first character of the identifier the start of a comment", then you don't know what type is until you get to an indeterminate number of white-space characters and then later encounter Foo, at which point type could be assigned / grouped to a different expression, retro-actively.

If, alternatively, you're saying, "no, not any arbitrary identifiers, but specifically 'type'," then you're asking less of the parser, but you're asking much more of the language, to reserve another keyword and to have particular parsing semantics for it. (And this proposal asks for more than just type.) And even if you're okay with the "ask", this proposal claims it's sort of a "universal", when specifically asking for type is very much a TypeScript-specific ask (even though another language might also use type), for another language that is not TypeScript. So, if that was the ask, then as a language designer, I would reject that outright, unless it could be demonstrated that type or interface are words that are worth reserving in JavaScript as special user space keywords.

So either interpretation of type Foo is asking a lot: either of parsers and users, potentially impacting error-checking in IDEs, or of the spec, in a strangely specific way. Either impact is much greater than that of ::, even though, as you say, "there's no currently valid JS" for either. Or a simpler way to say it is that type Foo has side effects that :: does not have.

matthew-dean commented 2 years ago

We should note that the proposal actually says that this direction proposed in this thread may be the way to go:

There might be another direction in which this proposal expands comment syntax just to support modifiers like these, which lead with a specific sigil.

class Point {
  %public %readonly x: number

  @@public @@readonly x: number
}

I'm not particularly a fan of % or @@ for how noisy they are (let x@@number = 0?), but by the same logic, the above could be re-written as:

class Point {
  ::public ::readonly x::number
}

...which actually more accurately reflects a "specific sigil". (A single one vs both @@ and :)

I'll also note (not sure I made it clear before), that my use of :: is not arbitrary, and is also lifted from the proposal:

// Types as Comments - example syntax solution
add::<number>(4, 5)
new Point::<bigint>(4n, 5n)

So, I feel like the proposal authors had the right idea here for annotating types, but probably didn't realize the same argument could be made elsewhere or generalized to a more general and useful annotation syntax (beyond the narrow ::<> use case, which again, is a very, very specific ask, when the ask should be more generalized).

simonbuchan commented 2 years ago

@matthew-dean this ... just isn't an issue. Due to exactly the above sort of example, pretty much everyone tokenizes separately from parsing (and the people who don't are using even more expressive systems like PEG), and the parser is perfectly capable of asking if the identifier type is followed by another identifier, and if so treating it as if it were a keyword. It wouldn't even be the first case of identifiers being keywords based on context: yield and await are keywords depending on their containing function declaration; async is a keyword only when followed by function or (; async, get and set are keywords in class bodies or object literals when followed by identifiers; not to mention every keyword is an identifier in a property context (object literal or property access).

jethrolarson commented 2 years ago

👏 I really like this proposal! It really separates the details of the type-system which would not be part of JS, and grants a place for an arbitrary type-system to exist on-top of JS, whether that's TS, Flow, or some future thing(like my proposal :wink: #84 ).

In particular using something like @@(whatever type stuff) to intersperse complex types in whatever way the 3rd party wants is cool.

Only thing this is missing is some kind of pragma to say what system is being used. Maybe just some convention like

@@(using typescript)

or similar is enough. Though with bundling and such, where does such designation end?

I think the specifics of what digraph to use (::, @@) may just be a matter of taste and running a poll or something could decide.

simonbuchan commented 2 years ago

I think the specifics of what digraph to use (::, @@) may just be a matter of taste and running a poll or something could decide.

Here's my real problem with this sort of suggestion: how do you feel about /*@ and */?

theScottyJam commented 2 years ago

@jethrolarson

Only thing this is missing is some kind of pragma to say what system is being used. Maybe just some convention like

That sort of thing is actually being discussed in issue #36

(like my proposal :wink: https://github.com/giltayar/proposal-types-as-comments/issues/84 ).

I actually recently noticed that idea, and would love it if future type systems were to go in that direction (so, my hope is we can come up with an annotation system that's flexible enough to support that style of typing as well). I hate how cluttered syntax gets when types are intermixed with actual code. We would still need some sort of way to flexibly handle things like "implements", "as", etc though.

I do sometimes prefer putting the type annotation for a single variable or a property on the same line as the property itself, but I don't have a strong preference here.

@simonbuchan

Here's my real problem with this sort of suggestion: how do you feel about /*@ and */?

I'm fine with it. It's a little awkward to use, because:

// Those /* and */ take up a lot of extra vertical space, especially when dealing with smaller interfaces.
/*@
interface MyInterface {
  ...
}
*/

// Theoretically, you could try removing some whitespace like this, but that feels awkward as well.
/*@interface MyInterface {
  ...
}*/

So, the most difficult part of the /*@ ... */ suggestion is the fact that you need the closing */, which makes it difficult to make the syntax more concise, compared to a simple prefix of something like ::. So, my preference would be to have some sort of syntactic prefix for this sort of thing, but it's not a strong concern for me.

I think the most important thing is to find a way to provide type syntax and/or operators inline, in the middle of expressions, so you can use things like as, implements, etc. I'm ok with the syntax being a bit heavier for big chunks of type-only logic, even if it might mean using block comments to handle it, as long as we can find something suitable for the smaller, in-line chunks that isn't overly heavy to use.

jethrolarson commented 2 years ago

@simonbuchan It's nice that there is a convention, but having an annotation standard that is terse would also be nice.

I did a quick and dirty branch which removes all the typescript/flow specific requirements and specifies a pragma https://github.com/jethrolarson/proposal-types-as-comments/blob/generic_annotations/README.md

matthew-dean I realize in post that this is almost exactly the same as yours but you define more cases :doh:

simonbuchan commented 2 years ago

So, the most difficult part of the /*@ ... */ suggestion is the fact that you need the closing */, which makes it difficult to make the syntax more concise, compared to a simple prefix of something like ::. So, my preference would be to have some sort of syntactic prefix for this sort of thing, but it's not a strong concern for me.

Well here's the wonderful thing: you can also already use //@!

Here's the point: despite the proposal name, adding a new comment syntax is not really that useful. The value is that you have a wide syntax space you can squeeze a type system into with it feeling natural, especially the one that already has a huge amount of already existing code.

So far, I'm not seeing anything that would actually get used by anybody: if they were we would already see comment driven type systems (that weren't basically an upgrade path to Typescript)

matthew-dean commented 2 years ago

@simonbuchan I guess it depends on how you define "an issue". 🤷‍♂️ I'm not saying you can't parse it. Of course you can. But you made a separate assertion, which is that type Foo is the same ask of the language as :: and it just... isn't? And it does look like you're saying now that type and interface should be keywords reserved in JavaScript just for use by TypeScript, and... yeah, there are other reserved keywords that continue to go unused by JS, so one could argue, what's another two? But it just feels a little "staple-y" of TypeScript syntax onto JavaScript.

matthew-dean commented 2 years ago

So far, I'm not seeing anything that would actually get used by anybody: if they were we would already see comment driven type systems (that weren't basically an upgrade path to Typescript)

It's a fair criticism that the /*@ */ pairing is awkward. That was just an example to how you could handle multi-line pieces without so much syntax injection.

I guess, on the flip side, what I'm not seeing is any articulation of why JavaScript should add this; that is, why are all the syntax additions in the original proposal a benefit to the JavaScript ecosystem, for people not using TypeScript or another type system? Small, generic, annotation syntax has some documentation benefit. Specific TypeScript features / keywords, in my view, hold no benefit except to TypeScript. And shouldn't the burden of proof of benefit be on the proposal?

matthew-dean commented 2 years ago

@theScottyJam

So, the most difficult part of the /*@ ... */ suggestion is the fact that you need the closing */, which makes it difficult to make the syntax more concise, compared to a simple prefix of something like ::. So, my preference would be to have some sort of syntactic prefix for this sort of thing, but it's not a strong concern for me.

Yeah, meaning within block comments is awkward. And @simonbuchan makes an interesting argument that if this would have worked, TS or someone else would have done this already? So maybe there really does need to be two additional forms of "ignore this" syntax?

Again, more spit-balling

#define type Animal = Dog | Cat
#define interface Foo {
  a: string;
  b: string;
}

// there are soooo many specs clamoring for # and @ in JS-land right now lol, but... ?
#type Animal = Dog | Cat
#interface Foo {
  a: string;
  b: string;
}

// hmm, maybe this isn't so bad? It starts to look unwieldy though
:::type Animal = Dog | Cat
:::interface Foo {
  a: string;
  b: string;
}

I think the most important thing is to find a way to provide type syntax and/or operators inline, in the middle of expressions, so you can use things like as, implements, etc. I'm ok with the syntax being a bit heavier for big chunks of type-only logic, even if it might mean using block comments to handle it, as long as we can find something suitable for the smaller, in-line chunks that isn't overly heavy to use.

Yeah, I like :: for inline, but it's somewhat awkward for the multi-line statements. It could be that there's a solution by re-structuring how certain things in TypeScript are defined, or where they're positioned. (But then, would TS people switch to TS-y JS? And maybe that's the bind the proposal authors found -- the more you deviate from existing TypeScript, the steeper the learning curve between the two and the less chance people would adopt it as a TS alternative.)

matthew-dean commented 2 years ago

https://dotnetcrunch.in/comments-in-different-programming-languages/

Not a lot of variation here, but of note, PHP uses 3 comment forms, so it isn't completely unusual to have more than just 2 🤷‍♂️

simonbuchan commented 2 years ago

which is that type Foo is the same ask of the language as :: and it just... isn't?

In the sense that at the moment :: would be parsed as two colon tokens, and give a parser error, and type Foo would currently be parsed as two identifier tokens and give a parser error, it is in fact pretty much identical?

Have you actually written parsers before? Like, more than a flex / yacc toy language back in college? They are not deep dark magic, this is roughly just an extra branch along the lines of if (id.text === "type" && peek().type === "identifier") parseType();. It's far more trivial than lots of existing syntax in javascript (as described), so I'm just confused as to why you think this is an issue at all.
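
For readers who want to see that branch in context, here is a small self-contained TypeScript sketch of a one-token lookahead check. The names Tok, tokenizeSketch, and isContextualTypeKeyword are hypothetical, the tokenizer is deliberately crude, and the sketch ignores line terminators; a real grammar would likely need a "no line break here" restriction, because with automatic semicolon insertion an identifier on one line followed by an identifier on the next is already valid JavaScript.

```typescript
// Hypothetical token shape for illustration only.
type Tok = { kind: "identifier" | "punct"; text: string };

// Crude tokenizer: identifier-like runs, or any single non-space character.
function tokenizeSketch(src: string): Tok[] {
  const out: Tok[] = [];
  const re = /([A-Za-z_$][A-Za-z0-9_$]*)|(\S)/g;
  let m: RegExpExecArray | null = re.exec(src);
  while (m !== null) {
    if (m[1] !== undefined) {
      out.push({ kind: "identifier", text: m[1] });
    } else {
      out.push({ kind: "punct", text: m[2] || "" });
    }
    m = re.exec(src);
  }
  return out;
}

// The "extra branch": `type` acts as a keyword only when the next token is
// also an identifier. Otherwise it stays an ordinary identifier.
function isContextualTypeKeyword(tokens: Tok[], pos: number): boolean {
  const current = tokens[pos];
  const next = tokens[pos + 1];
  return (
    current !== undefined &&
    next !== undefined &&
    current.kind === "identifier" &&
    current.text === "type" &&
    next.kind === "identifier"
  );
}

console.log(isContextualTypeKeyword(tokenizeSketch("type Foo = string"), 0));
console.log(isContextualTypeKeyword(tokenizeSketch("type = 5"), 0));
```

The check costs one token of lookahead, which is the same mechanism parsers already use for contextual words such as async.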

And it does look like you're saying now that type and interface should be keywords reserved in JavaScript just for use by TypeScript...

Uh, no? It's for them to be ignored, so that typescript and flow and any other future JS type systems can fit themselves in there. This is essentially already the case for typescript, as ECMA will want to avoid screwing over the large existing user base, so if you spec out what is actually reserved explicitly, language authors (typescript and others) can be confident they won't get eaten by some future JS change and can be a lot more confident about what they add as syntax.

Not a lot of variation here, but of note, PHP uses 3 comment forms, so it isn't completely unusual to have more than just 2 🤷‍♂️

The point is not "how many comment forms" - the point is /*@ */ and //@ with the semantics of being ignored are already in the language. You could have used them for a type system for the last 25 years.

The value is not adding more comments. It's making type systems less sucky to use. So far, this looks way more sucky to use than a transpiler, and I would never use it, and I doubt anyone else would either.

theScottyJam commented 2 years ago

Ok, I've had some time to think about this some more, and I think I've found some ways to remove some of the ugliness of this proposal.

I'm going to introduce three types of inline comments.

1. If a line starts with :: (or maybe a single :, we can decide later), then everything after it to the end of the line will be ignored, along with any groupings that may have opened in that line (This is the same as the original @@).
2. Immediately after a binding, you can place a : and then some type information. This will basically follow the same logic that the main proposal's README is going for. Whatever rules the main proposal decides to adopt for the : syntax, we can also adopt in this simplified proposal (so, we shouldn't have to worry about using parentheses to group together more complicated types like union types, unless the main proposal ends up struggling with this as well). As a bonus, the final syntax will end up looking a lot more like TypeScript, which will preserve some backwards compatibility (which is something I know @simonbuchan would be a fan of).
3. We can use the \ character, like @lmcarreiro suggested, for commenting out a single token (or a grouping, if the token being commented out is an opening bracket). This is the same as the original @ token.

Point 2 should handle most of the complexity related to inline-types, because we're adding explicit support for it, while points 1 and 3 will provide flexibility to add everything else a type-system might ever wish to add.
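
To check that the first and third rules are mechanically simple, here is a rough TypeScript sketch of a stripper for them. It is only an illustration under stated assumptions: all function names are hypothetical, whitespace after a \ is skipped before the ignored token (inferred from the examples in this thread), the inline : rule is not handled at all because it defers to the main proposal, and string literals, regex literals, and operators like => are not treated specially.

```typescript
const GROUP_OPENERS = "([{<";
const GROUP_CLOSERS = ")]}>";

// Index just past the group opened at `start` (src.length if never closed).
function endOfGroup(src: string, start: number): number {
  let depth = 0;
  for (let i = start; i < src.length; i++) {
    const ch = src.charAt(i);
    if (GROUP_OPENERS.indexOf(ch) !== -1) depth++;
    else if (GROUP_CLOSERS.indexOf(ch) !== -1) {
      depth--;
      if (depth === 0) return i + 1;
    }
  }
  return src.length;
}

// "\" rule: ignore one token, or one whole grouping if it starts with a bracket.
function skipToken(src: string, from: number): number {
  let i = from;
  while (src.charAt(i) === " " || src.charAt(i) === "\t") i++;
  const ch = src.charAt(i);
  if (ch === "") return i; // end of input
  if (GROUP_OPENERS.indexOf(ch) !== -1) return endOfGroup(src, i);
  if (/[A-Za-z0-9_$]/.test(ch)) {
    while (i < src.length && /[A-Za-z0-9_$]/.test(src.charAt(i))) i++;
    return i;
  }
  return i + 1; // single punctuation character such as "!"
}

// "::" rule: ignore to end of line, continuing past it only while a grouping
// opened on that line is still open.
function skipLineComment(src: string, from: number): number {
  let depth = 0;
  let pastFirstLine = false;
  for (let i = from; i < src.length; i++) {
    const ch = src.charAt(i);
    if (ch === "\n") {
      if (depth === 0) return i; // keep the newline itself
      pastFirstLine = true;
    } else if (GROUP_OPENERS.indexOf(ch) !== -1) {
      depth++;
    } else if (GROUP_CLOSERS.indexOf(ch) !== -1 && depth > 0) {
      depth--;
      if (depth === 0 && pastFirstLine) return i + 1;
    }
  }
  return src.length;
}

function stripAnnotationsSketch(src: string): string {
  let out = "";
  let i = 0;
  let lineStart = true; // true until a non-blank character is seen on the line
  while (i < src.length) {
    const ch = src.charAt(i);
    if (lineStart && ch === ":" && src.charAt(i + 1) === ":") {
      i = skipLineComment(src, i + 2);
      lineStart = false;
    } else if (ch === "\\") {
      i = skipToken(src, i + 1);
      lineStart = false;
    } else {
      out += ch;
      if (ch === "\n") lineStart = true;
      else if (ch !== " " && ch !== "\t") lineStart = false;
      i++;
    }
  }
  return out;
}

console.log(stripAnnotationsSketch("class MyClass \\implements\\ MyInterface { }"));
```

Everything the sketch removes is decided without knowing anything about the type system, which is the flexibility these two rules are meant to provide.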

With this rule-set in mind, let's look at a rewrite of the original code snippet.

let x: string;

function equals(x: number, y: number): boolean {
    return x === y;
}

::interface Person {
    name: string;
    age: number;
}

::type CoolBool = boolean;

class MyClass {
  name: number;
}

function fn(value: number | string) { ... }

// optional parameters
function fn(value:? number) { ... }

// import/export types
::export interface Person { ... }

::import { Person } from "schema";

// The first "\" causes "type" to be ignored, the second causes "Person" to be ignored.
import { \type\ Person, aValue } from "...";
// After the ignored content is removed, this line will look like this: import { , aValue } from "...";

// type assertions
const point = JSON.parse(serializedPoint) \as\ { x: number, y: number };

// Non-nullable assertions
document.getElementById("entry")\!.innerText = "...";

// Generics
function foo\<T>(x: T) { ... }

// "this" param
\this\SomeType function sum(x: number, y: number) { ... }

// Ambient Declarations (currently being considered to not be included to keep the proposal smaller, but now it can be trivially added)
::declare let x: string;

::declare class Foo {
    bar(x: number): void;
}

// Likewise, function overloading might not get added to this proposal as it currently stands, but it's trivial to add with this idea
::function foo(x: number): number;
::function foo(x: string): string;
function foo(x: string | number): string | number {
    ...
}

// Class and Field Modifiers (These are easy to add as well, if wanted)
class MyClass {
  \protected \readonly x: number = 2;
}

// Allowing someone to "implement" an interface
class MyClass \implements\ MyInterface { ... }

// Part of the reason I decided to go this route, is because I noticed how difficult it is to parse
// array (especially when they're readonly) and generic types with the old rules.
// Borrowing from the original proposal's rules makes this much nicer.
let x: readonly number[] = []
let y: Map<number, number> = new Map()

This is looking pretty good now :). The only extra tax we're receiving here is the :: at the beginning of larger chunks of type-related syntax (which isn't a big cost), and the \ we have to put around type-operators, which yes, looks a bit odd, but it's not the end of the world.

theScottyJam commented 2 years ago

As for this point:

The point is not "how many comment forms" - the point is /*@ */ and //@ with the semantics of being ignored are already in the language. You could have used them for a type system for the last 25 years. The value is not adding more comments. It's making type systems less sucky to use. So far, this looks way more sucky to use than a transpiler, and I would never use it, and I doubt anyone else would either.

It's all a gradient. As I mentioned previously, I think the greatest thing typed-variants of JavaScript have to offer is the ability to easily add inline types and operators. Super simple syntax for everything else is more of a nice-to-have than a need-to-have. Using /*@ ... */ syntax does a really poor job at supporting this main feature of inline-type-and-operators, even worse than what's being proposed here.

I'm going to compare different syntax options, from worst to best. The absolute worst (1) is using straight comments. Next (2) was what I had originally proposed in this thread. After that (3) is what I proposed in my last comment (which I think has some potential). Best (4) is current TypeScript syntax.

function myFunction(x /*@number*/, y /*@{ x: number }*/ = defaults) /*@number*/ { ... } // 1
function myFunction(x @number, y @{ x @number } = defaults) @number { ... } // 2
function myFunction(x: number, y: { x: number } = defaults): number { ... } // 3
function myFunction(x: number, y: { x: number } = defaults): number { ... } // 4

In this specific scenario, options 2 through 4 are equally verbose; option 2 just uses an @ character instead of : to begin the type-related logic. Option 1 is far more verbose and harder to read. (Of course, had I used optional parameters and whatnot in this example, option 2 would certainly lose some of its elegance, but it would still be ahead of option 1.) I think this is why a comment-only system hasn't gained much popularity in JavaScript today: JavaScript doesn't provide a non-verbose comment syntax that can do what we need it to do.

We can certainly do other comparisons as well to see how each of these syntaxes compare with each other. Again, you'll find the comment-based syntax is consistently the most verbose compared to the alternatives (except for the "declare" example, in which it performs about the same as the rest).

return fn() /*@as unknown*/; // 1
return fn() @as: unknown; // 2
return fn() \as\ unknown; // 3
return fn() as unknown; // 4

//@declare let x: number; // 1
@@declare let x: number; // 2
::declare let x: number; // 3
declare let x: number; // 4

// 1
/*@
interface MyInterface {
  ...
}
*/

// 2
@@interface MyInterface {
  ...
}

// 3
::interface MyInterface {
  ...
}

// 4
interface MyInterface {
  ...
}

I'm not sure yet how you feel about the latest iteration of this idea that I presented in my last comment, but note that the extra verbosity it introduces is very minor. There are only two things: sometimes you have to start a line with ::, and sometimes you have to put a \ next to a TypeScript operator. So, you type ::interface instead of interface, and you type \as\ instead of as. The extra verbosity here is extremely low. It's not zero, and perhaps it looks a little foreign and odd, and maybe it takes a bit to get used to the way it looks, but it does the job, and IMO it does it well. But if you think the extra :: and \ are still a feature killer, that's fine.

matthew-dean commented 2 years ago

@simonbuchan

Have you actually written parsers before? Like, more than a flex / yacc toy language back in college?

🙄

I'm just confused as to why you think this is an issue at all.

You're confused because you keep thinking I'm talking about parsing / parsing difficulty. I'm not.

And it does look like you're saying now that type and interface should be keywords reserved in JavaScript just for use by TypeScript

Uh, no? It's for them to be ignored, so that typescript and flow and any other future JS type systems can fit themselves in there.

That's a distinction without a difference. In order for those specific words / structures to be ignored, they need to be defined and reserved for ignoring purposes. There's no difference between "ignoring" and "reserving for ignoring". Once they are flagged to be ignored, they can no longer be used to one day have a specific meaning, making them reserved.

The value is not adding more comments. It's making type systems less sucky to use.

Again, these are semantic distinctions between what you are considering a "comment" and what you are considering "something ignorable". In terms of meaningfulness to the JS interpreter, there's no difference between "ignoring something because it starts with //" and "ignoring something because it starts with type" or "ignoring something because it starts with : after a parameter var". In fact, this proposal is called "types as comments". So, the original proposal is proposing a large variety of new comment forms (essentially every ignorable keyword or ignorable by character because of position is a new type of comment -- in this proposal, a non-nullable assertion ! is a new type of comment i.e. a character sequence that's ignored), and within this thread, we're suggesting paring that back to one or two.

@theScottyJam

I think your latest iteration strikes a good balance. Although I still worry that a simple let x: number could easily trip up a newbie into thinking that number is somehow meaningful at runtime. I really worry about the confusion introduced to people learning JavaScript. ☹️

I don't know how I feel about \protected or \<T>. JavaScript already has the concept of escaping characters, and even though it exists primarily in strings, that seems like that could also cause confusion. I get it though -- there's not a lot of syntax space to work with here!

matthew-dean commented 2 years ago

From the proposal:

For these reasons, the goal of this proposal is to allow a very large subset of TypeScript syntax to appear as-is in JavaScript source files, interpreted as comments.

So, the number of comment forms in JavaScript would go from 2 to:

/* */
//
: string or ?: string
export/import? interface ...
export/import? type ...
type in import
as [x]
non-nullable assertions !
Generic declarations <T>
Generic invocations ::<T>
this parameters

So, this proposal, by its own definition, is asking to expand the number of comment forms in JavaScript from 2 to 11, with additional ones up for debate. 9 additional types of comments seems like a big ask for a language that currently has 2. How would you teach that in a JavaScript course, when you get to a section on comments? You would essentially be teaching them TypeScript at that point. Even if you don't teach them what those comment forms do in TypeScript, you would still have to mention that they need to ignore those pieces as it relates to how their programs / scripts work at runtime. That's an incredible addition to the learning curve of JavaScript.

11 types of comments just seems like too many (with even more on the table). 🤷‍♂️ The original proposal / ask by @theScottyJam to reduce this number to the most conservative amount is not an unreasonable ask IMO.

ckruppe commented 2 years ago

I know it's maybe a little bit off-topic, but I was always wondering why the TypeScript syntax is the way it is. The type annotation with : was always a thing that held me back big time from using it. So could someone maybe explain to me why no one settles on a well-known and widely used syntax like in Dart or C or C# and so on?

Why is something like this not an option?

String function stringsStringStrings(String p1, String? p2, String? p3, Number p4 = 5) {
    // TODO
}

I would like to understand what has driven the decision to use the current syntax. For me it always felt off in some weird way.

Thx and sorry for hijacking this discussion. :)

matthew-dean commented 2 years ago

@ckruppe You're not wrong! It's one of the things that's problematic about this proposal, although we haven't really addressed it. key: value is already a construct in JavaScript, so the "assignment" of a type via : is potentially confusing to newbies to JavaScript, whereas, you're right, it could have been inserted as more of a "flag" for a type, which would potentially be less confusing if repeated in this proposal, or make some variations on the TS form a little easier, like:

~string function stringsStringStrings(~string p1, ~string? p2, ~string? p3, ~number p4 = 5) {
    // TODO
}

I'm sure a language designer for TypeScript might be able to answer that better. My guess is it's a case of paving the cowpaths i.e. Flow already used the : type assignment form. So then you'd have to ask the Flow team why they chose that?

ckruppe commented 2 years ago

@matthew-dean thx for your explanation. I was always curious about this decision, but you may be right. It's plausible that it stems from Flow in this context.

I was wondering if there is maybe some language requirement that made it impossible or something. If not, I think this proposal would be the chance, in my opinion. But maybe the TypeScript familiarity is one of the selling points for this proposal, so I'm not sure. I personally would like to see a more "native" syntax for this proposal. :)

simonbuchan commented 2 years ago

@ckruppe Nearly every new language uses trailing types now. There's several reasons, parsing ambiguities that are even a problem for C, which also makes it really hard to add new kinds of types (try to imagine a leading union type, for example), but they also just kind of suck when the type gets really long. Further, most of the time types are inferred, so the fact that you're declaring a variable is much more relevant, and should be the default mindset, rather than what type the variable is. In many languages, the inferred type is often one you literally can't even write!

There are also TypeScript-specific reasons. One is that you should always see the JavaScript it will generate, so it should ideally only be removing tokens when it transpiles. In reverse you should be able to take JavaScript, make it TypeScript, and just add types where needed, which means the syntax should not require the types to be present, which is tricky in parameters, for example. Also, is it a var, let or const?

Basically, keep in mind that the Typescript team were taken from the designers of C#. Be sure that any differences from that have good reason.

simonbuchan commented 2 years ago

@matthew-dean I'm not asking as an ad hominem attack: you literally complained about the difficulty in parsing several times and it made no sense to me unless you were only used to the flex/yacc garbage. If this isn't about parsing, what did you even mean with all that talk about how it would be parsed? That feels like goalpost shifting. If you aren't familiar with parsers, then that's fine, it just means I need to explain more.

The reason I distinguished these from comments is that comments are tokens. You see some character sequence and you ignore everything up to some other character sequence, and give that as a token to the parser. The difference between a string and a comment is that the parser ignores comments! These syntax extensions are defined on sequences of existing tokens (for the most part), that are currently invalid. So while it's fine at a proposal level to talk about how these are treated as comments, that's talking about semantics, that is, that they have none (to the runtime). When you're talking about actually parsing, and therefore what syntax it should have, there's a huge difference. Primarily, how do you know you've stopped parsing a type! The detail of that is hugely important. Far more than trying to not assign type or : a meaning.
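To make the distinction concrete, here is a minimal sketch of the lexer-level view of comments described above. `splitComments` and `stripComments` are hypothetical names, and the sketch deliberately ignores string, template, and regex literals, so it is not a real JavaScript lexer. The only point is that a comment's end is found by scanning for a fixed terminator, with no knowledge of the grammar.

```typescript
// Hypothetical mini-lexer: names and token shapes are illustrative only.
// Assumption: the input contains no string, template, or regex literals.
type Token = { kind: "comment" | "code"; text: string };

function splitComments(source: string): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  let codeStart = 0;

  // Emit any pending non-comment text that ends at `end`.
  const flushCode = (end: number): void => {
    if (end > codeStart) {
      tokens.push({ kind: "code", text: source.slice(codeStart, end) });
    }
  };

  while (i < source.length) {
    const opener = source.slice(i, i + 2);
    if (opener === "//" || opener === "/*") {
      flushCode(i);
      // A comment ends at a fixed character sequence, nothing more.
      const terminator = opener === "//" ? "\n" : "*/";
      const found = source.indexOf(terminator, i + 2);
      let end: number;
      if (found === -1) {
        end = source.length; // unterminated: swallow the rest of the input
      } else if (opener === "//") {
        end = found; // keep the newline as code
      } else {
        end = found + 2; // include the closing */
      }
      tokens.push({ kind: "comment", text: source.slice(i, end) });
      i = end;
      codeStart = end;
    } else {
      i++;
    }
  }
  flushCode(source.length);
  return tokens;
}

// What the parser sees: everything except the comment tokens.
function stripComments(source: string): string {
  return splitComments(source)
    .filter((token) => token.kind === "code")
    .map((token) => token.text)
    .join("");
}

const stripped = stripComments("let x /* number */ = 1;");
// stripped === "let x  = 1;"
```

An erased type annotation has no such fixed terminator: where the type in `let y: Map<number, number> = new Map()` ends can only be decided by parsing the type itself, which is why the stopping rule matters so much.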

Of course you don't teach type annotations along with comments and as if they were comments. You're reaching. They aren't just another comment to the developer, stop trying to treat them that way and that problem goes away.

If you have problems with the generality of the syntax, that's fine. But in practice, actually writing typed JS, typescript or otherwise (I also have experience in Flow and AS3), you end up needing something at each of those places, and all of those languages did very similar things.

matthew-dean commented 2 years ago

it made no sense to me unless you were only used to the flex/yacc garbage

No. I'm not going to get into it, because it's a useless thing to flex about, but the short answer is no. I mentioned it, yes, and I'll give you the benefit of the doubt in saying I probably didn't articulate it clearly, but I mentioned it as part of a broader point about language complexity. And when I said the things were not the same, I was talking about it in terms of cognitive complexity (someone learning the language and which things are what), not parsing complexity. It's subjective, of course, as all brains work differently. I just consider the ignoring of a specific set of keywords and their subsequent definitions "as comments" to be more cognitively complex than a "comment marker" like /* or ::.

The reason I distinguished these from comments is that comments are tokens. You see some character sequence and you ignore everything up to some other character sequence, and give that as a token to the parser.

Comments can be tokens if you have a tokenization phase in your parser, which not all parsing strategies use.

The difference between a string and a comment is that the parser ignores comments!

Parsers can ignore comments and often do. Sometimes they're collected for output in the AST, such as in Less or TypeScript. Sometimes they are further parsed, such as in TypeScript's case re: JSDoc, where structures are parsed to provide types, do type imports, etc. (or recognize things like // @ts-check)

If you aren't familiar with parsers, then that's fine, it just means I need to explain more.

Same. 😉

Of course you don't teach type annotations along with comments and as if they were comments. You're reaching. They aren't just another comment to the developer, stop trying to treat them that way and that problem goes away.

I think that's a little hand-wavy to the potentiality of the problem. If this were accepted, you could / should separate the problem into traditional comments and "structures added by JavaScript to be treated as comments for the TypeScript team", but you couldn't just ignore it in a JavaScript course, if this were accepted. However you want to define the problem, or teach the problem, it increases the JavaScript learning curve. It's not zero-cost!

But in practice, actually writing typed JS, typescript or otherwise (I also have experience in Flow and AS3), you end up needing something at each of those places, and all of those languages did very similar things.

This is an argument in favor of TypeScript or Flow or AS3. It's not an argument in favor of adding these comments or ignorables or whatever we want to call them to JavaScript. I've seen nothing about how this proposal is, overall, a value-add for JavaScript itself. Which isn't to say it has no value, but I think that's then where there's this pushback in this thread of, "Okay, that being the case, if we consider this to have value, what is the least negative impact on the JavaScript language?"

matthew-dean commented 2 years ago

@simonbuchan

There's several reasons, parsing ambiguities that are even a problem for C, which also makes it really hard to add new kinds of types (try to imagine a leading union type, for example)

the syntax should not require the types to be present, which is tricky in parameters, for example. Also, is it a var, let or const?

Very good points about why TypeScript / Flow / AS3 is the way it is. I think the other thing to mention is that there is prior art in the form of ES4. https://evertpot.com/ecmascript-4-the-missing-version/. So even JavaScript very, very nearly had native types, and this is how they were defined.

Here are some ES4 examples (native types):

function add(a: int, b:int): int {
  return a + b;
}

var a: (number, string)

class Wrapper<T> {
  inner: T
}

function getCell(coords: like { x: int, y: int }) {

}

simonbuchan commented 2 years ago

Note that ES4 was just Adobe (or was it still Macromedia?) trying to push ActionScript 3 as a standard. Thank goodness they failed, AS3 is awful.

theScottyJam commented 2 years ago

If this were accepted, you could / should separate the problem into traditional comments and "structures added by JavaScript to be treated as comments for the TypeScript team", but you couldn't just ignore it in a JavaScript course, if this were accepted. However you want to define the problem, or teach the problem, it increases the JavaScript learning curve. It's not zero-cost!

At first, I wasn't overly concerned about this actually. If people are able to use TypeScript just fine, despite the fact that it adds a number of different types of syntax to the language that are no-ops at runtime, then people should be able to handle it if it became native syntax to JavaScript as well.

But, thinking about it more, the people who use TypeScript are generally people who already have a fairly solid foundation in JavaScript, and usually have a good idea of what is and isn't JavaScript syntax (though, this isn't always the case). When this proposal gets in, we'll start seeing types in more and more places, and beginner programmers will need to learn what actually provides runtime effects and what does not. For example:

function fn(x: Thing) {
  return x.y as AnotherThing
}

Perhaps a beginner JavaScript developer finds the above example in a StackOverflow answer. They know they currently don't care about type syntax, so they do their best to scrub that stuff out before inserting it into their own code.

function fn(x) {
  return x.y as AnotherThing
}

But, it's also reasonable to believe that the as is doing some sort of runtime coercion trick. There's no harm in keeping that in there, but it sure would be confusing when this developer is trying to debug things, and they're struggling to figure out why this "runtime coercion" isn't doing what they expected it to do.
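As a small illustration of that pitfall (the interfaces and the value 42 are invented for the example), `as` changes only what the type checker believes. The value passes through untouched at run time:

```typescript
// Hypothetical shapes, made up for this illustration.
interface AnotherThing {
  label: string;
}

interface Thing {
  y: unknown;
}

function fn(x: Thing): AnotherThing {
  // Compile-time assertion only: no conversion code is emitted for `as`.
  return x.y as AnotherThing;
}

// The checker now treats `result` as AnotherThing...
const result = fn({ y: 42 });

// ...but at run time it is still the number that was passed in.
const runtimeType = typeof result; // "number"
```

A beginner expecting `as AnotherThing` to coerce the value would find that `result.label` is simply undefined, with no error at the point of the assertion.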

I actually have a friend who learned TypeScript fairly early on. Every once in a while, he needs to come to me and ask if something is TypeScript syntax or JavaScript, because he can't tell the difference, so I guess this is a real concern. Perhaps a more minor one IMO compared to some of the other pros and cons listed, but I think it's still valid.

matthew-dean commented 2 years ago

@theScottyJam

But, it's also reasonable to believe that the as is doing some sort of runtime coercion trick. There's no harm in keeping that in there, but it sure would be confusing when this developer is trying to debug things, and they're struggling to figure out why this "runtime coercion" isn't doing what they expected it to do.

Yes exactly. If you come at this problem already knowing TypeScript and using it, these things seem like small potatoes. It's hard to put ourselves in the shoes of learning JavaScript for the first time, but I guarantee as AnotherThing would make learning that function declaration harder for someone new.