Open samwgoldman opened 2 years ago
TC39 is extremely unlikely to accept any syntax which requires unbounded backtracking.
Early discussion with browser engine engineers seemed optimistic about being able to parse and ignore these comments efficiently. You'll also notice that not all TypeScript/Flow syntax is supported in this proposal.
I too have found it confusing that this proposal is called "comments". In every language I'm aware of, including JavaScript, a comment consists of a start character sequence and an end character sequence, with anything allowed in between.
I'm assuming that is not the case here, and that only valid identifier chars are allowed within the "comment" section. For example, is this allowed?
function(thing: a[2]32]*@@!) {
}
Assuming this is not allowed, I think calling these comments is both inaccurate and confusing. Maybe "unchecked type annotations"?
@matthewp You may want to weigh in on #78.
I maintain Sublime Text's JavaScript, TypeScript, JSX, and TSX highlighting, and I can confirm that highlighting TypeScript requires backtracking (or the moral equivalent). ECMAScript syntax is mostly deterministic context-free, except that arrow functions are nondeterministic context-free. TypeScript's syntax is much more complicated and much more nondeterministic, and it has some gnarly (and undocumented) corner cases.
That said, the thing about highlighting is that you need to know what the token is when you get to it. The ECMAScript grammar avoids explicit nondeterminism via covering productions. This doesn't help a real-time highlighter very much, but it can suffice for other applications. I don't know to what extent this would work for TypeScript. TypeScript currently has no spec; its syntax is implementation-defined. I assume that the implementation is efficient, and that by extension the syntax can be parsed efficiently. But that efficiency might not necessarily survive the grammar changes necessary to make TypeScript a superset of ECMAScript.
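To make the covering-production point concrete, here is a minimal TypeScript sketch of the "parse a cover, refine later" idea for arrow functions. The helper name `refineCover` is hypothetical, and the scan deliberately ignores strings, comments, and line-terminator rules; it only shows that the meaning of `(a, b)` is not known until the token after the closing parenthesis has been seen.

```typescript
type Refined = "arrow-parameters" | "parenthesized-expression";

// Hypothetical helper: accept a balanced "( ... )" prefix without deciding
// what it is, then refine the result based on whether "=>" follows.
function refineCover(source: string): Refined {
  if (source[0] !== "(") throw new SyntaxError('expected "("');
  let depth = 0;
  let i = 0;
  for (; i < source.length; i++) {
    const ch = source[i];
    if (ch === "(") depth++;
    else if (ch === ")") {
      depth--;
      if (depth === 0) break; // i now points at the matching ")"
    }
  }
  if (depth !== 0) throw new SyntaxError("unbalanced parentheses");
  // Only now can the prefix be classified.
  return /^[ \t]*=>/.test(source.slice(i + 1))
    ? "arrow-parameters"
    : "parenthesized-expression";
}

console.log(refineCover("(a, b) => a + b"));
console.log(refineCover("(a, b)"));
```

A batch parser can afford to defer the decision like this. A real-time highlighter cannot, because it has to colour `a` and `b` before it reaches the `=>`.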
The addition of :: in the proposal grammar is encouraging. That would be much easier to parse, and it shows that the proposal authors are willing to deviate from TypeScript where appropriate.
For example, is this allowed?
function(thing: a[2]32]*@@!) { }
Hi @matthewp! With the current proposed grammar (very much subject to change) that would not be syntactically valid, as you assumed it wouldn't be.
function(thing: a[2]32]*@@!) {
// a[2] = arrayType
// 32 = literalType
A literalType can't directly follow an arrayType, so the parser would stop attempting to parse a type at the 3.
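As a rough illustration of that walk-through, here is a toy recursive-descent sketch in TypeScript. The grammar in the comment is an assumption made up for this example, not the proposal's actual grammar, and `parseTypeEnd` and `checkParameter` are hypothetical names. It only demonstrates that the type parse ends after `a[2]`, leaving the `3` where a `,` or `)` would be required.

```typescript
// Toy grammar (an assumption for illustration only):
//   type    := primary suffix*
//   primary := identifier | numericLiteral
//   suffix  := "[" type? "]"
// Returns the index just past the longest type starting at `start`.
function parseTypeEnd(src: string, start: number): number {
  let i = start;
  const rest = src.slice(i);
  const ident = /^[A-Za-z_$][A-Za-z0-9_$]*/.exec(rest);
  const num = /^[0-9]+/.exec(rest);
  if (ident) i += ident[0].length;
  else if (num) i += num[0].length;
  else throw new SyntaxError(`type expected at index ${i}`);
  while (src[i] === "[") {
    i++; // consume "["
    if (src[i] !== "]") i = parseTypeEnd(src, i);
    if (src[i] !== "]") throw new SyntaxError(`"]" expected at index ${i}`);
    i++; // consume "]"
  }
  return i;
}

// Parses the type after the first ":" and checks what follows it.
function checkParameter(src: string): string {
  const colon = src.indexOf(":");
  if (colon < 0) throw new SyntaxError("no type annotation found");
  let i = colon + 1;
  while (src[i] === " ") i++;
  const end = parseTypeEnd(src, i);
  if (src[end] !== ")" && src[end] !== ",") {
    return `unexpected "${src[end]}" at index ${end}`;
  }
  return "ok";
}

console.log(checkParameter("function(thing: a[2]32]*@@!) {"));
console.log(checkParameter("function(thing: a[2]) {"));
```

For the first input the sketch reports the `3` at index 20 as unexpected; the second input is accepted.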
In order to skip over types, engines will need to parse them. Compared to existing comments, the proposed types-as-comments will require more work to find out where they end. Both TypeScript and Flow parsers use backtracking to handle certain parses.
If so, would there be challenges updating the ECMAScript grammar to support these cases? Would there be challenges for engines that might prefer simpler parsing rules?
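To illustrate the difference in effort between finding the end of an existing comment and finding the end of a type, here is a hedged TypeScript sketch. `endOfBlockComment` is a plain search for the terminator. `endOfTypeAnnotation` is a hypothetical bracket-balancing skipper, one possible simplification and not what the proposal or any engine actually specifies. It ignores strings and template literals, and it treats every `<` as an opening bracket.

```typescript
// An existing block comment ends at the first "*/", whatever is inside it.
function endOfBlockComment(src: string, start: number): number {
  const close = src.indexOf("*/", start + 2);
  if (close < 0) throw new SyntaxError("unterminated comment");
  return close + 2;
}

const OPEN = "([{<";
const CLOSE = ")]}>";

// Hypothetical skipper: the type ends at the first "," or ")" that is
// outside all brackets. Nesting must be tracked, so this is already more
// work than a terminator search, and it can fail on unbalanced input.
function endOfTypeAnnotation(src: string, start: number): number {
  const stack: string[] = [];
  for (let i = start; i < src.length; i++) {
    const ch = src[i];
    if (ch === "=" && src[i + 1] === ">") { i++; continue; } // skip "=>"
    if (stack.length === 0 && (ch === "," || ch === ")")) return i;
    const k = OPEN.indexOf(ch);
    if (k >= 0) stack.push(CLOSE[k]);
    else if (CLOSE.indexOf(ch) >= 0 && stack.pop() !== ch) {
      throw new SyntaxError(`unbalanced "${ch}" at index ${i}`);
    }
  }
  throw new SyntaxError("unterminated type annotation");
}

const generic = "function(m: Map<string, [number, number]>, n) {}";
console.log(endOfTypeAnnotation(generic, generic.indexOf(":") + 1));
console.log(endOfBlockComment("/* a[2]32]*@@! */ x", 0));
```

Even this simplified skipper rejects `a[2]32]*@@!` because of the stray `]`, while the same characters are fine inside `/* ... */`. A skipper that follows an actual type grammar, as TypeScript and Flow parsers do, needs more machinery still.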