xunilrj closed this 2 years ago
Here's the link to the spec: https://facebook.github.io/jsx/
Thanks for writing this up.
Making the token source lazy is something that we probably want to do regardless of whether JSX depends on it, because it is a blocker for fixing some of the TS conformance bugs (`A< <TypeArgs>() => string>` is valid, and multiple regular expressions aren't parsed correctly).
There are also a few places where we must change how we parse TypeScript if JSX is enabled (... arrow functions and casts).
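To illustrate why lexing on demand helps here, the following is a minimal sketch, not Rome's actual lexer. An eager lexer has to commit to `<<` as a shift operator before the parser knows whether it is in a type position, whereas a lexer driven by the parser can be told the context. The names `LexContext`, `Token`, and `lex_angle` are hypothetical, and whether this matches the exact conformance bug mentioned above is an assumption.

```rust
/// Hypothetical lexing contexts; the names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LexContext {
    /// Ordinary expression position: `<<` is the left-shift operator.
    Expression,
    /// Type position: every `<` opens a type parameter or argument list.
    Type,
}

#[derive(Debug, PartialEq)]
enum Token {
    LessThan,
    ShiftLeft,
}

/// Lexes the token starting at byte offset `pos` and returns it together
/// with its length in bytes. Returns `None` if the text at `pos` does not
/// start with `<` (or `pos` is out of range).
fn lex_angle(source: &str, pos: usize, context: LexContext) -> Option<(Token, usize)> {
    let rest = source.get(pos..)?;
    if !rest.starts_with('<') {
        return None;
    }
    if context == LexContext::Expression && rest.starts_with("<<") {
        // Only in expression position are two `<` merged into one token.
        Some((Token::ShiftLeft, 2))
    } else {
        Some((Token::LessThan, 1))
    }
}
```

Because the parser passes the context at the moment it asks for the token, the same two characters can be lexed differently depending on where the parser currently is.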
If we are creating a new lexer, check out https://github.com/mozilla-spidermonkey/jsparagus/blob/master/crates/parser/src/lexer.rs for a more rusty approach 😄
I started working on a proposal for the JSX AST nodes. I followed the spec linked by @ematipico.
// For embedding into expressions
JsxElementExpression =
element: JsxAnyElement
JsxFragmentExpression =
fragment: JsxFragment
JsAnyExpression =
...
| JsxElementExpression
| JsxFragmentExpression
// ==================================
// Elements
// ==================================
JsxAnyElement =
JsxElement
| JsxSelfClosingElement
// <a>...</a>
JsxElement =
opening_element: JsxOpeningElement // Rome classic inlines the opening and closing elements. May make it hard to distinguish which token belongs where?
children: JsxChildList
closing_element: JsxClosingElement
JsxOpeningElement =
'<'
name: JsxAnyElementName
type_arguments: TsTypeArguments?
attributes: JsxAttributeList
'>'
JsxClosingElement =
'</'
name: JsxAnyElementName
'>'
// <a />
JsxSelfClosingElement =
'<'
name: JsxAnyElementName
type_arguments: TsTypeArguments?
attributes: JsxAttributeList
'/>'
JsxFragment =
'<>'
children: JsxChildList
'</>'
JsxAnyElementName =
JsxReferenceIdentifier
| JsxMemberExpression // (JsxMemberName?)
| JsxNamespaceName
JsxAnyIdentifier = JsxReferenceIdentifier | JsxMemberExpression
// <a.test>
JsxMemberExpression =
object: JsxAnyIdentifier
'.'
member: JsName
// ==================================
// Attributes
// ==================================
JsxAnyAttribute =
JsxSpreadAttribute
| JsxAttribute
JsxAttribute =
name: JsxAnyAttributeName
initializer: JsxAttributeInitializerClause?
JsxAttributeInitializerClause =
'='
value: JsxAnyAttributeValue
JsxAnyAttributeValue =
JsxElement
| JsxSelfClosingElement
| JsxFragment
| JsxStringLiteral
| JsxExpressionAttributeValue
// <a b={expr} />
JsxExpressionAttributeValue =
'{'
expression: JsAnyExpression
'}'
// <a {...b} />
//    ^^^^^^
JsxSpreadAttribute =
'{'
'...'
argument: JsAnyExpression // parse_assignment_expression_or_higher
'}'
JsxAttributeList = JsxAnyAttribute*
// `a:b` or `a`
JsxAnyAttributeName =
JsxNamespaceName
| JsxName
// ==================================
// Children
// ==================================
JsxAnyChild =
JsxText
| JsxElement
| JsxSelfClosingElement
| JsxFragment
| JsxExpressionChild
| JsxSpreadChild
// <a>{...b}</a>
//    ^^^^^^
JsxSpreadChild =
'{'
'...'
argument: JsAnyExpression // Assignment expression or higher
'}'
// <a>{b}</a>
//    ^^^
// <a>{}</a>
//    ^^
JsxExpressionChild =
'{'
expression: JsAnyExpression?
'}'
JsxText = value: 'jsx_text'
JsxChildList = JsxAnyChild*
// ==================================
// Auxiliary
// ==================================
// has different semantics than JsReferenceIdentifier, allows for `await`
// but maybe not worth distinguishing?
JsxReferenceIdentifier = value: 'ident'
// <a:test>
JsxNamespaceName =
namespace: JsReferenceIdentifier
':'
name: JsName
// JSX strings don't allow for escape sequences
// Historically, string characters within JSXAttributeValue and JSXText are extended to allow the presence of HTML character references to make copy-pasting between HTML and JSX easier,
// at the cost of not supporting \ EscapeSequence of ECMAScript's StringLiteral. We may revisit this decision in the future.
JsxStringLiteral = value: 'js_string_literal'
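The note on `JsxStringLiteral` can be made concrete with a minimal sketch, assuming a hypothetical helper `jsx_string_value` that is not part of Rome. Since JSX strings have no escape sequences, the value is simply the text between the quotes, and a backslash is an ordinary character. HTML character references are left undecoded in this sketch.

```rust
/// Returns the text between the quotes of a JSX attribute string, or `None`
/// if `raw` is not wrapped in matching `"` or `'` quotes.
/// No escape processing happens: a backslash is an ordinary character.
fn jsx_string_value(raw: &str) -> Option<&str> {
    let quote = raw.chars().next()?;
    if quote != '"' && quote != '\'' {
        return None;
    }
    if raw.len() < 2 || !raw.ends_with(quote) {
        return None;
    }
    // Both quotes are single-byte ASCII characters, so slicing is safe.
    let inner = &raw[1..raw.len() - 1];
    // Without escapes, the first occurrence of the quote character ends the
    // string, so the inner text must not contain it.
    if inner.contains(quote) {
        return None;
    }
    Some(inner)
}
```

This is one reason helpers for converting between the JS and JSX string nodes would be needed: `"a\nb"` is a four-character value in JSX, while the same source text in JavaScript denotes three characters.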
Main differences to rome classic
- `JsxAnyElement`
- `JsxElementExpression` and `JsxFragmentExpression` for embedding JSX into expressions rather than adding `JsxElement` and `JsxFragment` to the `JsAnyExpression` union. Motivation: consistent `*Expression` naming for expressions, and adding `Expression` to the element names is weird because elements are also allowed inside of `JsxContent`.
- `JsxExpressionAttributeValue`, `JsxSpreadChild`, and `JsxExpressionChild` vs rome-classic's `JsxExpressionContainer`, `JsxEmptyExpression`, and `JsxSpreadExpression`. Removes the need to query the parent to know in which context the expression is used.
Main questions
- Reuse `JsStringLiteralExpression` or introduce `JsxStringLiteral`. I would favour the latter: a) it's not an expression, b) it has very different constraints. But we'll need helpers to easily transform between the two.
- Reuse `JsReferenceIdentifier` or introduce a new `JsxReferenceIdentifier`. They have some semantic differences, which would favour different nodes. But these are only semantic differences that aren't validated in a mutation API anyway and may, thus, not be worth it. The exception is that JsxIdentifiers allow for dashes. For example, `<a-b-c></a-b-c>` is valid.
I think this comment should have been a GitHub discussion; it would have been easier to leave feedback.
Here's some feedback:
> // <a.test>
> JsxMemberExpression =
> object: JsxAnyIdentifier
> '.'
> member: JsName

Member expressions can be recursive, meaning that we can have something like `<a.b.c></a.b.c>`. This means that `object:` should be able to have `JsxMemberExpression` too.
> But these are only semantic differences that aren't validated in a mutation API anyway and may, thus, not be worth it

Also, names with a capital first letter are exceptions, e.g. `<Aside></Aside>`. Personally, I would prefer a new node because it makes it easier to pinpoint these nodes inside analyzers. But I don't have strong opinions :)
> // <a>{b}</a>
> //    ^^^
> // <a>{}</a>
> //    ^^
> JsxExpressionChild =
> '{'
> expression: JsAnyExpression?
> '}'

Maybe `JsxChildExpression` is better?
> JsxMemberExpression // (JsxMemberName?)

Better `JsxMemberExpression`, which is in line with the other member expressions.
> Reuse JsStringLiteralExpression or introduce JsxStringLiteral

I'd prefer `JsxStringLiteral`; it might have some implications in our formatter.
Could you handle non-React JSX like SolidJS? SolidJS is gaining popularity, & examples' JSX is very close to React's, but inline CSS has normal (dash-case) property names, & some funky attribute prefixes.
> Member expressions can be recursive, meaning that we can have something like `<a.b.c></a.b.c>`. This means that `object:` should be able to have `JsxMemberExpression` too.

Representing nested member expressions should be possible because `object` is a `JsxAnyIdentifier`, of which `JsxMemberExpression` is a member.
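The recursion can be sketched with plain Rust types that mirror the two grammar rules from the proposal. This is only an illustration: the field types (`String` instead of real syntax nodes) and the helpers `parse_dotted_name` and `depth` are assumptions made for the example.

```rust
/// Mirrors `JsxAnyIdentifier = JsxReferenceIdentifier | JsxMemberExpression`.
#[derive(Debug, Clone, PartialEq)]
enum JsxAnyIdentifier {
    Reference(String),
    Member(Box<JsxMemberExpression>),
}

/// Mirrors `JsxMemberExpression = object: JsxAnyIdentifier '.' member: JsName`.
#[derive(Debug, Clone, PartialEq)]
struct JsxMemberExpression {
    object: JsxAnyIdentifier,
    member: String,
}

/// Builds the left-nested tree for a dotted name such as `a.b.c`.
/// Returns `None` for an empty name or an empty segment.
fn parse_dotted_name(name: &str) -> Option<JsxAnyIdentifier> {
    let mut segments = name.split('.');
    let first = segments.next()?;
    if first.is_empty() {
        return None;
    }
    let mut current = JsxAnyIdentifier::Reference(first.to_string());
    for segment in segments {
        if segment.is_empty() {
            return None;
        }
        // The tree built so far becomes the `object` of the next member.
        current = JsxAnyIdentifier::Member(Box::new(JsxMemberExpression {
            object: current,
            member: segment.to_string(),
        }));
    }
    Some(current)
}

/// Counts how many member accesses are nested in `identifier`.
fn depth(identifier: &JsxAnyIdentifier) -> usize {
    match identifier {
        JsxAnyIdentifier::Reference(_) => 0,
        JsxAnyIdentifier::Member(member) => 1 + depth(&member.object),
    }
}
```

For `a.b.c`, the outer member expression has `member` set to `c`, and its `object` is itself a member expression for `a.b`.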
> > But these are only semantic differences that aren't validated in a mutation API anyway and may, thus, not be worth it
>
> Also, names with a capital first letter are exceptions, e.g. `<Aside></Aside>`. Personally, I would prefer a new node because it makes it easier to pinpoint these nodes inside analyzers. But I don't have strong opinions :)

Identifiers with capital letters are also valid in JS? There's nothing preventing you from writing `let Aside = 10;`.
But I agree with the sentiment. They seem to be different enough to justify a new node.
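One of the differences raised in the proposal is that JSX identifiers allow dashes. A minimal sketch of scanning such an identifier, assuming ASCII-only identifier characters for brevity (the real rules follow ECMAScript's Unicode identifier classes), could look like this. The function names are hypothetical.

```rust
/// Returns the length in bytes of the JSX identifier at the start of `input`,
/// or 0 if `input` does not start with one.
/// Simplification: only ASCII letters, digits, `_`, `$`, and `-` are handled.
fn jsx_identifier_len(input: &str) -> usize {
    let mut len = 0;
    for (index, ch) in input.char_indices() {
        let allowed = if index == 0 {
            // A dash or digit may not start an identifier.
            is_identifier_start(ch)
        } else {
            is_identifier_start(ch) || ch.is_ascii_digit() || ch == '-'
        };
        if !allowed {
            break;
        }
        len = index + ch.len_utf8();
    }
    len
}

fn is_identifier_start(ch: char) -> bool {
    ch.is_ascii_alphabetic() || ch == '_' || ch == '$'
}
```

A regular JS identifier scanner would stop at the first `-`, which is part of why a separate token or node kind for JSX names can be convenient.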
> Maybe `JsxChildExpression` is better?

The idea is that all variants share the same postfix (at least those that are specific to that union). That's why it is `JsxExpressionChild`: to make it clear that it's a child and not a variant of `JsAnyExpression`.
> > JsxMemberExpression // (JsxMemberName?)
>
> Better `JsxMemberExpression`, which is in line with the other member expressions.

Agree, my main concern is that it can give the impression that `JsxMemberExpression` is a variant of `JsAnyExpression`. The reason the other member names end with `Expression` is that they are expressions. This isn't the case here. We could also go with `JsxMemberIdentifier`.
> Could you handle non-React JSX like SolidJS? SolidJS is gaining popularity, & examples' JSX is very close to React's, but inline CSS has normal (dash-case) property names, & some funky attribute prefixes.

@tomByrer are you mainly referring to the `namespace:attribute` name syntax that SolidJS uses? My understanding is that this is standard JSX and covered by the JsxAttributeName grammar, which allows for a namespace name (`namespace:name`).
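As a rough sketch of how the two attribute name variants from the proposal differ, the following classifies an attribute name as a plain name or a `namespace:name` pair. The enum mirrors `JsxAnyAttributeName`, but the borrowed string fields and the function `classify_attribute_name` are assumptions made for this example, and the attribute `on:click` is used only as a sample input.

```rust
/// Mirrors `JsxAnyAttributeName = JsxNamespaceName | JsxName`.
#[derive(Debug, PartialEq)]
enum JsxAnyAttributeName<'a> {
    Name(&'a str),
    NamespaceName { namespace: &'a str, name: &'a str },
}

/// Splits an attribute name at its single `:` if there is one.
/// Returns `None` for empty parts or more than one `:`.
fn classify_attribute_name(text: &str) -> Option<JsxAnyAttributeName<'_>> {
    match text.split_once(':') {
        None if !text.is_empty() => Some(JsxAnyAttributeName::Name(text)),
        Some((namespace, name))
            if !namespace.is_empty() && !name.is_empty() && !name.contains(':') =>
        {
            Some(JsxAnyAttributeName::NamespaceName { namespace, name })
        }
        _ => None,
    }
}
```

Since the namespace form is part of the JSX grammar itself, no framework-specific handling is involved at the syntax level.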
See specification at: https://facebook.github.io/jsx/
Lazy Lexer
`TokenSource` to be lazy. #2219
`Parser` will have new methods to drive the lexer.
see: https://github.com/rome/tools/issues/2035
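The idea of the parser driving the lexer can be sketched as follows. This is a toy under stated assumptions, not the design from the linked issue: `LazyTokenSource`, `LexContext`, and `Token` are hypothetical names, and only a handful of tokens are supported. The point is that the same input is lexed differently depending on the context the parser passes in.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum LexContext {
    /// Regular JS tokens; whitespace between tokens is skipped.
    Regular,
    /// Inside JSX children: everything up to `<` or `{` is text.
    JsxChild,
}

#[derive(Debug, PartialEq)]
enum Token<'a> {
    LessThan,
    GreaterThan,
    Slash,
    LeftCurly,
    RightCurly,
    Ident(&'a str),
    JsxText(&'a str),
    Unknown(char),
    Eof,
}

struct LazyTokenSource<'a> {
    source: &'a str,
    position: usize,
}

impl<'a> LazyTokenSource<'a> {
    fn new(source: &'a str) -> Self {
        Self { source, position: 0 }
    }

    /// Lexes exactly one token, on demand, in the context chosen by the caller.
    fn next_token(&mut self, context: LexContext) -> Token<'a> {
        let source: &'a str = self.source;
        if context == LexContext::Regular {
            let rest = &source[self.position..];
            self.position += rest.len() - rest.trim_start().len();
        }
        let rest = &source[self.position..];
        let first = match rest.chars().next() {
            Some(ch) => ch,
            None => return Token::Eof,
        };
        if context == LexContext::JsxChild && first != '<' && first != '{' {
            // Whitespace is significant here, so it is kept in the text.
            let end = rest.find(|c: char| c == '<' || c == '{').unwrap_or(rest.len());
            self.position += end;
            return Token::JsxText(&rest[..end]);
        }
        if first.is_ascii_alphabetic() {
            let end = rest.find(|c: char| !c.is_ascii_alphanumeric()).unwrap_or(rest.len());
            self.position += end;
            return Token::Ident(&rest[..end]);
        }
        self.position += first.len_utf8();
        match first {
            '<' => Token::LessThan,
            '>' => Token::GreaterThan,
            '/' => Token::Slash,
            '{' => Token::LeftCurly,
            '}' => Token::RightCurly,
            other => Token::Unknown(other),
        }
    }
}
```

A parser for `<a> hi </a>` would request the tag tokens in `Regular` context and switch to `JsxChild` after the opening element's `>`, which yields ` hi ` as one text token instead of an identifier.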
JSX Parser
Testing
[x] Test Suites
[x] From Microsoft/Typescript test suite:
[x] From Babel test suite
https://github.com/babel/babel/tree/main/packages/babel-parser/test/fixtures/jsx
see https://github.com/rome/tools/pull/2162
[x] Integrate test into CI