Closed ghost closed 7 years ago
ESTree as a specification only determines the output structure, and only invokes types in the conceptual sense, not the literal sense.
@mikesherov I agree with that, but the deviation from the spec is greater than 53%, while Acorn now follows the spec roughly 80%. End users now choose Acorn over Esprima because of its modularity and ESTree compatibility, even though Esprima has much better performance. In my world, this is a huge gap from the ESTree draft/spec.
The abstract syntax tree is not compatible with Acorn's either. E.g. the location of a node is different.
And Espree, which started as a fork of Esprima, now prefers and uses Acorn as its basis because of these things.
React also used an Esprima fork until December 2015, but for the same reasons they are now using an Acorn/Babel combo.
These are reasons enough to try to be ESTree compatible.
I myself choose Esprima because of the high quality of the code and its professional developers with years of experience. At least @ariya seems to have it. And, of course, Esprima's performance.
@nowindowsowrking, thanks for your contributions so far.
The ESTree spec was born out of a collaboration between Mozilla, Esprima, and Acorn, and an evolution and standardization of the SpiderMonkey AST. Esprima and Acorn are both 100% compatible with ESTree, because types have never been part of the spec, but rather an implementation detail. Up until the switch to TypeScript, Esprima itself didn't have any specific constructors for the different nodes anyway.
In regards to location information, this is also not part of the ESTree spec.
Lastly, there are many reasons projects choose either Acorn or Esprima or Babylon. They all have their own tradeoffs, and Esprima has chosen to focus on correctness and speed as primary concerns over modularity or implementing < stage 4 features.
As a member of the ESLint team myself, there are a few reasons Espree ultimately switched to Acorn, but none of them were because of adherence to types. ESLint itself only relies on the structure of the AST, and does no type checking.
Babel built Babylon on top of Acorn primarily because of its modularity. Again, this had nothing to do with type checking or adherence to ESTree, especially considering that Babylon is not ESTree compatible.
So it's ultimately up to @ariya whether he considers the type mismatches a violation of the spec and worth addressing, but being a founding member of the ESTree spec myself, and intimately involved in the situations you referenced, I can safely say that types are not a part of those decisions.
Thanks for following up and pushing us on this!
@mikesherov I'm impressed by your knowledge :) However, after reading all versions of this code since 1.0, I noticed that some of the Esprima code is very old and has not been changed since the beginning. Meanwhile, the ESTree specification has changed. But I agree it's up to @ariya to make a decision regarding this matter.
Thanks @nowindowsowrking and @mikesherov!
Before I post my detailed comment, here is a slightly relevant side note.
To understand the landscape of JavaScript parsers, it is illustrative to digest other related materials:
Also, for a meta topic like this, I can recommend an alternative (and sometimes better) forum: our mailing list at https://groups.google.com/forum/#!forum/esprima.
The ESTree specification does not include a list of compatibility requirements. Therefore, when there is a claim of "X is compatible with ESTree", it means (unfortunately) we have to analyze it from X's perspective.
In this particular context of Esprima, the criterion used to specify ESTree compatibility is as follows:
If there is a tool Y that is constructed only by reading the ESTree specification, then such a tool must function when it is consuming the syntax tree produced by Esprima parser.
The above compatibility principle allows Esprima to offer an enriched syntax tree ("superset") that benefits those who want to take advantage of it, and at the same time it keeps the output useful for tools that understand only the ESTree format. This also means that such a format extension needs to be additive, i.e. removing any extension should make the format fall back to ESTree without any further modification.
A classic example of this additive extension is the raw property for a literal. The ESTree specification says:
interface Literal <: Expression {
    type: "Literal";
    value: string | boolean | null | number | RegExp;
}
However, if we examine the Esprima output, there is another property there. It was added a long time ago (early 2012, see commit 0fe81b279) due to a feature request. Apparently, being able to obtain the raw literal is useful for code rewriters and formatters.
> esprima.parse('0x2a').body[0].expression
Literal { type: 'Literal', value: 42, raw: '0x2a' }
Meanwhile, if there exists a tool that can handle every literal only according to ESTree specification, then that tool will be completely oblivious to the existence of the raw property. Thus, this raw property is an additive enrichment and not to be considered as a specification violation.
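To make the additive argument concrete, here is a minimal sketch of a hypothetical consumer written only against the ESTree Literal interface. The names literalValue and withoutRaw, as well as the hand-written node objects, are assumptions for illustration; they mimic the shape of the Esprima output shown above rather than calling Esprima itself.

```javascript
// Hypothetical consumer that knows only the ESTree Literal interface:
// it reads `type` and `value` and never looks at any other property.
function literalValue(node) {
  if (node.type !== 'Literal') {
    throw new TypeError('Expected a Literal node, got ' + node.type);
  }
  return node.value;
}

// Hand-written nodes mimicking the shapes discussed above (not parser output).
const plainLiteral = { type: 'Literal', value: 42 };
const enrichedLiteral = { type: 'Literal', value: 42, raw: '0x2a' };

// Removing the extension yields the plain ESTree shape again.
function withoutRaw(node) {
  const { raw, ...rest } = node;
  return rest;
}

console.log(literalValue(plainLiteral));    // 42
console.log(literalValue(enrichedLiteral)); // 42
console.log(withoutRaw(enrichedLiteral));   // { type: 'Literal', value: 42 }
```

The consumer gives the same answer for both nodes, and dropping raw needs no other change to the node, which is what "additive" means in this thread.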
In ESTree, every interface is described following the concept of a structural type system, i.e. a node is considered to be of a certain type if it matches the structure of the interface, not because of its name. As an example, look at the following interfaces:
interface Node {
    type: string;
    loc: SourceLocation | null;
}
interface Program <: Node {
    type: "Program";
    body: [ Statement ];
}
With that in mind, the following JavaScript code produces a perfectly valid Program node:
function createEmptyProgram() {
    return {
        type: 'Program',
        body: [],
        loc: null
    };
}
Note how the function createEmptyProgram() constructs a plain JavaScript object, without any inheritance whatsoever (from the Node object). This does not mean that the object created by createEmptyProgram() is not ESTree compatible.
In fact, at the beginning of the ESTree specification, it is stated that "ESTree AST nodes are represented as Node objects, which may have any prototype inheritance…" (note the use of may, not must).
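The structural view can be sketched as a shape check that ignores constructors entirely. This is only an illustration: isProgram and the ProgramNode class are hypothetical names, and the check is simplified to the type and body fields of the Program interface quoted above.

```javascript
// Structural check: a node counts as a Program if it has the right shape,
// regardless of which constructor (if any) created it.
// Simplification: only `type` and `body` are inspected.
function isProgram(node) {
  return node !== null &&
    typeof node === 'object' &&
    node.type === 'Program' &&
    Array.isArray(node.body);
}

// Hypothetical class, standing in for a parser that builds nodes
// with dedicated constructors.
class ProgramNode {
  constructor(body) {
    this.type = 'Program';
    this.body = body;
    this.loc = null;
  }
}

const fromPlainObject = { type: 'Program', body: [], loc: null };
const fromClass = new ProgramNode([]);

console.log(isProgram(fromPlainObject));     // true
console.log(isProgram(fromClass));           // true
console.log(isProgram({ type: 'Program' })); // false: body is missing
```

A tool written this way accepts both the plain object and the class instance, which is why prototype inheritance does not enter into compatibility.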
Let's address each compatibility concern.
For a loop statement using the for keyword, ESTree specifies the following interface:
interface ForStatement <: Statement {
    type: "ForStatement";
    init: VariableDeclaration | Expression | null;
    test: Expression | null;
    update: Expression | null;
    body: Statement;
}
Esprima output matches that:
> esprima.parse('for (i = 0; i < 3; ++i);').body[0]
ForStatement {
  type: 'ForStatement',
  init:
   AssignmentExpression {
     type: 'AssignmentExpression',
     operator: '=',
     left: Identifier { type: 'Identifier', name: 'i' },
     right: Literal { type: 'Literal', value: 0, raw: '0' } },
  test:
   BinaryExpression {
     type: 'BinaryExpression',
     operator: '<',
     left: Identifier { type: 'Identifier', name: 'i' },
     right: Literal { type: 'Literal', value: 3, raw: '3' } },
  update:
   UpdateExpression {
     type: 'UpdateExpression',
     operator: '++',
     argument: Identifier { type: 'Identifier', name: 'i' },
     prefix: true },
  body: EmptyStatement { type: 'EmptyStatement' } }
For a function declaration, let's look at the following ESTree interfaces for ES5:
interface Function <: Node {
    id: Identifier | null;
    params: [ Pattern ];
    body: BlockStatement;
}
interface Declaration <: Statement { }
interface FunctionDeclaration <: Function, Declaration {
    type: "FunctionDeclaration";
    id: Identifier;
}
and for ES2015:
extend interface Function {
    generator: boolean;
}
Meanwhile, the output of Esprima:
> esprima.parse('function f(){}').body[0]
FunctionDeclaration {
  type: 'FunctionDeclaration',
  id: Identifier { type: 'Identifier', name: 'f' },
  params: [],
  body: BlockStatement { type: 'BlockStatement', body: [] },
  generator: false,
  expression: false }
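One way to read this output against the interfaces is to list which specified fields are absent and which present fields are extra. The sketch below does that for a hand-written copy of the node above. The helper compareFields and the field list are assumptions for illustration: the list is taken only from the Function and FunctionDeclaration interfaces quoted in this comment (plus the ES2015 generator extension), and inherited Node fields other than type are left out.

```javascript
// Field list taken from the interfaces quoted above. Assumption: inherited
// Node fields other than `type` are deliberately not included in this sketch.
const FUNCTION_DECLARATION_FIELDS = ['type', 'id', 'params', 'body', 'generator'];

// Report which specified fields are absent and which present fields are extra.
function compareFields(node, specFields) {
  const present = Object.keys(node);
  return {
    missing: specFields.filter((name) => !present.includes(name)),
    extra: present.filter((name) => !specFields.includes(name))
  };
}

// Hand-written copy of the output shown above (not produced by a parser here).
const functionDeclarationNode = {
  type: 'FunctionDeclaration',
  id: { type: 'Identifier', name: 'f' },
  params: [],
  body: { type: 'BlockStatement', body: [] },
  generator: false,
  expression: false
};

console.log(compareFields(functionDeclarationNode, FUNCTION_DECLARATION_FIELDS));
// { missing: [], extra: [ 'expression' ] }
```

Nothing the quoted interfaces require is missing, and the only extra field is expression, so under the additive criterion described earlier the node remains usable by an ESTree-only tool.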
For a regular expression literal, ESTree specifies:
interface Literal <: Expression {
    type: "Literal";
    value: string | boolean | null | number | RegExp;
}
interface RegExpLiteral <: Literal {
    regex: {
        pattern: string;
        flags: string;
    };
}
And if we let Esprima process a regular expression:
> esprima.parse('/abc/i').body[0].expression
RegexLiteral {
  type: 'Literal',
  value: /abc/i,
  raw: '/abc/i',
  regex: { pattern: 'abc', flags: 'i' } }
There is hardly any difference from what ESTree mandates. This is not a surprise: treating a regular expression as a special form of literal in fact originated in Esprima itself (see commit 2641aff502, Jun 2014), was adopted by other parsers, and was finally proposed and made it into ESTree (Feb 2015).
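As an illustration of how a consumer might use the regex property, the following sketch rebuilds a RegExp from pattern and flags instead of relying on value. The helper name toRegExp is hypothetical, and the node is a hand-written copy of the output above, not something produced by calling Esprima here.

```javascript
// Hypothetical helper: rebuild a RegExp from the `regex` property of a
// regular expression literal node, without touching `value`.
function toRegExp(node) {
  if (node.type !== 'Literal' || !node.regex) {
    throw new TypeError('Expected a regular expression literal node');
  }
  return new RegExp(node.regex.pattern, node.regex.flags);
}

// Hand-written copy of the node shown above.
const regexNode = {
  type: 'Literal',
  value: /abc/i,
  raw: '/abc/i',
  regex: { pattern: 'abc', flags: 'i' }
};

const rebuilt = toRegExp(regexNode);
console.log(rebuilt.source, rebuilt.flags); // abc i
console.log(rebuilt.test('ABC'));           // true
```

Because the helper relies only on type and regex, it works the same whether or not the node also carries raw or was created by a named constructor.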
With this explanation, hopefully it is demonstrated that Esprima is indeed compatible with ESTree.
@ariya I was reading this in the README: "Sensible syntax tree format as standardized by ESTree project"
So I compared it with ESTree and found that Esprima isn't ESTree compatible. First of all, everything should inherit from a Node class. That doesn't happen in Esprima. Here are a few of my findings:
https://github.com/estree/estree/blob/master/es5.md#identifier
In Esprima there is an extra raw field.
And this ForStatement is missing the update field. https://github.com/estree/estree/blob/master/es5.md#forstatement
And this node totally breaks with ESTree: FunctionDeclaration
https://github.com/estree/estree/blob/master/es5.md#functiondeclaration
The FunctionDeclaration has a lot of "fields" that aren't in the spec at all. The same is the case for FunctionExpression.
And the following "nodes" are missing: RegExpLiteral
There are a lot more violations of the spec. I just scratched the surface.