ravjot28 / esprima

Automatically exported from code.google.com/p/esprima
BSD 2-Clause "Simplified" License
0 stars 0 forks source link

Represent holes in ArrayExpression as EmptyExpression or ArrayHoleExpression #525

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The problem in holes in multi-line arrays.

var x = [
  , 4,, 5,
  ,, 7, 1,
  ,
];

It can't represent any info about them: location, line information are lost.

It should create some dummy Node without any values but with location info.

PR in GH: https://github.com/ariya/esprima/pull/241

Original issue reported on code.google.com by zxq...@gmail.com on 4 Apr 2014 at 11:08

GoogleCodeExporter commented 9 years ago
Take code or gimme review and I'll fix code. Np.

Original comment by zxq...@gmail.com on 4 Apr 2014 at 11:09

GoogleCodeExporter commented 9 years ago
I'm not sure about range and loc.start .end properties, please look at it in 
test diffs.

Original comment by zxq...@gmail.com on 4 Apr 2014 at 11:14

GoogleCodeExporter commented 9 years ago
What about other JavaScript tools? Are those going to break?

We also need to stick with Mozilla SpiderMonkey Parser API with compatibility. 
Thus, designing it with compatibility in mind will be helpful.

Original comment by ariya.hi...@gmail.com on 4 Apr 2014 at 11:55

GoogleCodeExporter commented 9 years ago
Related is Concrete Syntax Tree (issue 508).

Original comment by ariya.hi...@gmail.com on 4 Apr 2014 at 11:56

GoogleCodeExporter commented 9 years ago
Yeah, right. EmptyExpression is not good for it. Better to call it Literal with 
value: null, and raw: ''.
Right?

Original comment by zxq...@gmail.com on 5 Apr 2014 at 12:13

GoogleCodeExporter commented 9 years ago
Even better Literal with value undefined.

In SpiderMonkey they want null for some reasons.

interface ArrayExpression <: Expression {
    type: "ArrayExpression";
    elements: [ Expression | null ];
}

But as you see I'm strongly disagree this behaviour. Because there is undefined 
element, but not null or something.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 12:18

GoogleCodeExporter commented 9 years ago
This is a non-starter because of incompatibility with SpiderMonkey Parser API. 
If you're looking to preserve all syntactic information, the right route to 
take is parsing to a different CST structure.

Original comment by mich...@ficarra.me on 5 Apr 2014 at 12:21

GoogleCodeExporter commented 9 years ago
Some words about other JavaScript tools.
Holes in arrays is very unusual pattern and I think the most of tools just 
ignores that null. Probably with fatal errors like JSCS atm.
It's only the problem if they using null for some special behaviour. But that 
problem can be easily fixed by replacing `node === null` with `node.value === 
null`.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 12:21

GoogleCodeExporter commented 9 years ago
In the most cases it will only fix other JS tools.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 12:21

GoogleCodeExporter commented 9 years ago
mich, I didn't see in Parser API information when to use null in 
ArrayExpression. And why it should be null.
Anyway, the hole in array is an undefined Literal but not null.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 12:30

GoogleCodeExporter commented 9 years ago
Probably it's a bug in escodegen, btw:
var x = [,];
var x = [2,];
Renders to:
var x = [,];
var x = [2];

So it's not so critical to fix (not just change) behaviour in this place.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 12:36

GoogleCodeExporter commented 9 years ago
As I see even firefox uses null atm. It's not right but it's not a bug in 
esprima.
Need to get a decision: change Parser API or don't.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 1:09

GoogleCodeExporter commented 9 years ago
I've reworked PR, changed non-standard EmptyExpression with Literal (undefined, 
''). So now it's right.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 2:00

GoogleCodeExporter commented 9 years ago
Parser API:
Note: Because this library uses null for optional nodes, it is recommended that 
user-defined datatypes not use null as a representation of an AST node.

Standard telling us to use Expression or null. SpiderMonkey uses null because 
it doesn't care about it. But esprima shouldn't be a monkey and copycat 
SpiderMonkey behavior. Literal implements Expression. So it's not a problem to 
use Literal node instead of null.
So I think it should be merged.

What you think about it?

Original comment by zxq...@gmail.com on 5 Apr 2014 at 2:05

GoogleCodeExporter commented 9 years ago
That is not a bug in escodegen. A final comma is an elision, not a hole. And 
Literal does not allow an `undefined` value as its `value`, so a correct tool 
right now might check that `value == null` and assume that it represents a 
`null` value, making your suggestion backwards incompatible. Having a 
distinction between null/undefined and undefined-own-property/no-property is 
typically bad design in the JavaScript world. Also, please stop splitting your 
messages into many posts.

Original comment by mich...@ficarra.me on 5 Apr 2014 at 3:14

GoogleCodeExporter commented 9 years ago
> so a correct tool right now might check that `value == null` and assume that 
it represents a `null` valu
Nope. Hole in array is undefined, not a null.

> Having a distinction between null/undefined and 
undefined-own-property/no-property is typically bad design in the JavaScript 
world
Using null instead of AST/CST Node is truly bad design.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 3:17

GoogleCodeExporter commented 9 years ago
So it won't be lost: this is related to jscs 
https://github.com/mdevils/node-jscs/issues/315.

Original comment by ariya.hi...@gmail.com on 5 Apr 2014 at 3:20

GoogleCodeExporter commented 9 years ago
According to 
https://developer.mozilla.org/en-US/docs/SpiderMonkey/Parser_API#Expressions:

  interface ArrayExpression <: Expression {
    type: "ArrayExpression";
    elements: [ Expression | null ];
  }

Original comment by ariya.hi...@gmail.com on 5 Apr 2014 at 3:20

GoogleCodeExporter commented 9 years ago
According to 
https://developer.mozilla.org/en-US/docs/SpiderMonkey/Parser_API#Builder_objects

  Note: Because this library uses null for optional nodes, it is recommended that user-defined datatypes not use null as a representation of an AST node.

So it's valid to use Expression there instead of useless null. They need it for 
a little speed boost. But when we parsing CST/AST tree we should use Expression.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 3:24

GoogleCodeExporter commented 9 years ago
The above note ("Because this library uses null...") applies only for Builder 
object, i.e. allowing the consumer to define its own syntax format. It is 
specific to SpiderMonkey (not supported in Esprima) right now. It basically 
warns that any custom node type shouldn't use null.

In other words, it has nothing to do with the de facto specification of 
ArrayExpression.

Original comment by ariya.hi...@gmail.com on 5 Apr 2014 at 3:28

GoogleCodeExporter commented 9 years ago
No no. It means that SpiderMonkey uses null for optionals. But it's not a rule 
for others. They can use Expression as well for any nodes.

Probably I didn't see something? And there is a rule for holes in arrays?

Original comment by zxq...@gmail.com on 5 Apr 2014 at 3:39

GoogleCodeExporter commented 9 years ago
"But it's not a rule for others", that exactly what I mean. IF you are creating 
your own AST format, you are free to use whatever you like (excluding null). 
But that's not what Esprima is about (it follows this de facto standard).

An example of another optional node which may be null (see the filter property):

  interface ComprehensionExpression <: Expression {
    body: Expression;
    blocks: [ ComprehensionBlock ];
    filter: Expression | null;
  }

For your last question, I already showed the format for array (here it is 
again):

  interface ArrayExpression <: Expression {
    type: "ArrayExpression";
    elements: [ Expression | null ];
  }

Original comment by ariya.hi...@gmail.com on 5 Apr 2014 at 3:45

GoogleCodeExporter commented 9 years ago
Ariya, as you already said standard is to use Expression or null in elements.

At my POV null can be used only if it not exist in source tree. But if we look 
closer to arrays with holes it's another thing:

  var x = [ 5, /*1*/
    , 3, /*2*/
    ,
    1, 2, /*3*/

    , /*4*/
  ];

(1), (2) and (3) is holes, (4) is not.
(1) have 2 lines presence, as well as (2). But (3) have 3 lines presence.
But if you place there `null` this information will be lost and syntax tree 
will be broken.

I would say holes in arrays aren't optional items. Using hole is a shortcut for 
undefined so there are undefined elements, not holes or nulls.

Once again. Where is the rule for holes in arrays? I see that null can be used 
in elements property. Atm I just see that SpiderMonkey uses null and that's the 
reason of `[ Expression | null ]` format in specification. It didn't say us to 
use nulls for holes. And it says that SpiderMonkey uses nulls for `optionals`. 
Holes in arrays aren't optionals for syntax trees because these are undefined 
elements.

If you hardly denied this just close issue. I see a bug in code and 
specification. You don't. We both understand specification differently. You 
prefer to copy SpiderMonkey behaviour, I didn't, because SpiderMonkey is used 
for other problems. So they didn't need accuracy.

Original comment by zxq...@gmail.com on 5 Apr 2014 at 4:06

GoogleCodeExporter commented 9 years ago
A hole is not a "shortcut for undefined". `[,,]` is not the same as `[void 0]` 
because the former won't have an own-property named `0`, but the latter will. 
Please stop making these assertions before learning more about sparse arrays.

Original comment by mich...@ficarra.me on 6 Apr 2014 at 5:00

GoogleCodeExporter commented 9 years ago
Thanks mich. Very productive talk.
How to solve this problem?

Original comment by zxq...@gmail.com on 6 Apr 2014 at 6:29

GoogleCodeExporter commented 9 years ago
"this information will be lost and syntax tree will be broken." -> Can you 
elaborate this?

Also, if you want to propose a change to SpiderMonkey Parser API, you should 
file a bug at https://bugzilla.mozilla.org/.

Original comment by ariya.hi...@gmail.com on 6 Apr 2014 at 3:04

GoogleCodeExporter commented 9 years ago
Ariya, I think it's not a bug because Expression can be empty.
To use null in elements instead of Expression is a decision of developers. So 
community must decide. Not me.

About your question. I just think that Context Tree should have information 
about each node. Null is looks like broken thing because nobody expects null in 
elements (e.g. `jscs` authors didn't expect that, and some other too, see list 
below).

Some broken code:
https://github.com/abc-team/node-kpc/blob/920923344d92a726cd44087173fec6c5021292
b4/lib/tools.js#L65
https://github.com/Constellation/esmangle/blob/cd23e8428936488d6154ccb4b04522385
d246d60/lib/evaluator.js#L175
https://github.com/jaredhanson/component-amd/blob/7f4fb151f8cad7a68972a6d619906d
697888762f/lib/amd2cjs.js#L54 and #L67
https://github.com/anodynos/uRequire/blob/3977850247ad27d21d849e164d41dd3b7895b7
69/source/code/fileResources/Module.coffee#L123
https://github.com/ablesky/ablejs/blob/0abaef0caef8401dd70185146b57e057a16bbb3f/
lib/utils/jsbin.js#L147
https://github.com/mariocasciaro/angular-extender/blob/ea6167ca58e5e2b9c79a00d8b
b7a9b3cf516e909/index.js#L37
https://github.com/goatslacker/akira/blob/5ceb7941d7dc8a4e8369f91bc1ccebeef68f1b
5d/node_modules/TypeSystem.js#L21 (almost broken ;-)
https://github.com/mixu/amdetective/blob/d882f6cd299e1903f511bbc37c9026f49800a1a
1/lib/parse.js#L91
etc.

And there're some other middleware modules which just pass null as node. 
Unexpected null will throw a fatal in many projects.
Like this: 
https://github.com/Kami/node-unused/blob/e7276771e32953c62a3d32bc45e15eba5fbf1f0
d/index.js#L141

I think there is a standard where we can use null. But once again: 
Array.<Expression|null>. It's our choice what to use. And as you can see nobody 
expects there null.

Original comment by zxq...@gmail.com on 6 Apr 2014 at 7:21

GoogleCodeExporter commented 9 years ago
Hi zxqfox,

[,] and [undefined] have very different behavior. Just like ({ x: undefined }) 
and ({}) are different. It's also totally different from [null], which will be 
parsed as an array with a literal object whose type is null.

[,] is an unassigned array object whose length is 1. If you call forEach on it, 
nothing will happen. [undefined] will be an array with only one element which 
is undefined. If you call forEach on it, the callback will happen exactly once.

In the AST, null are used to represent missing elements, aka "hole"s, it is 
never been used as a null literal. For example, if there is an if statement 
that is missing the alternative branch, the branch will be filled with null in 
the AST. It does not mean the alternative branch is a null literal.

> Nope. Hole in array is undefined, not a null.
Nope. A hole in an array is unassigned, not an undefined. When you call 
[[GetProperty]] from this array, it returns undefined, just like ({}).x. Also, 
a null in AST is not a null literal, it represents a hole.

If there are broken codes that does not expect an null in the AST when it's 
available, that will be a bug in the downstream, not the parser itself. Just 
like if the library cannot accept a null in a IfStatement's alternative branch. 
It shall be fixed. 

Thanks,
Bei

Original comment by ikarienator on 7 Apr 2014 at 2:00

GoogleCodeExporter commented 9 years ago
Thanks for an explanation, Bei. Anyway, missing elements shouldn't be nulls. It 
should be Expressions in CST, or even AST.

> If there are broken codes that does not expect an null in the AST when it's 
available, that will be a bug in the downstream
Nobody expects null. All expects Expression nodes. The bug in concept, not in 
the downstream. Almost all libraries who uses ArrayExpression have this bug 
bro. Nothing personal. Almost NOBODY expects null except Esprima project 
members and contributors.

> Just like if the library cannot accept a null in a IfStatement's alternative 
branch.
What is a case with null in IfStatement?

Conclusion:
Ariya taking care of libraries, libraries are broken atm and wouldn't fail if 
nulls will be replaced with Expressions.
The standard said use Expression OR null. All here want to use null because 
they just want and protecting this behavior. But it's actually a "Workaround" 
and should be moved out. Nobody cares? Okay.
Ariya said that I can place a bug report in Parser API. But there's no bug in 
Parser API. There's a workaround in project.
Mich is (seems like) the only one who knows about this behavior outside of 
Esprima project. And instead of productive speech, instead of trying to think 
about the problem, he just teaching me JS. Ok bro. Thanks. I can open my 
console in chrome/firefox/nodejs and place there "0 in [,,]". You're right. But 
it's not about the problem so nobody cares. Be more polite next time.
Bei, you're right about js internals. But once again: who cares if it's not a 
solution to a problem. Please focus on the problem, not on my knows of the 
language. Holes in the arrays exists. Nulls is a workaround that brokes A LOT 
of dependent projects. You think ALL this project must fix bugs? Nice feedback 
bro. Very good. The best I think!

Think different. (c)
Or go out.

Original comment by zxq...@gmail.com on 7 Apr 2014 at 2:20

GoogleCodeExporter commented 9 years ago
Guys, can we add an option to store meta information about these elements into 
program object as well as comments and tokens do?

Original comment by zxq...@gmail.com on 12 Apr 2014 at 11:11

GoogleCodeExporter commented 9 years ago
@zxqfox I'm curious how did you gathered all these bugs? Do you have some 
crawler to detect it? Can you try to scan whether people uses element === null 
type of test for ArrayElements as well?

Original comment by ikarienator on 13 Apr 2014 at 9:10

GoogleCodeExporter commented 9 years ago
@ikaenator https://www.npmjs.org/browse/depended/esprima
I've took some random packages from here. Seen element === null just once or 
something, and several times seen if (element). All others like that: 
https://github.com/veged/xjst/blob/master/lib/xjst/utils.js#L269 - just 
node.type right from start, nobody expect it's null.

Original comment by zxq...@gmail.com on 13 Apr 2014 at 2:24

GoogleCodeExporter commented 9 years ago
@zxqfox As a start: https://github.com/Constellation/esmangle/pull/97

Original comment by ikarienator on 13 Apr 2014 at 3:30

GoogleCodeExporter commented 9 years ago
@ikaenator Yeah. It'd be nice if you will fix all the bugs according to this.

Bro, are you kidding? Instead of good ecosystem with 1 fix you prefer to make 
fixes in N projects.
No comments.

Original comment by zxq...@gmail.com on 13 Apr 2014 at 6:08

GoogleCodeExporter commented 9 years ago
I think we just can't understand the nature of the problem.
SpiderMonkey Parser API designed for parsing and executing JS-code. It wasn't 
designed for parsing any kinds of JS-entities. And it will not be designed for 
it. Esprima just copied this design and sticked to it like a limpet. When 
somebody want to know what comments exists in JS-code then somebody else added 
a hack to Esprima: program.comments . When somebody need tokens — 
program.tokens was implemented. Same for all. Bad feelings because of bad 
design.
E.g. we can add an option like emptyArrayElements to store in emptyElements 
property a bunch of objects with location meta data in ArrayElements object. Or 
something else. But it looks like a hack, not a good design at all.

IMHO we should refactor the design first of all. E.g. split up token parser and 
AST builder. We should add a lot of additional Node types like LineComment, 
BlockComment, EmptyArrayElement (or Literal with raw: '' and with no value or 
something). And then we could make a SpiderMonkey-compatible AST or any other 
needed. But it's a lot of hard work.

I've already fixed a bug in jscs and atm I don't need anything else. I wish 
we'd have a better Esprima design in the future. Or just another project.

Original comment by zxq...@gmail.com on 13 Apr 2014 at 6:26

GoogleCodeExporter commented 9 years ago
zxqfox, there is no need to be snarky. Problems like this can always be solved 
with a discussion in a very civilized tone.

Esprima is by design following the de-facto SpiderMonkey API for the reason of 
avoiding too many different parser API standard. Nobody claims that that API is 
perfect. Everyone involved in Esprima's project already realizes that (in fact, 
Michael Ficarra had a presentation about it).

However, where SpiderMonkey Parser API falls short, we try to solve it by 
proposing an enhancement. An example is issue 376 (and see the corresponding 
upstream Mozilla bug https://bugzilla.mozilla.org/show_bug.cgi?id=808230). If 
you are passionate about this, I suggest again (I already mentioned this in my 
comment 26) to file an issue at https://bugzilla.mozilla.org.

Unless there is an additional action someone wants to take, there is no need 
for a further discussion cause we will be just going round in circles.

Original comment by ariya.hi...@gmail.com on 16 Apr 2014 at 3:01