binast / binjs-ref

Reference implementation for the JavaScript Binary AST format
https://binast.github.io/binjs-ref/binjs/index.html
Other
431 stars 38 forks source link

async/await is not part of ES6 #420

Open RReverser opened 5 years ago

RReverser commented 5 years ago

I've noticed that there is isAsync and AwaitExpression in es6.webidl, but they weren't actually part of ES6 and, correspondingly, also not supported by currently included version of the Shift parser.

They should be moved out to a separate IDL file or the file should be renamed not to include a version.

arai-a commented 5 years ago

iiuc, "es6" there actually doesn't mean ES6, but ES6+. but yeah, having separate extension IDL for each version might help future update.

Yoric commented 5 years ago

The toolchain should mostly support several IDLs already.

Before we use it, we'll need to solve a few issues, though:

  1. Order inside enums and sum types. Since context encoding depends on such an order, we want language extensions such as AwaitExpression to add stuff at the end of the file. Not a big deal, but needs to be done.
  2. We'd need to fork all the nodes affected by this issue into an es6 version without isAsync (e.g. FunctionExpression) and a post-es6 version with isAsync (e.g. FunctionWithAsyncExpression) and introduce the *WithAsync values into the relevant sum types. Actually, we could entirely get rid of isAsync and replace this with FunctionExpression/AsyncFunctionExpression, MethodExpression/AsyncMethodExpression...

Does anyone wish to work on this? Otherwise, I can take a look once we're more advanced with our current context-0.1 sprint.

arai-a commented 5 years ago

talked with Yoric, and here's summary.

regarding IDL versions (and format versions), there are several things to consider.

here's the example based on async function' case:

How to extend AST (IDL)

1. just add field to existing interface

define an extension in IDL that adds field to existing interface. the resulting interface will have new field.

es6.webidl

interface EagerFunctionExpression : Node {
  attribute boolean isGenerator;
  attribute BindingIdentifier? name;
  attribute unsigned long length;
  attribute FrozenArray<Directive> directives;
  attribute FunctionExpressionContents contents;
};
interface LazyFunctionExpression : Node {
  attribute boolean isGenerator;
  attribute BindingIdentifier? name;
  attribute unsigned long length;
  attribute FrozenArray<Directive> directives;
  [Lazy] attribute FunctionExpressionContents contents;
};

typedef (EagerFunctionExpression or
         LazyFunctionExpression) FunctionExpression;

es7.webidl

[ExtendsInterfaceField=EagerFunctionExpression]
interface EagerFunctionExpression : Node {
  attribute boolean isAsync;
};
[ExtendsInterfaceField=LazyFunctionExpression]
interface LazyFunctionExpression : Node {
  attribute boolean isAsync;
};

Pros

Cons

2. add yet another interface for new type

add interfaces that corresponds to the language extension. if that can be expressed with bool, no field will be added. (EagerFunctionExpression for sync function and EagerAsyncFunctionExpression for async function)

es7.webidl

[ExtendsTypeSum=FunctionExpression]
interface EagerAsyncFunctionExpression : EagerFunctionExpression {
};

[ExtendsTypeSum=FunctionExpression]
interface LazyAsyncFunctionExpression : LazyFunctionExpression {
};

the result of FunctionExpression sum will be:

typedef (EagerFunctionExpression or
         LazyFunctionExpression or
         EagerAsyncFunctionExpression or
         LazyAsyncFunctionExpression or) FunctionExpression;

Pros

Cons

3. add yet another interface with flag

add interfaces that has flag, that corresponds to the language extension. existing interface still exist.

es7.webidl

[ExtendsTypeSum=FunctionExpression]
interface EagerFunctionExpressionES7 : EagerFunctionExpression {
  attribute boolean isAsync;
};

[ExtendsTypeSum=FunctionExpression]
interface LazyFunctionExpressionES7 : LazyFunctionExpression {
  attribute boolean isAsync;
};

the result of FunctionExpression sum:

typedef (EagerFunctionExpression or
         LazyFunctionExpression or
         EagerFunctionExpressionES7 or
         LazyFunctionExpressionES7 or) FunctionExpression;

Pros

Cons

How to extend format

in any case above, "context" format's model table becomes incompatible with older versions. this issue happens regardless of whether the encoded JS code itself uses the new syntax or not.

so, things to consider:

What implementations should support

  1. should implementation support older IDL and format?
    • if server returns older version, what should the implementation do?
    • should implementation request with detailed version number for supported version? (Accept header)
  2. should implementation support unknown newer IDL and format?
    • should new IDL/format somehow backward-compatible?
    • if server returns newer version that implementation doesn't know, what should it do?
    • should define error semantics about unknown things
      • if the format uses newer IDL but the code doesn't use it, should it run wihtout error?
      • in plain JS case, SyntaxError happens only when new unknown things is actually used in the code
  3. should encoder support older IDL and format?
    • if the target JS code doesn't use newer things (like, if it's plain ES5), should encoder use older IDL?
Yoric commented 5 years ago

My take on this is that 3. add yet another interface with flag is the closest to the current behavior of JS, so we should strive for this. This assumes that the codec has a way to encode this without size/performance loss. I feel it's feasible to evolve context-0.1 in this direction.

Note that we can actually have es7.webidl

[ExtendsTypeSum=FunctionExpression]
interface EagerFunctionExpressionES7 : EagerFunctionExpression {
  attribute boolean isAsync = false;
};

[ExtendsTypeSum=FunctionExpression]
interface LazyFunctionExpressionES7 : LazyFunctionExpression {
  attribute boolean isAsync = false;
};

The = false ensures that we can automatically convert {LazyFunctionExpression, EagerFunctionExpression} to {LazyFunctionExpressionES7, EagerFunctionExpressionES7}. In practice, this means that we only write a parser for {LazyFunctionExpressionES7, EagerFunctionExpressionES7} and the auto-generated parser will (somehow) reuse these methods to generate the {LazyFunctionExpression, EagerFunctionExpression} methods.

As a compression side-note, this looks very much like a simple case of TreeRePair. Maybe looking at TreeRePair will give us another way to look at the problem.

Now, on your detailed questions, here's my take.

should implementation support older IDL and format?

Definitely. Firefox 124 cannot afford to reject files designed for Firefox 123 :)

if server returns older version, what should the implementation do?

We need to find a way to make it read the older version without over-complicating the implementation.

should implementation request with detailed version number for supported version? (Accept header)

As this is not what happens with JS nowadays, I would like to avoid this.

should implementation support unknown newer IDL and format?

I'm not sure what you mean by "newer format".

For newer IDL, I'd like to support it whenever possible. Just because the IDL and the encoder have changed, we do not want to stop supporting older browsers if we're only using older features.

   should new IDL/format somehow backward-compatible?

I'd like to.

   if server returns newer version that implementation doesn't know, what should it do?

should encoder support older IDL and format? if the target JS code doesn't use newer things (like, if it's plain ES5), should encoder use older IDL?

Good point. I think it should try.

dominiccooney commented 5 years ago

The format prototype in fbssdc punted on grammar versioning/evolution, but that should be fixed. I think we should wait until we have data from serving this format.

Specifically, if file sizes are still a problem, I think it's likely the next iteration of the file format will want to have multiple models per field, and start to exploit some of the regularities between different kinds of fields which are barely exploited today. A lot of nice solutions for evolution available with tweaks to the simple format we have today would be invalidated by these constraints.

It's still worth thinking about the shape of the problem now though.

We need to think specifically about the kinds of changes we might encounter. I think those are:

We should also think about goals. For example:

Achieving all of these will be hard but we can at least understand the trade-offs we are making.

Yoric commented 5 years ago

@dominiccooney I'll answer your message more completely when I have more time, just a few quick points.

I think we should wait until we have data from serving this format.

As far as I understand, Cloudflare and Facebook have different priorities here. Cloudflare prioritizes grammar compatibility.

Specifically, if file sizes are still a problem, I think it's likely the next iteration of the file format will want to have multiple models per field, and start to exploit some of the regularities between different kinds of fields which are barely exploited today.

We still need to run tests, but I don't think that file sizes are a problem at the moment, as long as we have proper support for codec dispatch. In particular, as soon as we actively work on streaming, we'll be faster pretty much regardless of the size of the file.