power-assert-js / power-assert-formatter

Power Assert output formatter
MIT License
6 stars 3 forks source link

support async/await #20

Closed jamestalmage closed 8 years ago

jamestalmage commented 8 years ago

switches parser back to acorn.

currently depends on patched versions of espower and acorn-es7-plugin.

I will update and squash commit as those PR's are accepted.

Ref: https://github.com/MatAtBread/acorn-es7-plugin/issues/2 Ref: https://github.com/power-assert-js/espower/pull/21

jamestalmage commented 8 years ago

@MatAtBread Could you comment on how awaitAnywhere works with regards to my statements above?

power-assert works by parsing individual lines of code at runtime (the line where the assertion failed), and using the AST to do some fancy output.

Currently, there is no way for us to tell if we are within an Async function or Generator or neither. This makes parsing Contextual keywords (like await) difficult. Any help you can give on the topic would be appreciated.

MatAtBread commented 8 years ago

The awaitAnywhere flag works by attempting to re-parse the the source as an expression (it does this by recusrively parsing at the 'start-4', making 'await' look like 'wait'). If it successfully parses, the plugin just assumes it's a valid identifier, so code like await+10 (parsed as wait+10) is ES6. Specifically, code like await someOtherExpression is not valid ES5/6, so it's assumed to be an AwaitExpression.

The only ambiguity I've found is the case you bring up: await ( expression ) correctly parses as a CallExpression in ES5/6, but could be a parenthesized AwaitExpression outside of an async function. The plugin currently plays safe, assuming it's a CallExpression. Given the precedence of await (below . [], etc), this seems fine, except when the following node is a IIFE - i.e. the sequences

await (async function() { ... }())
await ((async ()=>expr)())

...can never be (correctly?) parsed as an AwaitExpression (except, of course, when they are inside an async function).

The only other possible solution to this I can see is to check the scope for an identifier called await and use it's presence to indicate which route to take. However, as well as being non-obvious, this makes the outcome variable on whether you're parsing a fragment or a complete Program, which is probably a bad idea.

The workaround is to use a temporary to 'force' await to be an AwaitExpression:

let fn = async function() { ... } ;
await fn();  // Definitely not a call to a function called 'await' 

It's not ideal, but it's the route taken in the plugin currently.

Thanks for all PR, BTW, I was a bit sloppy in source mapping, and the Property.async is a definitely a bug. I'll try to review and merge in the next few days - your updates crossed with some significant changes I made to accommodate babel 6, so it'll have to be a manual merge (unless you want to pull and re-fix :) )

twada commented 8 years ago

@jamestalmage @MatAtBread Thank you for your explanation. Now I understand the corner case.

@jamestalmage ping espower 1.2.0 is out.

@MatAtBread nice to see the PR being merged!

jamestalmage commented 8 years ago

@MatAtBread

Below is a little background on the project so you can understand the goal of the question that follows.

So the power-assert process actually happens in two stages. Code instrumentation and runtime.

1. Code Instrumentation

During code instrumentation, we pass the entire program to the parser. The problem we are discussing will not happen here (because the parser can know we are inside an async function). At this stage we take the original code:

  assert(fuga === piyo);

and turns it into this

assert(
  assert._expr(
    assert._capt(     // capture the result of the === comparison
      assert._capt(   // capture the left side
        fuga,
        'arguments/0/left'  // the node-path to the captured left side
      )
      ===
      assert._capt(   // caputure the right side
        piyo,
        'arguments/0/right'  // the node-path to the captured right side
      ),
      'arguments/0' // the node-path to the captured === comparison
    ),
    {
      content:'assert(fuga === piyo)',  // Store the original text (this is what we use runtime)
      filepath:'path/to/some_test.js',
      line:1
    }
  )
);

2. Runtime

At runtime the assert._expr above will run the assertion, and if it fails use the "captured" results and node-paths to produce a fancy output.

Note the content line above, all we have to send to the acorn parser is the single assertion expression:

      content:'assert(fuga === piyo)',  // Store the original text (this is what we use runtime)

With the introduction of the await and yield as contextual keywords, things get more complicated for us. Until now the project has been able to ignore the surrounding context, but to get these right - I think we may now need to change that.

My Question

If we were to embed some additional context information (i.e. isInsideAsyncFunction) into the instrumented code, like so:

    {
      content:'assert(fuga === await (piyo) )',  // Store the original text (this is what we use runtime)
      isInsideAsyncFunction: true,
      filepath:'path/to/some_test.js',
      line:1
    }

Is there a way to set the starting state on acorn such that your parser would know it was within an async function?

If not, would you be open to adding a config option alwaysAsync (etc). That we could use to signal that?

MatAtBread commented 8 years ago

The 'fudge' way of doing is to take your code, and jsut before parsing it, wrap it in the source text async function fudge(){ ...}, parse it take your tree as ast.body.body[0], where ast is the tree returned by acorn. It's similar to the way Node wraps your module in an IIFE - you don't see it, but it's there.

I don't mind adding the flag, however it means you won't be able to move parsers easily, and the contextual bit is important - there's a very popular module called 'async' and it turns up as an identifier in many, many source files, as does await, so assuming you're always in an async (or even a generator) is a good way to break a lot of source. I know this because nodent did the same thing for a long time!

Note that you can 'flip out' of it (I think) with code like

var async = 123 ; // Just a var identifier
async function xxx() {
    var await = 456 ; // Illegal - await is a keyword here
    function abc() {
        return async+789 ; // Legal - I think - as you're not in an async function anymore
    }
}
jamestalmage commented 8 years ago

We are able to tell if we are inside an async function or not during instrumentation, so it's just a matter of embedding that information for our runtime to use.

I was thinking we could use that embedded information to pass as an option to your parser, but you make a good point w/regard to portability. We actually are already doing the wrapper function, but it's in a try / catch block (we try to parse without the wrapper, then with). That is problematic since, as discussed here, there are certain statements containing await that both parse validly, but with different semantics depending on context.

Rather than an option to your parser, I think it's better that we continue to wrap, but we will need to do so in a contextual way (based on values we set during instrumentation).

@twada Is what I'm saying making sense to you?

MatAtBread commented 8 years ago

OK. I think I get it now. You want to be able to save and restore the context of a node so you can re-parse the source.

Introducing flags for that isn't too hard, although I can't help thinking that it might be better to store the node (since it's only JSON) in your structure and recreate the source from that.

jamestalmage commented 8 years ago

although I can't help thinking that it might be better to store the node (since it's only JSON) in your structure and recreate the source from that.

Because the location data in the existing AST would be all off. This is our output:

  assert(bazz.ary.indexOf(zero) === two)
              |   |       |     |   |
              |   |       |     |   2
              |   -1      0     false
              [1,2,3]

The user may have had that deeply nested that assert in callbacks, and included newlines and unnecessary parens that don't reproduce when printing the AST

describe('foo', () =>
  describe('bar', () =>
    it('baz', () => assert(
       bazz.ary.indexOf(zero) 
       === 
       (two)
    )
  )
)

The content we store is not actually the user source, but a reprint generated from the AST via escodegen.print(ast)

@twada He does bring up an interesting point though. Why don't we reparse the individual line during instrumentation and save the entire AST alongside content?

twada commented 8 years ago

@jamestalmage

Is what I'm saying making sense to you?

Your explanation is perfect.

He does bring up an interesting point though. Why don't we reparse the individual line during instrumentation and save the entire AST alongside content?

Yes. Sounds attractive now. I've embedded AST substree into context once, then reverted it by using parsers at runtime.

This strategy (parsing at runtime) works well in ES5 era, but now we found its limitations in ES6+ era.

I know you are going to disambiguate between function calls and async/yield expressions in code Instrumentation stage. It's fine in short term. We can solve ambiguous async/await problem by introducing flags.

On the other hand, your suggestion Saving the AST subtree into content strategy, has a huge advantage for us. When we have AST subtree, we don't need any parsers at runtime. This solves the hard problem, selecting parsers (esprima, espree, acorn, babylon) to be embedded into power-assert runtime. The last problem for us is, without runtime parsers, we have to adjust node and token locations by ourselves to get fancy output, maybe using some offset processing.

So, let's solve current problems by introducing flags, then I'm willing to solve harder problems with huge restructuring.

jamestalmage commented 8 years ago

@twada Just need empower release cut and I can finish this up.

twada commented 8 years ago

@jamestalmage Just released empower 1.1.0.

jamestalmage commented 8 years ago

@twada ready for review and release

twada commented 8 years ago

Just released as power-assert-formatter 1.3.0 :tada:

MatAtBread commented 8 years ago

Hey - not sure if you need/want to upgrade acorn. They introduced a breaking change in 2.6.x that prevented the plugin from working (specifically, disallowing 'await' and 'async' before the plugin could handle them). I've updated the acorn-es7-plugin (1.0.10) that handles this. It also works fine with the earlier acorn parser it used (2.5.2)

jamestalmage commented 8 years ago

Thanks @MatAtBread

@twada - let me know if you run into trouble bumping dependencies and need any help tracking down issues.

twada commented 8 years ago

@MatAtBread Thank you for your comment. Helps me a lot.

let me know if you run into trouble bumping dependencies and need any help tracking down issues.

@jamestalmage Thanks. I should have reported it to @MatAtBread before pinning acorn. So sorry for that.