Using GraalJS for compiling and Beaver for AST to support ES6+ and all other editor features

Problem with the Graal JS:

https://github.com/graalvm/graaljs/issues/64 - AST error recovery strategy is missing (aka error-tolerant parser)

Graal JS is not an error tolerant parser - It doesn't build the AST if it found any syntax errors while parsing language. But which is very much essential to build the content proposals, outline view and formatting, etc.

Proposal: Combination of Graal JS + Beaver parser to get the benefit of both. Graal JS => For compiling so that will get the latest ECMA script support + type script Beaver => For building AST and traversing, which also support error recovery mechanism.

This is the quickest and safest way which can push the ES6+ support without compromising on any existing features such as content proposals, outline view and formatting, etc.

Going forward will continue to investigate and implement the best recovery strategies for GraalJS and remove the dependency on the Beaver parser completely.

Problem with this approach: There is might be a slight performance lag since we are compiling every file 2 times. But I couldn't notice that during our performance tests.

Beaver Parser: Scanner -> Beaver Parser -> Aptana AST ->

Steps involved in the Graal JS + Beaver: Beaver Lexical -> Beaver Parser -> Aptana AST + Graal Lexical -> Graal Parser ->

Performance Metrics: Running AllJSCoreTests which has 1064 Test Cases

Beaver + Graal JS combination took 5.317 seconds
Beaver took 5.174 seconds

Running PerformanceTests https://github.com/aptana/studio3/blob/7fa0608216dbcf80ba121b8426280aeb8ff35047/tests/com.aptana.js.core.tests/src/com/aptana/js/core/tests/PerformanceTests.java

Beaver: Runs 13/13 Time: 190.754 https://gist.github.com/kolipakakondal/21c9f34e19e9ef03c32bd5147b1fca6c

GraalJS +Beaver Runs 13/13 Time: 184.856 https://gist.github.com/kolipakakondal/454b82a5621cc6d24940c01cf3ceb850

Man, this is a huge PR, so I don't know that I can go through every single change here.

First: Why is this targeting master branch? Don't we already have the oracle graal parser landed on development branch? That'd cut down on the size of the PR and changes significantly.

Second: I don't understand the approach at a basic level. I specifically had moved to the graal parser to support ES6/7/8 syntax. Our beaver parser does not support ES6+ syntax. I tried updating our beaver parser to support the language changes, but was not able to generate a BNF that worked and was not ambiguous - nor was I able to find any other example of an unambiguous LALR(1) compatible BNF definition. (I also tried moving to other parser generators like ANTLR, which I was able to get to work, but which was way too slow).

If the input code is using ES6+ syntax, how would this work? Our beaver parser would throw syntax errors and complain. How could it possibly build an AST of ES6 input? Our existing testing suite (unit and performance) does not have any ES6 syntax in it, that's why this didn't complain. Our old parser didn't support ES6 so we had no ES6 code to test with.

The original branch/PR moving to the Graal parser included some changes to the parser source itself to get things to work as much as I could in our existing infrastructure (I cannot recall the exact changes I had to make, probably round fixing positions reported for tokens and handling comments better). I presume we'll need to continue to extend it to override the default recover implementation to allow for more flexible recovery mechanism that match closer to our old Beaver parser. It looks like they "recover" by just moving on to the next logically good statement start. Our beaver parser had a series of token insertions it would attempt to "fix" the code.

Sorry for the huge PR, I think all we need to care here is https://github.com/aptana/studio3/blob/ecef095dde51cabfd5265864ca2f5ad46c86339f/bundles/com.aptana.js.core/src/com/aptana/js/core/parsing/JSParser.java#L400-L405

And, added test cases for ES6 and fixing existing test cases https://github.com/aptana/studio3/pull/485/commits/ecef095dde51cabfd5265864ca2f5ad46c86339f https://github.com/aptana/studio3/pull/485/commits/b37bc3a66dccd12b06a921f09d58be8eddd57362

This is just an experimental proposal to quickly check how best we can use Graal with the Beaver Parser. I created on the master branch since this is having the clean beaver parser without our recent changes

I tried implementing token inserting recovery strategies which are similar to beaver parser, but it didn't work out well since both are having different parsing designs. I still need to figure it out how it has to be implemented.

Yes, you're right. If we provide ES6+ code to the beaver parser - it’s throwing syntax errors and AST(not ES6 based). But will ignore the beaver compiler errors and run the Graal to get the syntax errors so that we are getting existing functionality up front and won’t be able to see any errors for a valid ES6+ code.

The basic idea here is if recovery strategy implementation takes time, can we push this in the Nov release?

ES6 based class example:

class Animal { 
  constructor(type, sound) {
    this.type = type;
    this.sound = sound;

     //Ex: I want to get the proposals here
    Math.
  }}

Beaver AST for the same: {constructor(type, sound);{this.type = type;this.sound = sound;Math.;}}

Even though ES6 tokens are ignored we still get the AST, and that helps to get the proposals.

Fixed content assist issues by adding a simple parser recovery on graal. Hence, closing this.

aptana / studio3

Using GraalJS for compiling and Beaver for AST to support ES6+ and all other editor features #485