es-meta / esmeta

ECMAScript Specification (ECMA-262) Metalanguage
BSD 3-Clause "New" or "Revised" License
175 stars 12 forks source link

Implement trie for parser performance improvement #196

Closed d01c2 closed 8 months ago

d01c2 commented 9 months ago

This PR includes:

1. Trie implementation

Background While fixing some errors for updating to es2023, we found an error when we parse function f({x,}) {}. This problem occurred when parsing the first parameter {x,} with ObjectBindingPattern. Given that the spec contains both , and ,} patterns, we resolved the error by making an exception for the case of ,} as following:

  // a terminal lexer
  protected val TERMINAL: Lexer = grammar.prods.foldLeft[Parser[String]]("") {
    case (parser, Production(lhs, _, _, rhsList)) =>
      rhsList.foldLeft(parser) {
        case (parser, Rhs(_, tokens, _)) =>
          tokens.foldLeft(parser) {
            case (parser, Terminal("?.")) => parser ||| ("?." <~ not("\\d".r))
            case (parser, Terminal(",}")) => parser  // <- Here
            case (parser, Terminal(t))    => parser ||| t
            case (parser, _)              => parser
          }
      }
  }

(from esmeta/src/main/scala/esmeta/parser/ESParser.scala)

Issue During this process, we found that our logic was quite inefficient. When we construct a parser by combining the longest match from each parser, the cost of creating each parser could become a concern.

Solution Rather than using String type as a type of token, it will be much more efficient when we use Set[String] as a type of token. In this method, we can reduce the number of parser construction. To utilize this approach, we believed that it could be implemented by using trie. Since there isn't a built-in trie available, I implemented it.

d01c2 commented 9 months ago

I just added simple test cases for checking whether new parser works well. You can test by running parse_test.sh in $ESMETA_HOME.

The contents of those simple tests are following: $ESMETA_HOME/tests/sample1.js: function f({x,}) {} $ESMETA_HOME/tests/sample2.js: function f({x, }) {} $ESMETA_HOME/tests/output_1.txt: output of esmeta parse $ESMETA_HOME/tests/sample1.js $ESMETA_HOME/tests/output_2.txt: output of esmeta parse $ESMETA_HOME/tests/sample2.js

Please let me know if more diverse test cases are needed for the parser implementation.

jhnaldo commented 8 months ago

I successfully applied Trie implementation in ESParser to improve its performance and checked that it passed all the Test262 tests.