pingcap / parser

A MySQL Compatible SQL Parser
Apache License 2.0

Is it possible to introduce ANTLR? #998

Open LENSHOOD opened 4 years ago

LENSHOOD commented 4 years ago

Feature Request

Recently, while running TPC-C tests and profiling with go-tpc and go-pprof, I found that the Parser component accounts for significant CPU usage and heap space.

The flame graph shows that the parser.yyParse function accounted for the most CPU usage during the tpcc prepare stage (many INSERT operations).

Optimizing parser.yyParse is probably not easy, since the whole parser.go is generated by go-yacc.

Therefore, I'm wondering whether it would be possible to introduce ANTLR as a replacement for go-yacc.

Here are some advantages of ANTLR:

  1. According to the paper "Adaptive LL(*) Parsing: The Power of Dynamic Analysis" by Terence Parr, the author of ANTLR, ANTLR4's core algorithm ALL(*) performs better than LR(1).
  2. As of version 4.6, ANTLR provides a Go runtime, which makes it more compatible with the current Parser code base.
  3. Compared with go-yacc, ANTLR has a more active community and a clearer plan for new feature development.

Disadvantages:

  1. Hard to migrate to, because the grammar file format is quite different.
  2. No paper has yet shown that ANTLR performs better than go-yacc.

kennytm commented 4 years ago

If we wanted to switch parser generators away from go-yacc (LALR(1)), we should also consider PEG and GLR/GLL, and in the end a benchmark should dictate which parser generator is used. Perhaps, as a first step, we could make the parser backend switchable to simplify experimentation.
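
A minimal sketch of what a switchable backend plus a comparison run could look like, written in Go. Everything here is hypothetical and not taken from the pingcap/parser code base: the `Backend` interface, the `Register`/`Select`/`Compare` helpers, and the `splitBackend` stub (which only counts semicolon-separated statements and is not a real SQL parser). A real experiment would wrap the go-yacc parser and each candidate behind the same interface and feed them an identical workload.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// Backend is a hypothetical abstraction over a generated parser.
// A go-yacc-, ANTLR-, or PEG-based implementation would each satisfy it.
type Backend interface {
	Name() string
	// Parse returns the number of statements recognized, standing in for an AST.
	Parse(sql string) (int, error)
}

// registry maps a backend name to its implementation.
var registry = map[string]Backend{}

// Register makes a backend selectable by name.
func Register(b Backend) { registry[b.Name()] = b }

// Select looks up a backend by name, e.g. from a config value or flag.
func Select(name string) (Backend, error) {
	b, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown parser backend %q", name)
	}
	return b, nil
}

// splitBackend is a toy stand-in, NOT a real SQL parser: it only counts
// semicolon-separated, non-empty statements.
type splitBackend struct{ name string }

func (s splitBackend) Name() string { return s.name }

func (s splitBackend) Parse(sql string) (int, error) {
	n := 0
	for _, stmt := range strings.Split(sql, ";") {
		if strings.TrimSpace(stmt) != "" {
			n++
		}
	}
	if n == 0 {
		return 0, fmt.Errorf("no statements in input")
	}
	return n, nil
}

// Compare runs every registered backend over the same workload and
// reports the elapsed wall-clock time per backend.
func Compare(workload []string, rounds int) (map[string]time.Duration, error) {
	results := make(map[string]time.Duration, len(registry))
	for name, b := range registry {
		start := time.Now()
		for i := 0; i < rounds; i++ {
			for _, sql := range workload {
				if _, err := b.Parse(sql); err != nil {
					return nil, fmt.Errorf("%s: %w", name, err)
				}
			}
		}
		results[name] = time.Since(start)
	}
	return results, nil
}

func main() {
	// Both entries use the same stub; real backends would differ here.
	Register(splitBackend{name: "yacc-stub"})
	Register(splitBackend{name: "antlr-stub"})

	workload := []string{"INSERT INTO t VALUES (1); INSERT INTO t VALUES (2);"}
	results, err := Compare(workload, 1000)
	if err != nil {
		panic(err)
	}
	names := make([]string, 0, len(results))
	for name := range results {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%-12s %v\n", name, results[name])
	}
}
```

Wall-clock timing is only a rough signal; for a real decision, Go's `testing` benchmarks with allocation reporting and a workload resembling the tpcc prepare stage would likely be more informative.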

hhb commented 3 years ago

Found a Bison to ANTLR converter written in C#. Maybe useful? https://github.com/kaby76/AntlrExamples/tree/master/Bison

The implementation is not complicated. Maybe it can be extended to generate a listener class to output AST? Its license is unclear though.

zjcxc commented 1 year ago

MySQL Workbench uses ANTLR.