This PR extends happy with a recursive ascent-descent backend and thereby implements #167.
In particular, the following options are added:
--cb-lalr: use happy's LALR states to create code in well-typed, continuation-based style. This style is due to Hinze and Paterson as described in Derivation of a Typed Functional LR Parser.
--cb-rad: produce a recursive ascent-descent (RAD) parser, also using well-typed, continuation-based style. The RAD states are generated based on happy's LALR states.
--cb-rad --optims: additionally optimize the generated code for speed.
--types, --comments: annotate every function with its explicit type (where possible) or with a comment, respectively. Sometimes, explicit types are required for GHC to compile the generated parser.
In the following, I'll summarize the intention behind using recursive ascent-descent parsing, how it interacts with happy, and what advantages it brings.
TL;DR:
RAD (or GLC) parsing takes advantage of redundancies in LALR states to reduce the number of states, and the number of actions per state, significantly.
In combination with the directly-executable, continuation-based form, the parser is faster than a table-based variant.
Replacing the current happy-generated Haskell parser of the Glasgow Haskell Compiler (GHC) with a recursive ascent-descent variant speeds up GHC's parsing of Haskell files by around 10%.
On smaller grammars (e.g. JSON), recursive ascent-descent parsers beat happy-generated ones by a factor of 3 to 4.
The remainder is structured as follows:
- What is Recursive Ascent-Descent?
- Implementation
- Compatibility with existing happy-features
- Speed and size comparison
- TBD
What is Recursive Ascent-Descent?
A recursive descent parser contains one parsing routine per nonterminal. This routine decides, based on the lookahead token, which production to follow. Then, all symbols of this production's right-hand side are parsed consecutively.
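To make the "one routine per nonterminal" idea concrete, here is a minimal recursive descent sketch for a hypothetical toy grammar (not taken from this PR), with the lookahead token selecting the production:

```haskell
-- Hypothetical toy grammar:
--   Value -> 'n' | '(' Value ')'
data Token = TNum Int | TOpen | TClose deriving (Show, Eq)
data Value = Num Int | Paren Value deriving (Show, Eq)

-- One parsing routine per nonterminal; the lookahead token selects the
-- production, then its right-hand side is parsed left to right.
parseValue :: [Token] -> Maybe (Value, [Token])
parseValue (TNum n : rest) = Just (Num n, rest)      -- Value -> 'n'
parseValue (TOpen  : rest) = do                      -- Value -> '(' Value ')'
  (v, rest') <- parseValue rest
  case rest' of
    TClose : rest'' -> Just (Paren v, rest'')
    _               -> Nothing
parseValue _ = Nothing
```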
In contrast, LALR parsers can pursue multiple productions at once. An LALR state can have multiple core items, all of which describe a possible position inside a production in which the parser could currently be.
LALR parsers are most often implemented in table-based form, but there is also recursive ascent: analogous to recursive descent, a recursive ascent parser has one function per state. Shift and goto actions are executed by calling a different (or the same) state function, while reduce actions pop the function call stack after calling a semantic action.
Because of its direct, function-based nature, a recursive ascent parser is often faster than the corresponding table-based parser. This comes with a burden: the compiled code is much larger. As an example, the table-based happy-generated LALR parser of GHC is 2.4 MB. A recursive ascent form is a little faster, but 33 MB in size.
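The "one function per state" idea can be sketched on a hypothetical toy grammar. This is heavily simplified: real recursive ascent must also track how many stack frames each reduce pops, which is omitted here because every reduce happens to pop exactly one.

```haskell
-- Hypothetical toy grammar:  S -> 'a' S | 'b'
-- One function per LR state; a shift calls another state function,
-- a reduce returns (pops the call stack).
data S = A S | B deriving (Show, Eq)

-- state0: items  S -> . 'a' S   and   S -> . 'b'
state0 :: String -> Maybe (S, String)
state0 ('a':ts) = state1 ts        -- shift 'a', enter the 'a'-successor state
state0 ('b':ts) = Just (B, ts)     -- shift 'b', then reduce S -> 'b'
state0 _        = Nothing

-- state1: item  S -> 'a' . S  (its closure re-adds the items of state0)
state1 :: String -> Maybe (S, String)
state1 ts = do
  (s, ts') <- state0 ts            -- parse the nested S (the goto on S)
  Just (A s, ts')                  -- reduce S -> 'a' S
```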
Recursive ascent-descent improves on this: by merging and reusing the similarities of multiple LALR states, the remaining states can be made smaller or partly unnecessary. This works by splitting each production at its recognition point: up to the recognition point, parsing proceeds bottom-up using the RAD states, similar to LALR parsing. Once the recognition point of a rule is reached, it can be decided – based on the lookahead token – whether the rule is active or not. (Remember: in an SLL grammar, this can be decided at the very beginning of each rule.) After a rule has been determined to be active, its remaining symbols are parsed consecutively, in a top-down fashion.
Bottom-up is only used while required, and the switch to top-down happens as early as possible.
Consider a nonterminal Value which appears in the right-hand side of multiple rules. Every time it appears after the recognition point, this Value parse proceeds identically. In an LALR parser, each of these parsing situations would require separate state sequences, whereas a RAD parser extracts them into a single procedure – parseValue. This procedure begins by entering the entry state of Value. Then, bottom-up parsing proceeds until the Value has been fully parsed and control is passed back to the calling rule procedure.
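The shape of such a shared rule procedure can be sketched in continuation-based style. All names here (`Tok`, `parseValue`, `parsePair`) are invented for illustration and are not the PR's generated code:

```haskell
-- parseValue is the single, shared entry point for the nonterminal Value.
-- Every caller passes a continuation that receives the semantic value and
-- the remaining tokens once the Value has been fully parsed.
data Tok = TInt Int | TNull deriving (Show, Eq)
data Value = VInt Int | VNull deriving (Show, Eq)

parseValue :: [Tok] -> (Value -> [Tok] -> Maybe r) -> Maybe r
parseValue (TInt n : ts) k = k (VInt n) ts
parseValue (TNull  : ts) k = k VNull ts
parseValue _             _ = Nothing

-- A calling rule procedure: past its recognition point, it parses the
-- remaining symbols top-down by simply calling parseValue twice.
parsePair :: [Tok] -> Maybe ((Value, Value), [Tok])
parsePair ts0 =
  parseValue ts0 $ \v1 ts1 ->
  parseValue ts1 $ \v2 ts2 ->
  Just ((v1, v2), ts2)
```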
What is this good for? Because of the great savings in states and actions, a recursive ascent-descent parser is much smaller than the corresponding recursive ascent parser. Take another look at GHC's Haskell parser:
|                  | table-based (happy-generated) | recursive ascent (LALR) | recursive ascent-descent (RAD) |
|------------------|-------------------------------|-------------------------|--------------------------------|
| Size of Parser.o | 2.4 MB                        | 33 MB                   | 7.6 MB                         |
The RAD parser is, of course, still larger than the status quo (the table-based, happy-generated parser), but only by a factor of about 3 instead of 14. This goes along with a speedup of around 10%, which stems from the continuation-based form we chose for code generation.
For more detailed explanations of what RAD is, how it works, how RAD states are generated from LALR states, how recognition points are found, how continuation-based code is generated and used in conjunction with happy, and for performance comparisons with LALR parsers, I refer to my bachelor's thesis: A Typed Recursive Ascent-Descent Backend for Happy. Recursive ascent-descent parsing was originally described by Horspool and builds upon Generalized Left Corner (GLC) parsing.
Implementation
We briefly describe how the new options are implemented.
--cb-lalr: uses happy's LALR states to create code in well-typed, continuation-based style. This code is often faster than code generated by happy, but much larger.
happy is executed normally, until conflict reporting and info file creation.
Then, we branch off into RADCodeGen_LALR.hs to generate the code.
--cb-rad: creates a RAD parser in well-typed, continuation-based style. This code is often faster than code generated by happy, and not much larger.
happy is executed normally, until conflict reporting and info file creation.
happy's LALR states are used to determine the recognition points of all productions.
RAD states are created based on happy's LALR states (in RADStateGen.hs). This process respects features like happy's conflict resolution and the error token. It is described in detail here.
Finally, code is created similar as in the --cb-lalr case (RADCodeGen.hs). Code generation is described in detail here.
--cb-rad --optims: we apply additional optimizations to further speed up the generated parser. Currently, these include rule function inlining and eta-expansion of continuation functions.
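Both optimizations can be illustrated on toy functions (these are illustrative stand-ins, not the PR's generated code):

```haskell
-- 1) Rule function inlining: small rule functions are marked INLINE so
--    GHC copies their bodies to each call site.
{-# INLINE ruleF #-}
ruleF :: Int -> Int
ruleF x = x + 1

-- 2) Eta-expansion of continuation functions: giving a continuation all
--    of its arguments explicitly lets GHC emit a direct call instead of
--    building and entering a partial application.
contCurried :: Int -> Int -> Int
contCurried = \x -> (x +)      -- not eta-expanded

contExpanded :: Int -> Int -> Int
contExpanded x y = x + y       -- eta-expanded; semantically identical
```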
Compatibility with existing happy-features
Our backend supports most of happy's features and directives. It supports all options that are used in GHC's Haskell grammar and is therefore powerful enough to produce a Haskell parser.
In particular, because happy's LALR states (i.e. action and goto tables) are used for RAD state generation, conflict resolution happens just as before, and directives like %left or %expect work as desired.
Monadic parsers and lexers are supported – the generated code is adapted to the monadic style. Partial parsers are also supported, as is the error token.
Features that are not (yet) supported are:
Using %monad without %lexer, or vice versa. It doesn't seem anyone is doing this, so we did not prioritize this case.
{%^ ... } and {%% ... } semantic actions. If these are desired, they could be implemented. Of course, normal monadic actions ({% ... }) are supported.
Attribute grammars.
GLR parsing. This is, of course, incompatible with RAD parsing.
happy-specific options like -a, -c, and -g have no equivalent in continuation-based code. However, we are currently working on a --strict implementation.
Speed and Size Comparison
The detailed results can be seen here. We just briefly look at two results now: small grammars, and GHC.
First we look at two small grammars – an expression grammar and a JSON grammar. We compare the fastest happy-generated parser (which could be plain happy, happy -acg, or happy -acg --strict) with our --cb-rad and --cb-rad --optims parsers. The following table shows the average plain parsing times on a large input file, without lexing.
| Grammar    | fastest happy | --cb-rad | --cb-rad --optims |
|------------|---------------|----------|-------------------|
| Expression | 90.0 ms       | 35.7 ms  | 28.0 ms           |
| JSON       | 61.1 ms       | 16.1 ms  | 15.3 ms           |
Our parser beats the happy-generated ones by a factor of 3-4.
Now we consider GHC. We built GHC 8.10.1 with its normal happy-generated parser, and with a Parser.hs in RAD form. Then we used -ddump-timings to obtain the parsing times when parsing Haskell files. We parsed four large files; for each, we compared both the stage-1 and the stage-2 parser.
| File | Stage | status quo (happy -acg --strict) | --cb-rad --optims | Speedup |
|------|-------|----------------------------------|-------------------|---------|
| 1    | 1     | 3780 ms                          | 3051 ms           | 19.4%   |
| 1    | 2     | 3459 ms                          | 3162 ms           | 8.6%    |
| 2    | 1     | 60.1 ms                          | 55.5 ms           | 7.7%    |
| 2    | 2     | 47.0 ms                          | 45.8 ms           | 2.5%    |
| 3    | 1     | 432 ms                           | 387 ms            | 11.4%   |
| 3    | 2     | 404 ms                           | 382 ms            | 5.5%    |
| 4    | 1     | 784 ms                           | 680 ms            | 13.3%   |
| 4    | 2     | 748 ms                           | 713 ms            | 4.9%    |
We expect even higher speedups from further optimizations. In particular, there are some performance regressions from stage 1 to stage 2 with our RAD parsers (files 1 and 4) that do not occur for the happy-generated parsers; eliminating these regressions would yield higher speedups.
Finally, here are the Parser.o sizes again:

|                  | status quo | --cb-lalr | --cb-rad --optims |
|------------------|------------|-----------|-------------------|
| Size of Parser.o | 2.4 MB     | 33 MB     | 7.6 MB            |
TBD
Before or after merging this PR, the following still needs to be done:
[ ] Update README.
[ ] Update documentation; maybe explain RAD parsing (and continuation-based parsing) in its own chapter.
[ ] Add examples and test cases.
[ ] Generate a RAD-specific info file on -i. Currently, no RAD-specific information is given to the user (except when they use the --comments flag and look at the generated code).
[ ] Further size and performance optimizations, for example:
[ ] Implement --strict flag
[ ] Use short-cut fusion to optimize away repeatTok/lexerWrapper call pairs
[ ] Fix performance regressions from stage-1 to stage-2 GHC parsers.
[ ] Try reducing the compilation time of generated parsers. They can currently take a long time to compile.