Open mewmew opened 7 years ago
For grammars containing many production rules the shift/reduce tables tend to blow up in size. As an example, the LLVM IR grammar contains about 750 production rules (the full language may contain approximately 1000 production rules).
The generated action and goto tables of the parser are huge if uncompressed (i.e. not using -zip
).
u@x220 ~/D/g/s/g/l/l/a/i/parser> du -hs actiontable.go gototable.go
91M actiontable.go
57M gototable.go
u@x220 ~/D/g/s/g/l/l/a/i/parser> wc -l actiontable.go gototable.go
3108914 actiontable.go
2860201 gototable.go
5969115 total
The source code of these tables is almost 150 MB in total and contains almost 6 million lines of code.
For much larger grammars than this, using uncompressed tables starts to be infeasible.
https://github.com/goccmack/gocc/issues/28 tried to mitigate some of these issues by compressing the tables. While this works great, both for compilation speed and file size, some issues remain. First of, the runtime memory use of the uncompressed tables is still large. And secondly, it is not possible to track code coverage of production rules, when these tables are compressed, as a test case giving coverage for a single production rule, will give code coverage to all production rules (since the tables occupy a single line in the source code).
As motivated by Marius in https://github.com/goccmack/gocc/issues/13#issuecomment-181827331