This borrows a really cunning idea I remember hearing at Terence Parr's ALL(*) talk. He said that the basic idea is "when an LL parser gets to a point of ambiguity, what happens if it magically knew which possibility to take?" As I remember it, ANTLR fires up sub-parsers that try all the possibilities; those are filtered down to a single possibility, which the LL parser then takes. Thus the "main" parser doesn't have to have the slow stuff that deals with ambiguity (which is devolved to somewhat rarely run sub-parsers).
What this commit does is to turn off parse tree creation while it's trying to find valid repairs. Once a valid repair has been selected, the repair actions are replayed with parse tree creation turned on. Since we try tens of thousands of repair candidates, avoiding the memory overhead of parse tree creation is significant. With files that require complex repairs this commit is a 10x saving in time (and, I think, an even greater saving in memory, though it's hard to measure accurately).
This borrows a really cunning idea I remember hearing at Terence Parr's ALL(*) talk. He said that the basic idea is "when an LL parser gets to a point of ambiguity, what happens if it magically knew which possibility to take?" As I remember it, ANTLR fires up sub-parsers that try all the possibilities; those are filtered down to a single possibility, which the LL parser then takes. Thus the "main" parser doesn't have to have the slow stuff that deals with ambiguity (which is devolved to somewhat rarely run sub-parsers).
What this commit does is to turn off parse tree creation while it's trying to find valid repairs. Once a valid repair has been selected, the repair actions are replayed with parse tree creation turned on. Since we try tens of thousands of repair candidates, avoiding the memory overhead of parse tree creation is significant. With files that require complex repairs this commit is a 10x saving in time (and, I think, an even greater saving in memory, though it's hard to measure accurately).