It takes a bit too long to parse a large code base. Profiling has shown the bottleneck is in the generated parser code (as opposed to the preprocessor/scanner).
Need to do one of the following (in order of increasing effort):
write a cleverer grammar for the problematic rules
rewrite in Go or C++ (not that difficult)
concurrent parsing
The last option (concurrent parsing) would be doable, but would require some thought on how to avoid compilation-order issues.
Another solution would be to accept low-ish performance on the first parse, cache the results, and subsequently re-parse only the files that have changed. This would still require re-parsing any files that tick-include the changed files, however.
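A rough sketch of how the cache invalidation could work, assuming the include relationships are available as a simple mapping. The names `content_hash`, `files_to_reparse`, and the file names in the usage note are made up for illustration and are not part of the existing code. Changed files are detected by comparing content hashes, and everything that includes them, directly or indirectly, is marked dirty as well.

```python
import hashlib
from collections import defaultdict, deque


def content_hash(text):
    """Hash of a file's contents, used to detect changes since the last parse."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def files_to_reparse(sources, includes, cached_hashes):
    """Return the set of files that must be parsed again.

    sources:       path -> current file text
    includes:      path -> set of paths that file tick-includes (assumed known)
    cached_hashes: path -> content hash recorded at the previous parse
    """
    # Invert the include graph: included file -> files that include it.
    included_by = defaultdict(set)
    for path, deps in includes.items():
        for dep in deps:
            included_by[dep].add(path)

    # A file is changed if it is new or its hash differs from the cached one.
    changed = {
        path
        for path, text in sources.items()
        if cached_hashes.get(path) != content_hash(text)
    }

    # Walk outwards so indirect includers are picked up too.
    dirty = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in included_by[current]:
            if parent not in dirty:
                dirty.add(parent)
                queue.append(parent)
    return dirty
```

For example, if `top.v` includes `defs.vh` and only `defs.vh` was edited, the result contains both files, while an unrelated `other.v` is served from the cache.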
Turns out the only thing that should be necessary is to fix the grammar, which has a lot of ambiguities. Once that is done I can also try the two-pass strategy mentioned in the ANTLR book.
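For reference, the shape of a two-pass parse, assuming it means the usual approach of trying a fast, restricted pass first and falling back to the full parser only when that pass gives up (in ANTLR 4 this is typically SLL prediction with a bail-out error strategy, then full LL). The sketch below is deliberately independent of the ANTLR runtime: `FastPassFailed`, `parse_two_pass`, and the two callables are hypothetical stand-ins, not real ANTLR names.

```python
class FastPassFailed(Exception):
    """Raised by the fast pass when it cannot handle the input (hypothetical)."""


def parse_two_pass(text, fast_parse, full_parse):
    """Try the cheap pass first; rerun with the full parser only on failure.

    fast_parse and full_parse are assumed to be callables taking the source
    text and returning a parse tree. Returns (tree, which_pass_was_used).
    """
    try:
        return fast_parse(text), "fast"
    except FastPassFailed:
        # The input needs the slower, more general strategy.
        return full_parse(text), "full"
```

The payoff depends on most files succeeding in the fast pass, which is why it only makes sense after the grammar ambiguities are fixed.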