Speed up parsing

FuegoFro commented 5 years ago

This makes a few improvements that make parsing large projects significantly faster. On my machine, parsing an example project (about 10 MB) goes from 5.9 seconds (fastest of 10 runs) to 1.8 seconds, roughly a 3x speedup. For reference, I estimated a lower bound on how fast we could reasonably parse a project of this size by looping over each character, checking whether it is "a", and incrementing a counter. That baseline took 0.9 seconds, so we're now much closer to it.
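For anyone who wants to reproduce that kind of baseline, here is a minimal sketch of the measurement described above. The names `count_a` and `fastest_of` and the sample input are made up for illustration and are not part of openstep-parser; the timings will differ by machine and input.

```python
import timeit


def count_a(text):
    """Touch every character once; a rough lower bound for a pure-Python parser."""
    count = 0
    for ch in text:
        if ch == "a":
            count += 1
    return count


def fastest_of(func, arg, runs=10):
    """Return the fastest wall-clock time (in seconds) over `runs` single calls."""
    return min(timeit.repeat(lambda: func(arg), number=1, repeat=runs))


if __name__ == "__main__":
    # Stand-in input; a real measurement would read the project file instead.
    sample = "a = (b, c); /* comment */\n" * 10_000
    print(f"baseline: {fastest_of(count_a, sample):.4f}s")
```

Taking the minimum of several runs, rather than the mean, reduces noise from other processes on the machine.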

The basic strategies were: speeding up padding/whitespace/comment parsing by using a frozenset to check whether a character is one of the ones we care about (significantly faster than chaining or'd equality checks); inlining methods, particularly those that are only used once or are called the most, such as _ignore_whitespace and _ignore_comment; and pulling string values out of the input string in one piece rather than building them one character at a time.
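To illustrate two of these strategies, here is a rough sketch, not the actual code from this change. The function names, the `WHITESPACE` set, and the terminator characters are assumptions chosen for the example.

```python
# Assumed set of characters to skip; the real parser's set may differ.
WHITESPACE = frozenset(" \t\r\n")


def skip_whitespace_chained(text, index):
    """Slower style: one equality check per candidate character."""
    while index < len(text) and (
        text[index] == " "
        or text[index] == "\t"
        or text[index] == "\r"
        or text[index] == "\n"
    ):
        index += 1
    return index


def skip_whitespace_set(text, index):
    """Faster style: a single membership test against a frozenset."""
    length = len(text)
    while index < length and text[index] in WHITESPACE:
        index += 1
    return index


def read_token_charwise(text, index, terminators):
    """Slower style: build the value one character at a time."""
    value = ""
    while index < len(text) and text[index] not in terminators:
        value += text[index]
        index += 1
    return value, index


def read_token_sliced(text, index, terminators):
    """Faster style: find the end first, then take the value as one slice."""
    start = index
    length = len(text)
    while index < length and text[index] not in terminators:
        index += 1
    return text[start:index], index
```

Both variants in each pair return the same result; the difference is only in how much work is done per character.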

FuegoFro commented 5 years ago

Hey! Just wanted to check in and see if you had any more thoughts on this change 🙂

FuegoFro commented 4 years ago

Thank you for the merge! 🎉 😀

kronenthaler / openstep-parser

Speed up parsing #20