Parslet is a great library (thanks for your work!) but it's also VERY memory hungry. While using Parslet to parse bank account statements in MT940 format, I observed memory usage of up to 1.6 GB for an input file of less than 3 MB.
One of the problems is the overhead of having to use many atoms to describe something that could easily be expressed as a single regular expression. I'm not sure why you've chosen to have `match` match only a single character at a time, but it makes grammars unnecessarily complex.
This PR extends the `match` atom so that it accepts any regular expression, removing the single-character limit.
When I simplified the MT940 parser using this change, memory usage dropped by 66%.
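
To illustrate where the overhead comes from, here is a minimal plain-Ruby sketch using only `strscan` from the standard library. It is not Parslet's implementation and does not use Parslet's API: `CharAtom`, `RegexAtom`, and `parse_sequence` are hypothetical stand-ins, and the six-digit date is just an assumed example field. It only shows that matching one character per atom needs six atoms and produces six result pieces, where a single regular expression needs one of each.

```ruby
require 'strscan'

# Hypothetical stand-in for an atom limited to a single character
# from a character class (e.g. '0-9').
class CharAtom
  def initialize(char_class)
    @re = Regexp.new("[#{char_class}]")
  end

  # Returns the matched character, or nil if the next character does not fit.
  def try(scanner)
    scanner.scan(@re)
  end
end

# Hypothetical stand-in for an atom that accepts a full regular expression.
class RegexAtom
  def initialize(regexp)
    @re = regexp
  end

  # Returns the matched text, or nil if the regexp does not match here.
  def try(scanner)
    scanner.scan(@re)
  end
end

# Applies the atoms in order at the current scan position.
# Returns the list of matched pieces, or nil as soon as one atom fails.
def parse_sequence(atoms, input)
  scanner = StringScanner.new(input)
  pieces = []
  atoms.each do |atom|
    piece = atom.try(scanner)
    return nil if piece.nil?
    pieces << piece
  end
  pieces
end

# Assumed example: a six-digit YYMMDD date field.
per_char = Array.new(6) { CharAtom.new('0-9') } # six atoms
combined = [RegexAtom.new(/\d{6}/)]             # one atom

per_char_result = parse_sequence(per_char, '240131')
combined_result = parse_sequence(combined, '240131')

puts "per-character: #{per_char.size} atoms, #{per_char_result.size} pieces"
puts "single regexp: #{combined.size} atom, #{combined_result.size} piece"
```

In a real grammar the per-character pieces also have to be joined back together, so the object count per field grows with the field length, while the single-expression form stays constant.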