Closed jl6 closed 1 year ago
Yes, the Earley parser has O(n^3) worst-case scaling in runtime (and probably also in space, although I am not actually sure about that). Use parser="lalr"
instead. If that doesn't work, you will need to rework your grammar to fit the LALR(1) algorithm, which should definitely be possible for CSV. Earley just isn't designed for such large inputs.
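For reference, here is a rough sketch of what an LALR(1)-compatible CSV grammar might look like in Lark's grammar syntax (untested, and the token names are mine, not from your grammar):

```lark
// Sketch of an RFC 4180-style CSV grammar intended for parser="lalr".
// Rule and terminal names here are illustrative, not from the original grammar.
start: record (_NL record)* _NL?
record: field ("," field)*
field: QUOTED | BARE |            // trailing empty alternative = empty field

QUOTED: "\"" ("\"\"" | /[^"]/)* "\""   // doubled quotes escape a quote
BARE: /[^,\r\n]+/
_NL: /\r?\n/
```

You would then load it with something like `Lark(grammar, parser="lalr")`; if Lark reports reduce/reduce or shift/reduce conflicts, the grammar needs further massaging.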
But actually, you should just be using a special-purpose parser for a format as simple as CSV, for example the stdlib csv module.
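For completeness, a minimal example of the stdlib approach (the sample data here is made up; the `csv` module handles RFC 4180 quoting, including doubled quotes inside quoted fields, out of the box and in linear time):

```python
import csv
import io

# Hypothetical sample input with an RFC 4180 quoted field
# containing an escaped (doubled) quote and an embedded comma.
data = 'name,comment\r\nalice,"said ""hi"", then left"\r\n'

# csv.reader accepts any iterable of lines, e.g. an open file object.
rows = list(csv.reader(io.StringIO(data)))

# rows[0] is the header; rows[1][1] has the quoting already resolved.
print(rows)
```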
Thanks for the quick response, MegaIng.
I am attempting to build a CSV parser based on RFC 4180:
Lark quickly and successfully builds this parser, and it appears to correctly parse small CSV files. However, for larger files (still not very large in the scheme of things - say, 600KB), the runtime increases greatly, and memory usage balloons to multiple gigabytes, until memory is exhausted and the process is killed.