Memoization? - Githubissues

greghendershott commented 10 years ago

Background: My markdown parser is vulnerable to things like a series of open HTML tags without matching closing ones. It parses to EOF, fails, backtracks. In which case the run time can be exceptionally bad.

Question: Do you think there's any opportunity to use memoization as well as lazy evaluation? IIUC they're orthogonal.

lazy postpones evaluation, which helps Parsack in cases where it won't actually need the result.
Given try (and notFollowedBy, lookAhead, and friends), it seems like there will be times where (some-parser some-state) will be evaluated multiple times. In that case, memoization ought to help, since the result is needed, and needed repeatedly.
1. Do you think memoization would help?
2. Could this be done largely inside parsack.rkt? Places that do (parser state) applications could be replaced with (say) (memo-apply parser state) without otherwise altering the structure in there, nor of callers.

stchang commented 10 years ago

Short answer, yes there are many possible optimizations. Memoization is one possibility. Another is pruning the tree of possible backtracking paths. If you look at the docs for syntax-parse patterns you can find more optimization strategies, like ~commit or ~delimit-cut.

I don't have time to investigate immediately but if you want to try it out I'm in full support.

greghendershott commented 10 years ago

Thanks for the reply. By the way, I wasn't asking for you to do this urgently, or even for you do it at all. I tagged it "question" hoping to convey that, but I'm sorry it wasn't clear.

I did try some quick/naive memoization, which backfired; the memoizing overhead was higher than the savings.

Thanks for the link, I'll check that out.

stchang commented 10 years ago

No worries, I just felt like I was slacking on maintenance duties.

Matt Might has a 2011 paper involving parsing and memoization. It's not exactly the same context but maybe some optimization ideas are transferable.

edit: It may also be helpful to check out the code here.

stchang / parsack

Memoization? #33