stchang / parsack

A basic Parsec-like monadic parser combinator library implementation in Racket.
MIT License
50 stars 10 forks source link

Memoization? #33

Closed greghendershott closed 10 years ago

greghendershott commented 10 years ago

Background: My markdown parser is vulnerable to things like a series of open HTML tags without matching closing ones. It parses to EOF, fails, backtracks. In which case the run time can be exceptionally bad.

Question: Do you think there's any opportunity to use memoization as well as lazy evaluation? IIUC they're orthogonal.

stchang commented 10 years ago

Short answer, yes there are many possible optimizations. Memoization is one possibility. Another is pruning the tree of possible backtracking paths. If you look at the docs for syntax-parse patterns you can find more optimization strategies, like ~commit or ~delimit-cut.

I don't have time to investigate immediately but if you want to try it out I'm in full support.

greghendershott commented 10 years ago

Thanks for the reply. By the way, I wasn't asking for you to do this urgently, or even for you do it at all. I tagged it "question" hoping to convey that, but I'm sorry it wasn't clear.

I did try some quick/naive memoization, which backfired; the memoizing overhead was higher than the savings.

Thanks for the link, I'll check that out.

stchang commented 10 years ago

No worries, I just felt like I was slacking on maintenance duties.

Matt Might has a 2011 paper involving parsing and memoization. It's not exactly the same context but maybe some optimization ideas are transferable.

edit: It may also be helpful to check out the code here.