bhollis / maruku

A pure-Ruby Markdown-superset interpreter (Official Repo).
MIT License

Regexes to recognize tables are horrendously slow #92

Closed: cleverdripper closed this issue 11 years ago

cleverdripper commented 11 years ago

The regexes in mdline.rb:

Sep = /\s*(\:)?\s*-+\s*(\:)?\s*/
TableSeparator = /^(\|?#{Sep}\|?)+?\s*$/

are extremely slow on longer strings that do not match. Maruku can take forever to process a file containing a line that triggers the pathological case for this regex. As an example, try this in irb:

Sep = /\s*(\:)?\s*-+\s*(\:)?\s*/
TableSeparator = /^(\|?#{Sep}\|?)+?\s*$/
'----------------------------------------------------x' =~ TableSeparator

or run Maruku on this markdown:

a | b
-----------------------------------------------------x

This times out my patience before finishing. I'm using ruby 2.0.0p247 (2013-06-27 revision 41674) [x86_64-linux].
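
The slowdown comes from nested quantifiers: `-+` sits inside a group that is itself repeated, and the pipes around it are optional, so a run of dashes can be split across group iterations in exponentially many ways. A backtracking engine tries those splits one after another before giving up at the trailing `x`. The following is only a sketch of one way to remove that ambiguity, assuming it is acceptable to require a literal `|` between cells. It is not the fix that went into Maruku, it is stricter than the original pattern (for example, it rejects `--- ---`), and the names `CELL`, `STRICT_SEPARATOR` and `table_separator?` are made up for illustration.

```ruby
# Hypothetical stricter pattern, NOT the change that was applied in Maruku.
# Every cell after the first must start with a literal "|", so a run of
# dashes can belong to only one cell and cannot be re-partitioned.
CELL = /[ \t]*:?-+:?[ \t]*/
STRICT_SEPARATOR = /\A\|?#{CELL}(?:\|#{CELL})*\|?[ \t]*\z/

# Returns true when the line looks like a table separator row.
def table_separator?(line)
  STRICT_SEPARATOR.match?(line)
end

valid   = ['---|---', '| :--- | ---: |', ':-:|:-:', '-----']
invalid = ['a | b', '---|x', ('-' * 52) + 'x']

(valid + invalid).each do |line|
  puts "#{line.inspect} => #{table_separator?(line)}"
end
```

With this shape the long dash line from the report is rejected after a single pass over the dashes, because the only way to start another cell is to consume a `|`.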

bhollis commented 11 years ago

Thanks for pointing this out. I have an item in my to-do list for optimizing the regexes Maruku uses, and I'll make sure to attack this one.

bhollis commented 11 years ago

The pathological regex has been fixed.