Closed GoogleCodeExporter closed 8 years ago
confirmed the hang.
Original comment by tre...@gmail.com
on 1 Mar 2011 at 11:40
The core problem is a pathologically slow regex looking for a possible "<hr>":
re.compile(r"^[ ]{0,2}([ ]?\-[ ]?){3,}[ \t]*$", re.M)
With the "issue52_hang.text" input (recently committed, on GitHub) and
David's input file, it is attempting to match against a string like the one below.
{{{
import re
r = re.compile('^[ ]{0,2}([ ]?\\-[ ]?){3,}[ \\t]*$', re.M)
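# Full-length line from the input file (the string literal is split only to fit the line width).
text = ('- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - '
        '+\n\nPrivacy Policy: http://www.PetitionOnline.org/privacy-pets.html\n\n')
# Shorter variant that overrides the one above and is still very slow.
text = '- - - - - - - - - - - - - - - - - - - - - - - +\n\nfoo\n\n'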
print(r.search(text))
}}}
This takes a looooong time... and the time grows badly (exponentially? geometrically?)
as the number of "- " segments increases.
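To get a feel for how the time grows, here is a minimal timing sketch using the same pattern. The helper names `make_text` and `time_search` are made up for illustration, and the segment counts are kept small on purpose so the loop finishes quickly:

```python
import re
import time

# Same pattern as quoted in the comment above.
HR_RE = re.compile(r"^[ ]{0,2}([ ]?\-[ ]?){3,}[ \t]*$", re.M)


def make_text(n):
    """Build n "- " segments followed by "+", shaped like the failing input."""
    return "- " * n + "+\n\nfoo\n\n"


def time_search(n):
    """Return (match, elapsed_seconds) for one search over make_text(n)."""
    text = make_text(n)
    start = time.perf_counter()
    match = HR_RE.search(text)
    return match, time.perf_counter() - start


if __name__ == "__main__":
    # Small counts only: larger ones reproduce the hang.
    for n in range(10, 19, 2):
        match, elapsed = time_search(n)
        print(f"{n:2d} segments: {elapsed:.4f}s, match={match}")
```

A likely explanation for the growth: every space between two dashes can be consumed either by the trailing `[ ]?` of one repetition or by the leading `[ ]?` of the next, so when the final "+" makes the line unmatchable the engine may retry every such split. If that is what is happening, each extra segment should roughly double the time.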
A possible secondary problem is that "+ - - - - - - ..." is being parsed as a
list item inside a list item inside a list item, etc. That seems unnecessary.
TODO:
- separate issue for the list item inside a list item thingy
- tighten-up test case for '<hr>' speed
- speed up '<hr>' match
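For the last TODO item, one possible direction is to remove the ambiguity about which optional slot owns each space. The sketch below is only an illustration under that assumption, not the fix that was eventually committed; `FAST_HR_RE` is a hypothetical name, and the pattern is intended to accept the same dash-only lines as the original:

```python
import re

# Pattern from the comment above: each space between two dashes can be
# claimed by either of two optional [ ]? slots.
SLOW_HR_RE = re.compile(r"^[ ]{0,2}([ ]?\-[ ]?){3,}[ \t]*$", re.M)

# Hypothetical rewrite: up to 3 leading spaces, a dash, then two or more
# "0-2 spaces + dash" groups, so every space has exactly one possible owner.
FAST_HR_RE = re.compile(r"^[ ]{0,3}-(?:[ ]{0,2}-){2,}[ \t]*$", re.M)

# Short inputs on which both patterns can be compared safely.
samples = ["---", "- - -", "   - - -", " -  -  - \t",
           "--", "- - - +", "-   - -", "    ---"]
for s in samples:
    assert bool(SLOW_HR_RE.search(s)) == bool(FAST_HR_RE.search(s)), s

# An input far longer than the one that hangs the original pattern.
long_text = "- " * 200 + "+\n\nfoo\n\n"
print(FAST_HR_RE.search(long_text))  # no match, because of the trailing "+"
```

The sample list is small, so it only spot-checks that the two patterns agree; a real test case would want more inputs, including other "<hr>" characters if those share the same pattern shape.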
Original comment by tre...@gmail.com
on 7 Mar 2011 at 7:22
Fixed in:
[master 9e99850] Fix issue 52. Tweak silly nest li matching. See CHANGES.txt
on github.com/trentm/python-markdown2. See "slow_hr", "not_quite_a_list" and
"hr_spaces" tests added around this.
Original comment by tre...@gmail.com
on 10 Mar 2011 at 4:36
Original issue reported on code.google.com by
david.as...@gmail.com
on 1 Mar 2011 at 11:13