trentm / python-markdown2

markdown2: A fast and complete implementation of Markdown in Python

Fix bad handling of consecutive bold words (#541) #545

Closed Crozzers closed 11 months ago

Crozzers commented 11 months ago

This PR aims to fix #541 while still protecting against #493.

Whilst attempting to fix a ReDoS, I changed `_strong_re` to use a greedy pattern, which consumes the lengthy ReDoS input without spending too much processing time on it. However, this also caused `_strong_re` to consume multiple bold words as a single match.

The fix here is to make the regex non-greedy again.

- (\*\*|__)(?=\S)(.*\S)\1
+ (\*\*|__)(?=\S)(.+?[*_]?)(?<=\S)\1

This should match the opening syntax followed by a non-whitespace character, then one or more characters (matched lazily) followed by an optional <em> closer, and then the closing syntax preceded by a non-whitespace character.
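A minimal sketch of the difference, using the two patterns from the diff in isolation. It assumes no regex flags and does not call markdown2 itself; `strong_contents` is a hypothetical helper written only for this illustration.

```python
import re

# Patterns copied from the diff above. Flags are omitted here (an assumption);
# the library may compile its own pattern with additional flags.
OLD_STRONG = re.compile(r"(\*\*|__)(?=\S)(.*\S)\1")
NEW_STRONG = re.compile(r"(\*\*|__)(?=\S)(.+?[*_]?)(?<=\S)\1")


def strong_contents(pattern, text):
    """Hypothetical helper: return group 2 (the inner text) of every match."""
    return [m.group(2) for m in pattern.finditer(text)]


text = "**one** and **two**"

# Greedy pattern: one match that swallows both bold words.
greedy = strong_contents(OLD_STRONG, text)
print(greedy)  # ['one** and **two']

# Non-greedy pattern: each bold word is matched separately.
lazy = strong_contents(NEW_STRONG, text)
print(lazy)  # ['one', 'two']

# The optional [*_] lets an <em> closer sit directly before the closing syntax.
nested = strong_contents(NEW_STRONG, "**bold *em***")
print(nested)  # ['bold *em*']
```

The last case shows why the optional `[*_]?` is there: without it, the lazy `.+?` would still have to reach the `*` that closes the emphasis before the closing `**` could match.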

nicholasserra commented 11 months ago

Thanks! Will prob do another release after this.