seth-brown / formd

A Markdown formatting tool
MIT License

RegEx Suggestion #13

Closed fractaledmind closed 10 years ago

fractaledmind commented 10 years ago

The regular expression for finding links seems to be a bit loose.

self.match_links = re.compile(r"""(\[[^^]*?\])\s?             # text
                              (\[.*?\]|\(.*?\r?\n?.*?\)\)?)   # ref/url
                               """, re.MULTILINE | re.X)

For example, markdown text like this:

This morning, a friend noted a discrepancy between two recent headlines at The Mac Observer:

+ March 5: "[Apple CFO Peter Oppenheimer to Retire, Luca Maestri to Take Over][tmo1]"
+ May 7: "[PR Queen Katie Cotton Leaving Apple][tmo2]"

[tmo1]\: http://www.macobserver.com/tmo/article/apple-cfo-peter-oppenheimer-to-retire-luca-maestri-to-take-over
[tmo2]\: http://www.macobserver.com/tmo/article/pr-queen-katie-cotton-leaving-apple

[I tweeted the two headlines and corresponding URLs][t], with a single word of commentary: "Hmm". I said no more partly because I was near the 140-character limit, and partly to see what the reaction would be. Some got it, but many repliers missed my point, mistakenly thinking it was related to an exodus of executives from the company.<sup id="fnr1-2014-05-08">[1]</sup>

My point was to draw attention to the disparate job descriptions: "Apple CFO" vs. "PR Queen".

[Julia Richert pointed to][j] a similar discrepancy -- two Philip Elmer-DeWitt headlines on his weblog at CNN/Fortune/Money

will match items thus:

To tighten up the regex, I suggest the following change, which appropriately catches all markdown links, but only markdown links:

self.match_links = re.compile(r"""(\[[^^\[\]:]*?\])\s?? # text
                              (\[[^\[\]:]*?\]|          # ref
                              \(.*?\r?\n?.*?\)\)?)      # url
                              """, re.MULTILINE | re.X)

The only real change is to extend the set of characters that cannot appear within the link text or the reference id: [, ], and : are now excluded from both.
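To make the difference concrete, here is a minimal sketch that runs both patterns over a shortened sample. The names LOOSE, TIGHT, and sample are placeholders for illustration, not identifiers from formd, and the sample text is a cut-down stand-in for the post quoted above.

```python
import re

# Original (loose) pattern, copied from the top of this issue.
LOOSE = re.compile(r"""(\[[^^]*?\])\s?             # text
                       (\[.*?\]|\(.*?\r?\n?.*?\)\)?)   # ref/url
                    """, re.MULTILINE | re.X)

# Suggested (tight) pattern: [, ], and : may not appear inside the
# link text or the reference id.
TIGHT = re.compile(r"""(\[[^^\[\]:]*?\])\s?? # text
                       (\[[^\[\]:]*?\]|          # ref
                       \(.*?\r?\n?.*?\)\)?)      # url
                    """, re.MULTILINE | re.X)

# Hypothetical sample: two reference-style links, one reference
# definition, and a bracketed footnote marker.
sample = (
    '+ March 5: "[Apple CFO to Retire][tmo1]"\n'
    '\n'
    '[tmo1]: http://example.com/cfo\n'
    '\n'
    '[I tweeted the headlines][t], with one word.<sup>[1]</sup>\n'
)

loose_matches = LOOSE.findall(sample)
tight_matches = TIGHT.findall(sample)

print("loose:")
for text, target in loose_matches:
    print("  ", repr(text), repr(target))

print("tight:")
for text, target in tight_matches:
    print("  ", repr(text), repr(target))
```

On this sample, the loose pattern's second match begins at the reference definition ([tmo1]: ...) and runs on through the next real link, because [^^] happily consumes ], newlines, and [. The tight pattern returns only the two actual links. One trade-off worth noting: since : is now excluded, link text that itself contains a colon would no longer match.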

seth-brown commented 10 years ago

Thanks for taking the time to describe the problem and for suggesting a solution. I've merged a fix for the issue you've described. The latest version of ForMd should fix the problem. Let me know if you have any further issues. Thanks again!