mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License
8.52k stars, 864 forks

Escape parentheses in link URLs. Fix #459. #460

Closed pavelhoral closed 4 months ago

pavelhoral commented 4 months ago

This adds escaping of parentheses in link destinations, following the CommonMark spec: https://spec.commonmark.org/0.31.2/#link-destination.
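
For illustration, here is a minimal sketch of what the escaping does to an `href` that contains parentheses. The helper name `escapeLinkDestination` is hypothetical and not part of turndown's API; it only mirrors the `replace` call used in the rule.

```javascript
// Hypothetical helper (not a turndown export): backslash-escape every
// "(" and ")" so the destination cannot end the link early.
function escapeLinkDestination (href) {
  return href.replace(/([()])/g, '\\$1')
}

var exampleHref = 'https://en.wikipedia.org/wiki/Foo_(bar)'
var exampleLink = '[Foo](' + escapeLinkDestination(exampleHref) + ')'

console.log(exampleLink)
// [Foo](https://en.wikipedia.org/wiki/Foo_\(bar\))
```

CommonMark also accepts balanced, unescaped parentheses in a destination, so escaping both characters unconditionally is the conservative choice: it stays valid when the parentheses are unbalanced.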

An alternative solution would be to detect the presence of parentheses and enclose the link destination in pointy brackets: [text](<link>).
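
A rough sketch of that alternative, in case it helps the discussion. The helper name `wrapLinkDestination` is hypothetical, and the sketch assumes the `href` contains no line breaks, which the pointy-bracket form does not allow.

```javascript
// Hypothetical helper (not a turndown export): wrap the destination in
// <...> only when it contains parentheses.
function wrapLinkDestination (href) {
  if (!/[()]/.test(href)) return href
  // Inside <...>, a literal "<" or ">" has to be backslash-escaped.
  return '<' + href.replace(/([<>])/g, '\\$1') + '>'
}

var wrappedLink = '[text](' + wrapLinkDestination('https://example.com/a_(b)') + ')'

console.log(wrappedLink)
// [text](<https://example.com/a_(b)>)
```

This variant leaves the URL itself readable, at the cost of a second output shape for links.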

pavelhoral commented 4 months ago

I have deliberately changed only the link destination escaping. However, a quick glance at the link title section of the spec shows that titles are not correctly escaped either. So maybe this PR should cover that as well?

diff --git a/src/commonmark-rules.js b/src/commonmark-rules.js
index 8aa1617..f32a893 100644
--- a/src/commonmark-rules.js
+++ b/src/commonmark-rules.js
@@ -155,7 +155,7 @@ rules.inlineLink = {
     var href = node.getAttribute('href')
     if (href) href = href.replace(/([()])/g, '\\$1')
     var title = cleanAttribute(node.getAttribute('title'))
-    if (title) title = ' "' + title + '"'
+    if (title) title = ' "' + title.replace(/"/g, '\\"') + '"'
     return '[' + content + '](' + href + title + ')'
   }
 }
pavelhoral commented 4 months ago

I have added title escaping as well.
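
To show the combined effect, here is a standalone sketch that reproduces the two `replace` calls from the diff outside of turndown. The function `renderInlineLink` and its parameters are hypothetical; the real rule reads `href` and `title` from the DOM node and passes the title through `cleanAttribute` first.

```javascript
// Hypothetical standalone version of the replacement logic in the diff.
function renderInlineLink (content, href, title) {
  // Escape parentheses in the link destination.
  var destination = href.replace(/([()])/g, '\\$1')
  // Escape double quotes, since the title is emitted between double quotes.
  var titlePart = title ? ' "' + title.replace(/"/g, '\\"') + '"' : ''
  return '[' + content + '](' + destination + titlePart + ')'
}

console.log(renderInlineLink('Foo', 'https://example.com/Foo_(bar)', 'The "Foo" page'))
// [Foo](https://example.com/Foo_\(bar\) "The \"Foo\" page")
```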