miyuchina / mistletoe

A fast, extensible and spec-compliant Markdown parser in pure Python.
MIT License

Don't convert the quotes #176

Status: Closed (barthelemypousset closed this issue 1 year ago)

barthelemypousset commented 1 year ago

Hello,

I would simply like quotes not to be converted when using mistletoe. Here is what happens:

>>> mistletoe.markdown('"test"')
'<p>&quot;test&quot;</p>\n'

I would like the result to simply be '<p>"test"</p>\n'. Do you know how I could do that?

Thank you

pbodnar commented 1 year ago

@barthelemypousset, there is currently no easy way to change this in one place; you would have to change the individual calls to html.escape() within the HTMLRenderer class.

mistletoe has historically escaped double quotes ("), and since version 0.9.0 we also escape apostrophes ('; see #115 and commit ef9bd3afc1e69b23b9238b2eeee45adec627adc7). But I can see that this is not always desired, so I think it would be good to make the escaping an option (or options) in a future version of mistletoe. Since 1.0.0 is about to be released, I would plan this for 1.1.0.
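For reference, the effect of the `quote` argument of the standard library's html.escape() can be seen in isolation. This is only a sketch of the stdlib behaviour; it does not show how HTMLRenderer calls the function internally, and the sample string is made up for illustration.

```python
import html

# Illustrative input containing both kinds of quotes and an ampersand.
text = '"test" & \'more\''

# Default (quote=True): &, < and > are escaped, and so are " and '.
escaped_all = html.escape(text)

# quote=False: only &, < and > are escaped; quotes are left untouched.
escaped_minimal = html.escape(text, quote=False)

print(escaped_all)      # &quot;test&quot; &amp; &#x27;more&#x27;
print(escaped_minimal)  # "test" &amp; 'more'
```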

Cc @anderskaplan. :)

pbodnar commented 1 year ago

@barthelemypousset (and @anderskaplan), would something like the draft PR I created work?

Regarding disabling the escaping of both double and single quotes by default (though not for attributes, of course!), I'm not personally against that. I always prefer simpler things if there is no evident downside (and "something" is definitely shorter and more readable than &quot;something&quot;). It would just require, I hope, a simple change to the commonmark.py test file so that the tests still pass.
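The idea of optional quote escaping for text, with attributes always fully escaped, could be sketched roughly as follows. The function names `escape_text` and `escape_attribute` are hypothetical and exist only for this illustration; this is not the code from the PR, just one way such options might behave.

```python
import html


def escape_text(text, escape_double_quotes=True, escape_single_quotes=False):
    """Hypothetical helper: escape HTML text content, with optional quote escaping."""
    # & must be replaced first so the entities added below are not escaped again.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    if escape_double_quotes:
        text = text.replace('"', "&quot;")
    if escape_single_quotes:
        text = text.replace("'", "&#x27;")
    return text


def escape_attribute(value):
    """Hypothetical helper: attribute values always get their quotes escaped."""
    return html.escape(value, quote=True)


print(escape_text('"test"'))                              # &quot;test&quot;
print(escape_text('"test"', escape_double_quotes=False))  # "test"
print(escape_attribute('say "hi"'))                       # say &quot;hi&quot;
```

Keeping attribute escaping separate matters because an unescaped quote inside an attribute value would end the attribute early, whereas a quote in ordinary text content is harmless.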

pbodnar commented 1 year ago

... It would just require, I hope, a simple change to the commonmark.py test file so that the tests still pass.

So I've got this prepared, but to be on the safe side, I will probably disable double-quote escaping by default only in the release after 1.1.0. Here is the diff (which would follow the linked PR):

diff --git a/mistletoe/html_renderer.py b/mistletoe/html_renderer.py
index 985490e..565d37e 100644
--- a/mistletoe/html_renderer.py
+++ b/mistletoe/html_renderer.py
@@ -18,7 +18,7 @@ class HTMLRenderer(BaseRenderer):

     See mistletoe.base_renderer module for more info.
     """
-    def __init__(self, *extras, html_escape_double_quotes=True, html_escape_single_quotes=False):
+    def __init__(self, *extras, html_escape_double_quotes=False, html_escape_single_quotes=False):
         """
         Args:
             extras (list): allows subclasses to add even more custom tokens.
diff --git a/test/specification/commonmark.py b/test/specification/commonmark.py
index f3adda2..3f17c03 100644
--- a/test/specification/commonmark.py
+++ b/test/specification/commonmark.py
@@ -1,7 +1,7 @@
 import re
 import sys
 import json
-from mistletoe import markdown
+from mistletoe import Document, HTMLRenderer
 from traceback import print_tb
 from argparse import ArgumentParser

@@ -34,7 +34,8 @@ def run_tests(test_entries, start=None, end=None,
 def run_test(test_entry, quiet=False):
     test_case = test_entry['markdown'].splitlines(keepends=True)
     try:
-        output = markdown(test_case)
+        with HTMLRenderer(html_escape_double_quotes=True) as renderer:
+            output = renderer.render(Document(test_case))
         success = test_entry['html'] == output
         if not success and not quiet:
             print_test_entry(test_entry, output)