pypa / readme_renderer

Safely render long_description/README files in Warehouse
Apache License 2.0

[patch] support python3.9 by using unescape() from html_parser (python3 only) #175

Closed sandrotosi closed 3 years ago

sandrotosi commented 3 years ago

Hello, Python 3.9 removed html.parser.HTMLParser.unescape() (it had been deprecated since Python 3.4 in favour of html.unescape()), so markdown.py will need to be updated.

In Debian, I wrote a patch that does:

--- a/readme_renderer/markdown.py
+++ b/readme_renderer/markdown.py
@@ -99,7 +99,7 @@ def _highlight(html):
         # translate '"' to '&quot;', but it confuses pygments. Pygments will
         # escape any html entities when re-writing the code, and we run
         # everything through bleach after.
-        code = html_parser.HTMLParser().unescape(code)
+        code = html_parser.unescape(code)

         highlighted = pygments.highlight(code, lexer, formatter)

In Python 3, unescape() is available from six.moves.html_parser, but that doesn't work on Python 2.7, which is still supported by this project:

$ python2.7 -c "from six.moves import html_parser ; html_parser.unescape('')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute 'unescape'

For Debian this is fine, since we're phasing out Python 2.7. But if you still want to support 2.7 and add support for 3.9, you may want to check the interpreter version, as I'm afraid six won't help much (you can use six.PY2 or six.PY3 to branch, but that's about it).
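
For what it's worth, one way to cover both interpreters without an explicit version check might be an import fallback. This is only a rough sketch and not part of the Debian patch: the module-level name `unescape` is my own choice, and it assumes markdown.py would call it in place of `html_parser.HTMLParser().unescape(code)`.

```python
# Sketch only: pick whichever unescape implementation the interpreter offers.
try:
    # Python 3.4+: html.unescape() is the documented replacement.
    from html import unescape
except ImportError:
    # Python 2.7: there is no html module, so fall back to the
    # HTMLParser method (which still exists on that interpreter).
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape

# Example: decode the entities cmark emits before handing code to pygments.
code = unescape('print(&quot;a &lt; b&quot;)')
```

With that in place, the line in `_highlight` would simply become `code = unescape(code)`, and no six.PY2/six.PY3 branching would be needed.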