pypa / readme_renderer

Safely render long_description/README files in Warehouse
Apache License 2.0

Add support for HTML in Mark Down (Requests.Py) #167

Closed edison12a closed 4 years ago

edison12a commented 4 years ago

Describe the bug I wanted to file this as a requests issue, but then I realized that since the library's README displays well on GitHub, it's expected to do the same on PyPI.

To Reproduce Requests Github Requests PyPI

So what can we do to make sure that there is more support for multiple Markdown stylings? I'm willing to help with that.

di commented 4 years ago

Transferred to https://github.com/pypa/readme_renderer.

Right now the renderer only supports reStructuredText and Markdown. Supporting HTML would require updates here as well as to the Description-Content-Type core metadata specification: https://packaging.python.org/specifications/core-metadata/#description-content-type

nateprewitt commented 4 years ago

Hey Dustin, thanks for taking a look at this! Wanted to quickly chime in from the Requests side. I think the doc in question is technically GitHub Flavored Markdown, which supports a subset of HTML. The PyPI documentation says that's the default variant, so I'm not sure if there's a new gap in parsing. This had rendered correctly in our last upload in ~January~ February.

That said, the way our README is written is convoluted at best. We're planning on rewriting it with a more standard usage of markdown, so this isn't a blocker for us.

di commented 4 years ago

Nothing has changed w/ regards to the parsing. How has the long description changed since the previous release?

nateprewitt commented 4 years ago

These are the commits we've merged, they've all been small typo fixes:

Except for one that changed syntax highlighting comment on a code block:

That's the only markdown related change.

di commented 4 years ago

This appears to be an unfortunate confluence of edge cases.

First, you uploaded requests==2.24.0 with twine, which prefers uploading wheels before source distributions, and PyPI takes the metadata from the first upload, so the metadata is coming from the wheel.
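The ordering effect described above can be illustrated with a small sketch. This is not twine's actual implementation; `upload_order` is a hypothetical helper that only shows why the wheel's metadata ends up being the one PyPI keeps when both files are uploaded together.

```python
def upload_order(filenames):
    """Sort distribution files so wheels come before everything else."""
    # sorted() is stable, so files of the same kind keep their relative order.
    return sorted(filenames, key=lambda name: not name.endswith(".whl"))


files = ["requests-2.24.0.tar.gz", "requests-2.24.0-py2.py3-none-any.whl"]
# The first file uploaded is the one whose metadata the index would use.
print(upload_order(files)[0])
```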

The wheel you created for this release was built with wheel==0.24.0:

$ unzip requests-2.24.0-py2.py3-none-any.whl
Archive:  requests-2.24.0-py2.py3-none-any.whl
  inflating: requests/cookies.py
  inflating: requests/auth.py
  inflating: requests/sessions.py
  inflating: requests/hooks.py
  inflating: requests/compat.py
  inflating: requests/models.py
  inflating: requests/certs.py
  inflating: requests/__init__.py
  inflating: requests/status_codes.py
  inflating: requests/packages.py
  inflating: requests/__version__.py
  inflating: requests/api.py
  inflating: requests/_internal_utils.py
  inflating: requests/utils.py
  inflating: requests/exceptions.py
  inflating: requests/structures.py
  inflating: requests/help.py
  inflating: requests/adapters.py
  inflating: requests-2.24.0.dist-info/DESCRIPTION.rst
  inflating: requests-2.24.0.dist-info/LICENSE.txt
  inflating: requests-2.24.0.dist-info/metadata.json
  inflating: requests-2.24.0.dist-info/top_level.txt
  inflating: requests-2.24.0.dist-info/WHEEL
  inflating: requests-2.24.0.dist-info/METADATA
  inflating: requests-2.24.0.dist-info/RECORD

$ cat requests-2.24.0.dist-info/WHEEL
Wheel-Version: 1.0
Generator: bdist_wheel (0.24.0)
Root-Is-Purelib: true
Tag: py2-none-any
Tag: py3-none-any

This version of wheel was released on July 6, 2014, and predates PEP 566, which introduced Description-Content-Type. Support for that field wasn't available until wheel==0.31.0.

As a result, the metadata for your wheel is version 2.0, not version 2.1, even though it contains Description-Content-Type:

$ head -n 50 requests-2.24.0.dist-info/METADATA
Metadata-Version: 2.0
Name: requests
Version: 2.24.0
Summary: Python HTTP for Humans.
Home-page: https://requests.readthedocs.io
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Project-URL: Source, https://github.com/psf/requests
Project-URL: Documentation, https://requests.readthedocs.io
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
Description-Content-Type: text/markdown
Provides-Extra: security
Provides-Extra: socks
Requires-Dist: chardet (>=3.0.2,<4)
Requires-Dist: idna (>=2.5,<3)
Requires-Dist: urllib3 (!=1.25.1,<1.26,!=1.25.0,>=1.21.1)
Requires-Dist: certifi (>=2017.4.17)
Provides-Extra: security
Requires-Dist: pyOpenSSL (>=0.14); extra == 'security'
Requires-Dist: cryptography (>=1.3.4); extra == 'security'
Provides-Extra: socks
Requires-Dist: PySocks (!=1.5.7,>=1.5.6); extra == 'socks'
Provides-Extra: socks
Requires-Dist: win-inet-pton; sys_platform == "win32" and python_version == "2.7" and extra == 'socks'

<span align="center">

<pre>
    <a href="https://requests.readthedocs.io/"><img src="https://raw.githubusercontent.com/psf/requests/master/ext/requests-logo.png" align="center" /></a>

    <div align="left">
    <p></p>
    <code> Python 3.7.4 (default, Sep  7 2019, 18:27:02)</code>

The pkginfo library will not read the Description-Content-Type field, since that field did not exist in metadata version 2.0:

$ pkginfo requests-2.24.0-py2.py3-none-any.whl
metadata_version: 2.0
name: requests
version: 2.24.0
platforms: ['UNKNOWN']
summary: Python HTTP for Humans.
description:
...
home_page: https://requests.readthedocs.io
author: Kenneth Reitz
author_email: me@kennethreitz.org
license: Apache 2.0
classifiers: ['Development Status :: 5 - Production/Stable', 'Intended Audience :: Developers', 'Natural Language :: English', 'License :: OSI Approved :: Apache Software License', 'Programming Language :: Python', 'Programming Language :: Python :: 2', 'Programming Language :: Python :: 2.7', 'Programming Language :: Python :: 3', 'Programming Language :: Python :: 3.5', 'Programming Language :: Python :: 3.6', 'Programming Language :: Python :: 3.7', 'Programming Language :: Python :: 3.8', 'Programming Language :: Python :: Implementation :: CPython', 'Programming Language :: Python :: Implementation :: PyPy']
requires_python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
requires_dist: ['chardet (>=3.0.2,<4)', 'idna (>=2.5,<3)', 'urllib3 (!=1.25.1,<1.26,!=1.25.0,>=1.21.1)', 'certifi (>=2017.4.17)', "pyOpenSSL (>=0.14); extra == 'security'", "cryptography (>=1.3.4); extra == 'security'", "PySocks (!=1.5.7,>=1.5.6); extra == 'socks'", 'win-inet-pton; sys_platform == "win32" and python_version == "2.7" and extra == \'socks\'']
project_urls: ['Source, https://github.com/psf/requests', 'Documentation, https://requests.readthedocs.io']
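To make the version gating concrete, here is a minimal sketch of a reader that only exposes Description-Content-Type when the declared Metadata-Version is 2.1 or later. It mimics the behaviour described above and is not pkginfo's actual code; `visible_content_type` and `MIN_VERSION` are names invented for the illustration.

```python
from email.parser import Parser

# Assumption for this sketch: the field is only honored from metadata 2.1 on.
MIN_VERSION = (2, 1)


def visible_content_type(metadata_text):
    """Return the content type a version-aware reader would report, or None."""
    # METADATA uses RFC 822 style headers, so the stdlib email parser can read it.
    headers = Parser().parsestr(metadata_text, headersonly=True)
    version = tuple(int(part) for part in headers["Metadata-Version"].split("."))
    if version < MIN_VERSION:
        # The header is present in the file but ignored for older versions.
        return None
    return headers["Description-Content-Type"]


sample = (
    "Metadata-Version: 2.0\n"
    "Name: requests\n"
    "Description-Content-Type: text/markdown\n"
    "\n"
    "<span align=\"center\">\n"
)
print(visible_content_type(sample))  # None
```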

If the Description-Content-Type is not provided in the metadata, PyPI falls back to rendering the description as reStructuredText, which was the default before content types existed. If it can't render it, it will fail the upload.
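The fallback can be sketched roughly as follows. This is an illustration under stated assumptions, not Warehouse's actual code: `pick_rendering` is a hypothetical function, and the renderers are passed in as plain callables that return HTML on success and None on failure.

```python
def pick_rendering(description, content_type, renderers):
    """Return (effective_content_type, html), or raise if rendering fails."""
    # With no declared content type, fall back to reStructuredText.
    effective = content_type or "text/x-rst"
    html = renderers[effective](description)
    if html is None:
        # In this sketch, a failed render is what would reject the upload.
        raise ValueError("description failed to render as " + effective)
    return effective, html


# Stand-in renderers for demonstration only.
fake_renderers = {
    "text/x-rst": lambda text: "<p>" + text + "</p>",
    "text/markdown": lambda text: "<div>" + text + "</div>",
}
print(pick_rendering("hello", None, fake_renderers))
```

The point of the sketch is that a description which happens to parse as reStructuredText passes silently, so nothing signals that Markdown was intended.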

Amazingly, the description provided in this wheel is somehow valid reStructuredText:

$ python
Python 3.8.2 (default, Apr 22 2020, 21:21:01)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pkginfo import Wheel
>>> import readme_renderer.rst
>>> wheel = Wheel('requests-2.24.0-py2.py3-none-any.whl')
>>> readme_renderer.rst.render(wheel.description)
'<p>&lt;span align=”center”&gt;</p>\n<dl>\n<dt>&lt;pre&gt;</dt>\n<dd><p>&lt;a href=”<a href="https://requests.readthedocs.io/" rel="nofollow">https://requests.readthedocs.io/</a>”&gt;&lt;img src=”<a href="https://raw.githubusercontent.com/psf/requests/master/ext/requests-logo.png" rel="nofollow">https://raw.githubusercontent.com/psf/requests/master/ext/requests-logo.png</a>” align=”center” /&gt;&lt;/a&gt;</p>\n<p>&lt;div align=”left”&gt;\n&lt;p&gt;&lt;/p&gt;\n&lt;code&gt; Python 3.7.4 (default, Sep  7 2019, 18:27:02)&lt;/code&gt;\n&lt;code&gt; &gt;&gt;&gt; &lt;strong&gt;import requests&lt;/strong&gt;&lt;/code&gt;\n&lt;code&gt; &gt;&gt;&gt; r = requests.get(‘<a href="https://api.github.com/repos/psf/requests" rel="nofollow">https://api.github.com/repos/psf/requests</a>’)&lt;/code&gt;\n&lt;code&gt; &gt;&gt;&gt; r.json()[“description”]&lt;/code&gt;\n&lt;code&gt; ‘A simple, yet elegant HTTP library.’&lt;/code&gt;\n&lt;/div&gt;</p>\n<dl>\n<dt>&lt;p&gt;</dt>\n<dd>This software has been designed for you, with much joy,\nby &lt;a href=”<a href="https://kennethreitz.org/" rel="nofollow">https://kennethreitz.org/</a>”&gt;Kenneth Reitz&lt;/a&gt; &amp;\nis protected by The &lt;a href=”<a href="https://www.python.org/psf/" rel="nofollow">https://www.python.org/psf/</a>”&gt;Python Software Foundation&lt;/a&gt;.</dd>\n</dl>\n<p>&lt;/p&gt;</p>\n</dd>\n</dl>\n<p>&lt;/pre&gt;</p>\n<p>&lt;/span&gt;</p>\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;</p>\n<p>&lt;p align=”center”&gt;&lt;strong&gt;Requests&lt;/strong&gt; is an elegant and simple HTTP library for Python, built with ♥.&lt;/p&gt;</p>\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;</p>\n<p><tt>`python\n&gt;&gt;&gt; import requests\n&gt;&gt;&gt; r = <span class="pre">requests.get(\'https://api.github.com/user\',</span> <span class="pre">auth=(\'user\',</span> <span class="pre">\'pass\'))</span>\n&gt;&gt;&gt; r.status_code\n200\n&gt;&gt;&gt; <span class="pre">r.headers[\'content-type\']</span>\n\'application/json; 
charset=utf8\'\n&gt;&gt;&gt; r.encoding\n<span class="pre">\'utf-8\'</span>\n&gt;&gt;&gt; r.text\n<span class="pre">\'{&quot;type&quot;:&quot;User&quot;...\'</span>\n&gt;&gt;&gt; r.json()\n{\'disk_usage\': 368627, \'private_gists\': 484, <span class="pre">...}</span>\n`</tt></p>\n<hr class="docutils">\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;</p>\n<p>Requests allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs, or to form-encode your <cite>PUT</cite> &amp; <cite>POST</cite> data — but nowadays, just use the <cite>json</cite> method!</p>\n<p>Requests is one of the most downloaded Python package today, pulling in around <cite>14M downloads / week</cite>— according to GitHub, Requests is currently [depended upon](<a href="https://github.com/psf/requests/network/dependents?package_id=UGFja2FnZS01NzA4OTExNg%3D%3D" rel="nofollow">https://github.com/psf/requests/network/dependents?package_id=UGFja2FnZS01NzA4OTExNg%3D%3D</a>) by <cite>500,000+</cite> repositories. 
You may certainly put your trust in this code.</p>\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;\n&lt;p align=”center”&gt;&lt;a href=”<a href="https://pepy.tech/project/requests" rel="nofollow">https://pepy.tech/project/requests</a>” rel=”nofollow”&gt;&lt;img src=”<a href="https://camo.githubusercontent.com/e1dedc9f5ce5cd6b6c699f33d2e812daadcf3645/68747470733a2f2f706570792e746563682f62616467652f7265717565737473" rel="nofollow">https://camo.githubusercontent.com/e1dedc9f5ce5cd6b6c699f33d2e812daadcf3645/68747470733a2f2f706570792e746563682f62616467652f7265717565737473</a>” alt=”Downloads” data-canonical-src=”<a href="https://pepy.tech/badge/requests" rel="nofollow">https://pepy.tech/badge/requests</a>” style=”max-width:100%;”&gt;&lt;/a&gt;\n&lt;a href=”<a href="https://pypi.org/project/requests/" rel="nofollow">https://pypi.org/project/requests/</a>” rel=”nofollow”&gt;&lt;img src=”<a href="https://camo.githubusercontent.com/6d78aeec0a9a1cfe147ad064bfb99069e298e29b/68747470733a2f2f696d672e736869656c64732e696f2f707970692f707976657273696f6e732f72657175657374732e737667" rel="nofollow">https://camo.githubusercontent.com/6d78aeec0a9a1cfe147ad064bfb99069e298e29b/68747470733a2f2f696d672e736869656c64732e696f2f707970692f707976657273696f6e732f72657175657374732e737667</a>” alt=”image” data-canonical-src=”<a href="https://img.shields.io/pypi/pyversions/requests.svg" rel="nofollow">https://img.shields.io/pypi/pyversions/requests.svg</a>” style=”max-width:100%;”&gt;&lt;/a&gt;\n&lt;a href=”<a href="https://github.com/psf/requests/graphs/contributors" rel="nofollow">https://github.com/psf/requests/graphs/contributors</a>”&gt;&lt;img src=”<a href="https://camo.githubusercontent.com/a70ea15870b38bba9203b969f6a6b7e7845fbb8a/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f636f6e7472696275746f72732f7073662f72657175657374732e737667" 
rel="nofollow">https://camo.githubusercontent.com/a70ea15870b38bba9203b969f6a6b7e7845fbb8a/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f636f6e7472696275746f72732f7073662f72657175657374732e737667</a>” alt=”image” data-canonical-src=”<a href="https://img.shields.io/github/contributors/psf/requests.svg" rel="nofollow">https://img.shields.io/github/contributors/psf/requests.svg</a>” style=”max-width:100%;”&gt;&lt;/a&gt;&lt;/p&gt;</p>\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;</p>\n<p>&lt;h2 align=”center”&gt;Supported Features &amp; Best–Practices&lt;/h2&gt;</p>\n<p>Requests is ready for the demands of building robust and reliable HTTP–speak applications, for the needs of today.</p>\n<dl>\n<dt>&lt;pre class=”test”&gt;</dt>\n<dd><ul>\n<li><p>International Domains and URLs       + Keep-Alive &amp; Connection Pooling</p>\n</li>\n<li><p>Sessions with Cookie Persistence     + Browser-style SSL Verification</p>\n</li>\n<li><p>Basic &amp; Digest Authentication        + Familiar <cite>dict</cite>–like Cookies</p>\n</li>\n<li><p>Automatic Decompression of Content   + Automatic Content Decoding</p>\n</li>\n<li><p>Automatic Connection Pooling         + Unicode Response Bodies&lt;super&gt;*&lt;/super&gt;</p>\n</li>\n<li><p>Multi-part File Uploads              + SOCKS Proxy Support</p>\n</li>\n<li><p>Connection Timeouts                  + Streaming Downloads</p>\n</li>\n<li><p>Automatic honoring of <cite>.netrc</cite>       + Chunked HTTP Requests</p>\n<blockquote>\n<p>&amp;, of course, rock–solid stability!</p>\n</blockquote>\n</li>\n</ul>\n</dd>\n</dl>\n<p>&lt;/pre&gt;</p>\n<p>&lt;/div&gt;</p>\n<dl>\n<dt>&lt;p align=”center”&gt;</dt>\n<dd>✨ 🍰 ✨&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;</dd>\n</dl>\n<p>&lt;/p&gt;</p>\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;</p>\n<div id="requests-module-installation">\n<h2>Requests Module Installation</h2>\n<p>The recommended way to install the <cite>requests</cite> 
module is to simply use [<cite>pipenv</cite>](<a href="https://pipenv.kennethreitz.org" rel="nofollow">https://pipenv.kennethreitz.org</a>) (or <cite>pip</cite>, of\ncourse):</p>\n<p><tt>`console\n$ pipenv install requests\nAdding requests to Pipfile\'s [packages]…\n✔ Installation Succeeded\n…\n`</tt></p>\n<p>Requests officially supports Python 2.7 &amp; 3.5+.</p>\n<hr class="docutils">\n<p>## P.S. —&nbsp;Documentation is available at [<cite>requests.readthedocs.io</cite>](<a href="https://requests.readthedocs.io/en/latest/" rel="nofollow">https://requests.readthedocs.io/en/latest/</a>).</p>\n<dl>\n<dt>&lt;p align=”center”&gt;</dt>\n<dd>&lt;a href=”<a href="https://requests.readthedocs.io/" rel="nofollow">https://requests.readthedocs.io/</a>”&gt;&lt;img src=”<a href="https://raw.githubusercontent.com/psf/requests/master/ext/ss.png" rel="nofollow">https://raw.githubusercontent.com/psf/requests/master/ext/ss.png</a>” align=”center” /&gt;&lt;/a&gt;</dd>\n</dl>\n<p>&lt;/p&gt;</p>\n<hr class="docutils">\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;</p>\n<dl>\n<dt>&lt;p align=”center”&gt;</dt>\n<dd>&lt;a href=”<a href="https://kennethreitz.org/" rel="nofollow">https://kennethreitz.org/</a>”&gt;&lt;img src=”<a href="https://raw.githubusercontent.com/psf/requests/master/ext/kr.png" rel="nofollow">https://raw.githubusercontent.com/psf/requests/master/ext/kr.png</a>” align=”center” /&gt;&lt;/a&gt;</dd>\n</dl>\n<p>&lt;/p&gt;</p>\n<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;</p>\n<dl>\n<dt>&lt;p align=”center”&gt;</dt>\n<dd>&lt;a href=”<a href="https://www.python.org/psf/" rel="nofollow">https://www.python.org/psf/</a>”&gt;&lt;img src=”<a href="https://raw.githubusercontent.com/psf/requests/master/ext/psf.png" rel="nofollow">https://raw.githubusercontent.com/psf/requests/master/ext/psf.png</a>” align=”center” /&gt;&lt;/a&gt;</dd>\n</dl>\n<p>&lt;/p&gt;</p>\n</div>\n'

(If it had failed to render, it would have returned None.)

This blows my mind because I have tried to write valid reStructuredText so many times and failed, and yet this somehow passes.

Since the render passes, PyPI assumes this is what you wanted, and this is what it renders.

So basically, each piece here is independently operating exactly as it should, but together they are producing this seemingly strange behavior.

I'd recommend updating the version of wheel in whatever environment you build releases in. Additionally, you can opt into PEP 517/518 with pyproject.toml and specify the version of wheel:

[build-system]
requires = ["setuptools", "wheel>=0.31.0"]
build-backend = "setuptools.build_meta"

or add this to your setup invocation in setup.py:

    setup_requires=["wheel>=0.31.0"],

to ensure this doesn't happen again.
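As an extra safeguard, a release script could inspect the built wheel before uploading it. The sketch below is only an illustration: `generator_version` and `built_with_recent_wheel` are hypothetical helpers, and they assume the Generator line has the `bdist_wheel (X.Y.Z)` form shown in the WHEEL file above. Any other form is treated as unknown and fails the check.

```python
import re
import zipfile

# Minimum wheel version with Description-Content-Type support, per the thread.
MIN_WHEEL = (0, 31, 0)


def generator_version(wheel_path):
    """Return the bdist_wheel version recorded in the archive's WHEEL file."""
    with zipfile.ZipFile(wheel_path) as archive:
        name = next(n for n in archive.namelist() if n.endswith(".dist-info/WHEEL"))
        text = archive.read(name).decode("utf-8")
    match = re.search(r"^Generator: bdist_wheel \(([\d.]+)\)", text, re.MULTILINE)
    if match is None:
        return None  # Generator line missing or in an unexpected format
    return tuple(int(part) for part in match.group(1).split("."))


def built_with_recent_wheel(wheel_path):
    """True only if the recorded generator version is known and new enough."""
    version = generator_version(wheel_path)
    return version is not None and version >= MIN_WHEEL
```

A release script might call `built_with_recent_wheel("dist/requests-2.24.0-py2.py3-none-any.whl")` and abort the upload when it returns False.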

nateprewitt commented 4 years ago

Thanks for taking the time to do such a thorough rundown. I'm also amazed at the number of things that had to be in place for this to happen. I took a look and, sure enough, our lock file and release script don't specify a wheel version. I'll get the safeguards set up so we don't hit this again. Thanks!