pypa / twine

Utilities for interacting with PyPI
https://twine.readthedocs.io/
Apache License 2.0

README on PyPI rendered incorrectly if upload includes a wheel #227

Closed tjanez closed 7 years ago

tjanez commented 7 years ago

Steps to reproduce the problem

I've reproduced this with the Resolwe project:

git clone https://github.com/genialis/resolwe.git
cd resolwe
mkvirtualenv resolwe
pip install --process-dependency-links -e .[docs,package,test]

For each of the variants, I performed the following common steps:

# bump version in resolwe/__about__.py
python setup.py clean -a
rm dist/*
rm -r *.egg-info

Then I've tested the following 4 variants:

Variant 1: Make sdist and wheel and upload with twine

python setup.py sdist
python setup.py bdist_wheel
twine upload -r testpypi dist/*

Result is here -> README not rendered correctly.

Variant 2: Make sdist and wheel and upload with setup.py upload

python setup.py sdist bdist_wheel upload -r testpypi

Result is here -> README rendered correctly.

Variant 3: Make sdist and upload with twine

python setup.py sdist
twine upload -r testpypi dist/*

Result is here -> README rendered correctly.

Variant 4: Make wheel and upload with twine

python setup.py bdist_wheel
twine upload -r testpypi dist/*

Result is here -> README not rendered correctly.

Summary

It appears that README is rendered incorrectly if twine upload includes a wheel.

Additional notes

The same problem also occurred when I uploaded the Resolwe 1.4.0 release to the real PyPI using Variant 1 described above. Note that the README is also incorrectly rendered on the new Warehouse-based PyPI frontend.

System information:

Fedora 24 with Python 3.5.2 and twine 1.8.1.

All contents of the virtualenv:

$ pip list --format=columns
Package                     Version    Location                    
--------------------------- ---------- ----------------------------
alabaster                   0.7.9      
appdirs                     1.4.0      
args                        0.1.0      
astroid                     1.4.9      
Babel                       2.3.4      
bleach                      1.5.0      
check-manifest              0.34       
clint                       0.5.1      
coverage                    4.3.4      
Django                      1.10.5     
django-autoslug             1.9.4.dev0 
django-filter               0.15.3     
django-guardian             1.4.6      
django-mathfilters          0.4.0      
django-versionfield2        0.5.0      
djangorestframework         3.5.3      
djangorestframework-filters 0.9.1      
docutils                    0.13.1     
elasticsearch               2.4.1      
elasticsearch-dsl           2.2.0      
html5lib                    0.9999999  
imagesize                   0.7.1      
isort                       4.2.5      
Jinja2                      2.9.4      
jsonschema                  2.5.1      
lazy-object-proxy           1.2.2      
MarkupSafe                  0.23       
mccabe                      0.6.0      
mock                        2.0.0      
packaging                   16.8       
pbr                         1.10.0     
pip                         9.0.1      
pkginfo                     1.4.1      
psycopg2                    2.6.2      
pycodestyle                 2.2.0      
pydocstyle                  1.1.1      
Pygments                    2.2.0      
pylint                      1.6.5      
pyparsing                   2.1.10     
python-dateutil             2.6.0      
pytz                        2016.10    
PyYAML                      3.12       
readme-renderer             16.0       
requests                    2.13.0     
requests-toolbelt           0.7.0      
resolwe                     1.4.0.4    /home/tadej/Genialis/resolwe
resolwe-runtime-utils       1.1.0      
setuptools                  34.0.2     
six                         1.10.0     
snowballstemmer             1.2.1      
Sphinx                      1.5.2      
sphinx-rtd-theme            0.1.9      
testfixtures                4.13.3     
twine                       1.8.1      
urllib3                     1.20       
wheel                       0.30.0a0   
wrapt                       1.10.8

sigmavirus24 commented 7 years ago

Several versions ago, Twine made a conscious decision to always upload wheels first. PyPI/Warehouse can extract significantly more metadata from a package when we do that. Because of this, it's highly unlikely we'll ever stop uploading wheels first.

The bizarre thing is that your READMEs are different based on sdist output and bdist_wheel output. I've never seen one differ from the other, so I have no clue what's causing that to happen. The difference in output is what I would consider to be the real issue here. Not the order in which twine uploads packages.
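
One way to check for this kind of difference is to pull the metadata file out of each archive and compare the two. The following is a minimal sketch using only the standard library. The function names are made up for illustration, and it assumes a .tar.gz sdist with a top-level PKG-INFO and a wheel with a *.dist-info/METADATA entry.

```python
import tarfile
import zipfile


def read_sdist_metadata(path):
    """Return the text of the top-level PKG-INFO in a .tar.gz sdist."""
    with tarfile.open(path, "r:gz") as archive:
        for member in archive.getmembers():
            # The top-level file is stored as <name>-<version>/PKG-INFO.
            if member.name.count("/") == 1 and member.name.endswith("/PKG-INFO"):
                return archive.extractfile(member).read().decode("utf-8")
    raise FileNotFoundError("PKG-INFO not found in " + path)


def read_wheel_metadata(path):
    """Return the text of the *.dist-info/METADATA file in a wheel."""
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.endswith(".dist-info/METADATA"):
                return archive.read(name).decode("utf-8")
    raise FileNotFoundError("METADATA not found in " + path)
```

Printing both results side by side (or feeding them to difflib) should make any difference in the long description visible.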

dstufft commented 7 years ago

Unless twine is somehow reading the Wheel metadata wrong... but I can't think of how we would be, particularly since a lot of projects are uploading just fine. However, looking at the GitHub repository for Resolwe, I also can't see how they would be getting a different long description for sdist and wheel.

sigmavirus24 commented 7 years ago

I was just downloading the package files to check. =)

sigmavirus24 commented 7 years ago

The READMEs are different in the two different archives.

In the sdist, it's like this:

Metadata-Version: 1.1
Name: resolwe
Version: 1.4.0
Summary: Open source enterprise dataflow engine in Django
Home-page: https://github.com/genialis/resolwe
Author: Genialis d.o.o.
Author-email: dev-team@genialis.com
License: Apache License (2.0)
Description: =======
        Resolwe
        =======

        |build| |coverage| |docs| |pypi_version| |pypi_pyversions|

        .. |build| image:: https://travis-ci.org/genialis/resolwe.svg?branch=master
            :target: https://travis-ci.org/genialis/resolwe
            :alt: Build Status

        .. |coverage| image:: https://img.shields.io/codecov/c/github/genialis/resolwe/master.svg
            :target: http://codecov.io/github/genialis/resolwe?branch=master
            :alt: Coverage Status

        .. |docs| image:: https://readthedocs.org/projects/resolwe/badge/?version=latest
            :target: http://resolwe.readthedocs.io/
            :alt: Documentation Status

        .. |pypi_version| image:: https://img.shields.io/pypi/v/resolwe.svg
            :target: https://pypi.python.org/pypi/resolwe
            :alt: Version on PyPI

        .. |pypi_pyversions| image:: https://img.shields.io/pypi/pyversions/resolwe.svg
            :target: https://pypi.python.org/pypi/resolwe
            :alt: Supported Python versions

        .. |pypi_downloads| image:: https://img.shields.io/pypi/dm/resolwe.svg
            :target: https://pypi.python.org/pypi/resolwe
            :alt: Number of downloads from PyPI

        Resolwe is an open source dataflow package for `Django framework`_. We envision
        Resolwe to follow the `Common Workflow Language`_ specification, but the
        current implementation does not yet fully support it. Resolwe offers a complete
        RESTful API to connect with external resources. A collection of bioinformatics
        pipelines is available in `Resolwe Bioinformatics`_.

        .. _Django framework: https://www.djangoproject.com/
        .. _Common Workflow Language: https://github.com/common-workflow-language/common-workflow-language
        .. _Resolwe Bioinformatics: https://github.com/genialis/resolwe-bio

        Docs & Help
        ===========

        Read about architecture, getting started, how to write `processes`, RESTful API
        details, and API Reference in the documentation_.

        To chat with developers or ask for help, join us on Slack_.

        .. _documentation: http://resolwe.readthedocs.io/
        .. _Slack: http://resolwe.slack.com/

        Install
        =======

        Prerequisites
        -------------

        Make sure you have Python_ (2.7 or 3.4+) installed on your system. If you don't
        have it yet, follow `these instructions
        <https://docs.python.org/3/using/index.html>`__.

        Resolwe requires PostgreSQL_ (9.4+). Many Linux distributions already include
        the required version of PostgreSQL (e.g. Fedora 22+, Debian 8+, Ubuntu 15.04+)
        and you can simply install it via distribution's package manager.
        Otherwise, follow `these instructions
        <https://wiki.postgresql.org/wiki/Detailed_installation_guides>`__.

        Additionally, installing the ``psycopg2`` dependency from PyPI_ will require
        having a C compiler (e.g. GCC_) as well as Python and PostgreSQL development
        files installed on the system.

        Note
        ^^^^

        The preferred way to install the C compiler and Python and PostgreSQL
        development files is to use your distribution's packages, if they exist. For
        example, on a Fedora/RHEL-based system, that would mean installing ``gcc``,
        ``python-devel``/``python3-devel`` and ``postgresql-devel`` packages.

        .. _Python: https://www.python.org/
        .. _PostgreSQL: http://www.postgresql.org/
        .. _PyPi: https://pypi.python.org/
        .. _GCC: https://gcc.gnu.org/

        From PyPI_
        ----------

        .. code::

            pip install --process-dependency-links resolwe

        From source
        -----------

        .. code::

           pip install --process-dependency-links https://github.com/genialis/resolwe/archive/<git-tree-ish>.tar.gz

        where ``<git-tree-ish>`` can represent any commit SHA, branch name, tag name,
        etc. in `Resolwe's GitHub repository`_. For example, to install the latest
        Resolwe from the ``master`` branch, use:

        .. code::

           pip install --process-dependency-links https://github.com/genialis/resolwe/archive/master.tar.gz

        .. _`Resolwe's GitHub repository`: https://github.com/genialis/resolwe/

        Contribute
        ==========

        We welcome new contributors. To learn more, read Contributing_ section of our
        documentation.

        .. _Contributing: http://resolwe.readthedocs.io/en/latest/contributing.html

Keywords: resolwe dataflow django
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Web Environment
Classifier: Framework :: Django
Classifier: Intended Audience :: Developers
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Internet :: WWW/HTTP :: WSGI
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5

In the wheel it's this:

Metadata-Version: 2.0
Name: resolwe
Version: 1.4.0
Summary: Open source enterprise dataflow engine in Django
Home-page: https://github.com/genialis/resolwe
Author: Genialis d.o.o.
Author-email: dev-team@genialis.com
License: Apache License (2.0)
Keywords: resolwe dataflow django
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Web Environment
Classifier: Framework :: Django
Classifier: Intended Audience :: Developers
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Internet :: WWW/HTTP :: WSGI
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Requires-Dist: Django (~=1.10.5)
Requires-Dist: djangorestframework (>=3.4.0)
Requires-Dist: djangorestframework-filters (>=0.9.1)
Requires-Dist: django-autoslug (==1.9.4-dev)
Requires-Dist: django-guardian (>=1.4.2)
Requires-Dist: django-mathfilters (>=0.3.0)
Requires-Dist: django-versionfield2 (>=0.5.0)
Requires-Dist: elasticsearch-dsl (~=2.2.0)
Requires-Dist: psycopg2 (>=2.5.0)
Requires-Dist: mock (>=1.3.0)
Requires-Dist: PyYAML (>=3.11)
Requires-Dist: jsonschema (>=2.4.0)
Requires-Dist: six (>=1.10.0)
Requires-Dist: Sphinx (>=1.5.1)
Requires-Dist: Jinja2 (>=2.8)
Requires-Dist: shutilwhich; python_version == "2.7"
Provides-Extra: docs
Requires-Dist: sphinx-rtd-theme; extra == 'docs'
Provides-Extra: package
Requires-Dist: twine; extra == 'package'
Requires-Dist: wheel; extra == 'package'
Provides-Extra: test
Requires-Dist: check-manifest; extra == 'test'
Requires-Dist: coverage (>=4.2); extra == 'test'
Requires-Dist: pycodestyle (>=2.1.0); extra == 'test'
Requires-Dist: pydocstyle (>=1.0.0); extra == 'test'
Requires-Dist: pylint (>=1.6.4); extra == 'test'
Requires-Dist: readme-renderer; extra == 'test'
Requires-Dist: resolwe-runtime-utils (>=1.1.0); extra == 'test'
Requires-Dist: testfixtures (>=4.10.0); extra == 'test'

=======
Resolwe
=======

|build| |coverage| |docs| |pypi_version| |pypi_pyversions|

.. |build| image:: https://travis-ci.org/genialis/resolwe.svg?branch=master
    :target: https://travis-ci.org/genialis/resolwe
    :alt: Build Status

.. |coverage| image:: https://img.shields.io/codecov/c/github/genialis/resolwe/master.svg
    :target: http://codecov.io/github/genialis/resolwe?branch=master
    :alt: Coverage Status

.. |docs| image:: https://readthedocs.org/projects/resolwe/badge/?version=latest
    :target: http://resolwe.readthedocs.io/
    :alt: Documentation Status

.. |pypi_version| image:: https://img.shields.io/pypi/v/resolwe.svg
    :target: https://pypi.python.org/pypi/resolwe
    :alt: Version on PyPI

.. |pypi_pyversions| image:: https://img.shields.io/pypi/pyversions/resolwe.svg
    :target: https://pypi.python.org/pypi/resolwe
    :alt: Supported Python versions

.. |pypi_downloads| image:: https://img.shields.io/pypi/dm/resolwe.svg
    :target: https://pypi.python.org/pypi/resolwe
    :alt: Number of downloads from PyPI

Resolwe is an open source dataflow package for `Django framework`_. We envision
Resolwe to follow the `Common Workflow Language`_ specification, but the
current implementation does not yet fully support it. Resolwe offers a complete
RESTful API to connect with external resources. A collection of bioinformatics
pipelines is available in `Resolwe Bioinformatics`_.

.. _Django framework: https://www.djangoproject.com/
.. _Common Workflow Language: https://github.com/common-workflow-language/common-workflow-language
.. _Resolwe Bioinformatics: https://github.com/genialis/resolwe-bio

Docs & Help
===========

Read about architecture, getting started, how to write `processes`, RESTful API
details, and API Reference in the documentation_.

To chat with developers or ask for help, join us on Slack_.

.. _documentation: http://resolwe.readthedocs.io/
.. _Slack: http://resolwe.slack.com/

Install
=======

Prerequisites
-------------

Make sure you have Python_ (2.7 or 3.4+) installed on your system. If you don't
have it yet, follow `these instructions
<https://docs.python.org/3/using/index.html>`__.

Resolwe requires PostgreSQL_ (9.4+). Many Linux distributions already include
the required version of PostgreSQL (e.g. Fedora 22+, Debian 8+, Ubuntu 15.04+)
and you can simply install it via distribution's package manager.
Otherwise, follow `these instructions
<https://wiki.postgresql.org/wiki/Detailed_installation_guides>`__.

Additionally, installing the ``psycopg2`` dependency from PyPI_ will require
having a C compiler (e.g. GCC_) as well as Python and PostgreSQL development
files installed on the system.

Note
^^^^

The preferred way to install the C compiler and Python and PostgreSQL
development files is to use your distribution's packages, if they exist. For
example, on a Fedora/RHEL-based system, that would mean installing ``gcc``,
``python-devel``/``python3-devel`` and ``postgresql-devel`` packages.

.. _Python: https://www.python.org/
.. _PostgreSQL: http://www.postgresql.org/
.. _PyPi: https://pypi.python.org/
.. _GCC: https://gcc.gnu.org/

>From PyPI_
----------

.. code::

    pip install --process-dependency-links resolwe

>From source
-----------

.. code::

   pip install --process-dependency-links https://github.com/genialis/resolwe/archive/<git-tree-ish>.tar.gz

where ``<git-tree-ish>`` can represent any commit SHA, branch name, tag name,
etc. in `Resolwe's GitHub repository`_. For example, to install the latest
Resolwe from the ``master`` branch, use:

.. code::

   pip install --process-dependency-links https://github.com/genialis/resolwe/archive/master.tar.gz

.. _`Resolwe's GitHub repository`: https://github.com/genialis/resolwe/

Contribute
==========

We welcome new contributors. To learn more, read Contributing_ section of our
documentation.

.. _Contributing: http://resolwe.readthedocs.io/en/latest/contributing.html

Note the differences in two of the headers that have > in front of them. I suspect that's the problem.

sigmavirus24 commented 7 years ago

So, again, this is not Twine's fault. We don't generate or modify these archives. I suspect this was a problem in generating the archives on your end. Cheers!

jamadden commented 7 years ago

> Note the differences in two of the headers that have > in front of them. I suspect that's the problem.

Interesting. Replacing a line that begins with "From" with ">From" is an old trick used for unix mbox files to escape what otherwise looks like headers. So something in the wheel stack must be going through code that does that escaping?
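
A minimal sketch of that escaping, using Python's standard email package. That the wheel stack goes through this code path is only a guess at this point in the thread, and the message contents are made up for illustration.

```python
import io
from email.generator import Generator
from email.message import Message

# Hypothetical metadata: one header plus a body with a line starting "From ".
msg = Message()
msg["Name"] = "example"
msg.set_payload("Install\n=======\n\nFrom PyPI_\n----------\n")

out = io.StringIO()
# mangle_from_=True asks for the mbox-style escaping explicitly.
Generator(out, mangle_from_=True).flatten(msg)
serialized = out.getvalue()
print(serialized)
```

The "From PyPI_" line in the body should come out as ">From PyPI_", which is the same change seen in the wheel's METADATA.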

jamadden commented 7 years ago

Yeah, it looks like METADATA in a wheel goes through the email parser for some reason...

sigmavirus24 commented 7 years ago

@jamadden yeah, the PKG-INFO and METADATA files are actually valid Email headers. I suspect it doesn't happen to the sdist because Metadata 1.1 has the contents of the long-description indented while wheel doesn't.
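
To illustrate the difference in layout, here is a small sketch that parses two made-up metadata snippets with the standard email parser. The snippets are shortened stand-ins for the real PKG-INFO and METADATA files, not copies of them.

```python
from email.parser import Parser

# sdist style (Metadata 1.1): the long description is a header value,
# with continuation lines indented by eight spaces.
SDIST_PKG_INFO = (
    "Metadata-Version: 1.1\n"
    "Name: example\n"
    "Description: Install\n"
    "        =======\n"
    "        \n"
    "        From PyPI_\n"
    "        ----------\n"
)

# wheel style (Metadata 2.0): the long description is the message body,
# unindented, and here already escaped by whatever wrote the file.
WHEEL_METADATA = (
    "Metadata-Version: 2.0\n"
    "Name: example\n"
    "\n"
    "Install\n"
    "=======\n"
    "\n"
    ">From PyPI_\n"
    "----------\n"
)

sdist_description = Parser().parsestr(SDIST_PKG_INFO)["Description"]
wheel_description = Parser().parsestr(WHEEL_METADATA).get_payload()

print(repr(sdist_description))
print(repr(wheel_description))
```

The parser returns the body as-is, so the ">" stays in the wheel description, while the indented header value never starts a line with "From " in the first place.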

jamadden commented 7 years ago

@sigmavirus24 So does that suggest that PyPI isn't reading the wheel metadata correctly, i.e., isn't unescaping?

dstufft commented 7 years ago

Presumably it's an issue with the pkginfo library.

sigmavirus24 commented 7 years ago

I agree with @dstufft. I'll reopen this as an action item to look into next week or the week after.

sigmavirus24 commented 7 years ago

> In the case of a wheel, pkginfo runs the text through the email parser twice, but it only appears to do so once for sdists.

I don't see how that's relevant. They both independently open the file, read it, and then have the email library parse it.

jamadden commented 7 years ago

In the case of a wheel, the description comes directly from the email payload, but for sdists it comes directly from the header value. And headers aren't escaped like this. So there is a definite difference.

The short answer seems to be that when the wheel metadata is generated, the generator needs to be instantiated with mangle_from_=False (the default is True). That's done by the wheel library, not the pkginfo library. Mangled Froms aren't part of RFC 822 (or even RFC 2046 MIME or RFC 3676 text/plain, and email.parser parses RFC 2046 messages); they're an extension for Unix mbox files.
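
A rough sketch of what that change amounts to, written as a round trip through the standard email package. The helper names and the sample description are hypothetical, and this is not the wheel library's actual code.

```python
import io
from email.generator import Generator
from email.message import Message
from email.parser import Parser

DESCRIPTION = "Install\n=======\n\nFrom PyPI_\n----------\n"


def write_metadata(description, mangle_from):
    """Serialize headers plus a description body, METADATA-style."""
    msg = Message()
    msg["Metadata-Version"] = "2.0"
    msg["Name"] = "example"
    msg.set_payload(description)
    out = io.StringIO()
    Generator(out, mangle_from_=mangle_from, maxheaderlen=0).flatten(msg)
    return out.getvalue()


def read_description(metadata_text):
    """Read the description back the way a body-based reader would."""
    return Parser().parsestr(metadata_text).get_payload()


fixed = read_description(write_metadata(DESCRIPTION, mangle_from=False))
broken = read_description(write_metadata(DESCRIPTION, mangle_from=True))
```

With mangling disabled the description should survive the round trip unchanged; with it enabled, the reader gets ">From PyPI_" back because nothing on the parsing side undoes the escaping.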

jamadden commented 7 years ago

> In the case of a wheel, pkginfo runs the text through the email parser twice, but it only appears to do so once for sdists.

> I don't see how that's relevant. They both independently open the file, read it, and then have the email library parse it.

My bad, I accidentally hit 'comment' on a work in progress and immediately deleted it.

sigmavirus24 commented 7 years ago

@jamadden seems like you have a better handle on this than I do at this point. Care to open a bug report against wheel?

jamadden commented 7 years ago

I can, I just want to be sure I'm in the right place. Is bitbucket really still the canonical repository for wheel?

dstufft commented 7 years ago

Yes it is.

jamadden commented 7 years ago

https://bitbucket.org/pypa/wheel/issues/178/wheelpkginfowrite_pkg_info-should-set

agronholm commented 7 years ago

I've fixed this in wheel now.

blipk commented 2 years ago

I'm having this problem. I'm generating a README from a template in my setup.py before setup(). The generated one is included fine in the source dist, but the .whl only includes the template.

sigmavirus24 commented 2 years ago

@blipk everything in Python is moving away from dynamic packaging like you describe. You're better served abandoning that or making it a pre-build step.
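
As a rough illustration of the pre-build-step idea, the README could be rendered by a small standalone script that runs before any build command. The file names (README.rst.in, README.rst), the function name, and the use of string.Template are all assumptions; the actual template mechanism in that project isn't shown in this thread.

```python
from pathlib import Path
from string import Template


def render_readme(template_path, output_path, **values):
    """Fill $placeholders in the template and write the final README."""
    template = Template(Path(template_path).read_text(encoding="utf-8"))
    text = template.substitute(values)
    Path(output_path).write_text(text, encoding="utf-8")
    return text


# Example call, run before building the sdist and wheel:
# render_readme("README.rst.in", "README.rst", version="1.0.0")
```

Because the rendered file exists on disk before the build starts, the sdist and the wheel would both read the same README.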