pypa / packaging-problems

An issue tracker for the problems in packaging
152 stars 35 forks source link

Ambiguity with 'Keywords' metadata field #186

Open di opened 6 years ago

di commented 6 years ago

The current specification for the Keywords field says:

A list of additional keywords to be used to assist searching for the distribution in a larger catalog.

Over at pypa/sampleproject, we say something similar:

# This field adds keywords for your project which will appear on the
# project page. What does your project relate to?
#
# Note that this is a string of words separated by whitespace, not a list.
keywords='sample setuptools development',  # Optional

Setuptools recommends both space-separated and comma separated values for this field:

[metadata]
name = my_package
version = attr: src.VERSION
description = My package description
long_description = file: README.rst, CHANGELOG.rst, LICENSE.rst
keywords = one, two

However, as of 3.7 distutils now makes the following distinction:

Changed in version 3.7: setup now raises a TypeError if classifiers, keywords and platforms fields are not specified as a list.

Given this, it may be a good idea for us to update the specification to allow for and prefer a list for this field. Additionally, we could introduce a Keyword (singular) metadata field, which would reproduce the behavior of other multi-use fields, though this might be overkill.

(cc @ncoghlan)

pfmoore commented 6 years ago

In principle, I'm in favour of making this a list. But there's a lot of projects out there which specify space-separated strings (pip, for one!) so we're going to have to continue to accept space-delimited strings for a long time yet.

Flit defines this field as a space-separated string - here. I haven't checked if it validates that.

dstufft commented 6 years ago

If Python 3.7 only accepts a list, we need to either fix that or we should start raising warnings everywhere.

njsmith commented 6 years ago

I guess there are two separate issues:

The spec currently seems pretty unambiguous actually... Given the example, it must be a space-separated list, and keywords cannot contain embedded spaces.

Given this, it's not clear to me that it matters what distutils does. The docs say that before 3.7, a string input was split on spaces, and then I guess joined on spaces to put in the field. After 3.7, a string input is treated as a list of a single item, which I guess is then put in the field directly... Which I think means it produces exactly the same metadata file at the end of the day?

pfmoore commented 6 years ago

The "what's new" comment that @di quoted implies that

setup(
    ...
    keywords = "a b c",
    ...
)

will fail under Python 3.7. I just did a test, creating a directory with just a setup.py file,

from distutils.core import setup

setup(
    name="test",
    version="0.1",
    keywords="a b c"
)

py -3.7 setup.py sdist reported the following:

running sdist
running check
warning: check: missing required meta-data: url

warning: check: missing meta-data: either (author and author_email) or (maintainer and maintainer_email) must be supplied

warning: sdist: manifest template 'MANIFEST.in' does not exist (using default file list)

warning: sdist: standard file not found: should have one of README, README.txt, README.rst

writing manifest file 'MANIFEST'
creating test-0.1
making hard links in test-0.1...
hard linking setup.py -> test-0.1
creating dist
Creating tar archive
removing 'test-0.1' (and everything under it)

So it actually looks like that comment in the Python docs is wrong :-(

di commented 6 years ago

Heh, I didn't even think to test it!

I suppose "correcting" the docs would be the easiest solution to this ambiguity, but we should probably make sure that this isn't a planned change.

pfmoore commented 6 years ago

Having said this, the spec isn't explicit about this, but it's clearly only talking about how the metadata is recorded in the METADATA files. It says nothing about how programs like distutils, setuptools or flit might choose to require users to input those values. And it's arguable that it's none of the standard's business whether distutils chooses to require a space-separated string, or a list (which it joins with spaces when writing the metadata).

So if that change had been made to distutils, it would be pretty lousy from a backward compatibility and user friendliness perspective, but not a violation of the spec. As it is, it looks like it may just be some sort of documentation mess :-(

TBH, I still think it's pretty poor that a change like this could be documented (much less implemented!) without a pointer to a justifying discussion :-(

di commented 6 years ago

Looks like this just hasn't made it into your 3.7 version yet @pfmoore: https://github.com/python/cpython/commit/dcaed6b2d954786eb5369ec2e8dfdeefe3cdc6ae

pfmoore commented 6 years ago

Looking at the code:

def set_keywords(self, value):
        # If 'keywords' is a string, it will be converted to a list
        # by Distribution.finalize_options(). To maintain backwards
        # compatibility, do not raise an exception if 'keywords' is
        # a string.
        if not isinstance(value, (list, str)):
            msg = "'keywords' should be a 'list', not %r"
            raise TypeError(msg % type(value).__name__)
        self.keywords = value

it looks like they still accept a string, even though the docs say otherwise.

Good, but bleh, seriously, someone didn't research that change very well :-(

di commented 6 years ago

Actually, looks like it was turned into just a warning: https://github.com/python/cpython/commit/8837dd092fe5ad5184889104e8036811ed839f98#diff-d9afd486aff62306cb23cb8be2d4458e

jackm commented 6 years ago

What would be the most version independent approach then? Just provide a comma separated string of keywords?