pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0

format/naming for additional project description details, existing samples? #8888

Open prof-milki opened 3 years ago

prof-milki commented 3 years ago

type: discussion scope: doc metadata (!pep566) desc: Not a bug report. More of a support inquiry.

What's the problem this feature will solve?

Is there an established convention for documenting "soft" project properties? Most projects on PyPI use a free-form long_description, with only sporadic detail on package and development status. Wheel metadata and READMEs usually don't suffice to pick an appropriate implementation. As a user, I'd like to see broader details to draw comparisons, e.g. expected development progress, platform standard recognition ("unix-ness"), versioning or release schemes, or a list of "compatible" APIs or projects. (I know, fairly vague, because I'm also unsure where this is going.)

Though I do very, very (very) vaguely remember a few projects having a Markdown table of sorts that went into detail on a few topics, mostly on testing compliancy or such, IIRC.

Additional context

Fictional example:

| meta           | info          |
|:---------------|:--------------|
| depends        | [pysim....... |
| compat         | Py ≥3.6,..... |
| compliancy     | xdg, !pep8,.. |
| system usage   | opportune.... |
| paths          | ~/.config/... |
| testing        | few data-dr.. |
| docs           | yelp, no manp |
| dev activity   | burst, te.... |
| state          | beta          |
| support        | tickets       |
| contrib        | mail, git.... |
| announce       | [freshc...... |

Are similar description blobs floating around on PyPI? (I haven't bothered straining the API to uncover tables, to be honest.) Also, I'm obviously looking for a naming scheme that doesn't solely duplicate the package metadata.

Specifically, something like "Compliancy: xdg, netrc, !pep8, !doap, mallard" might be most useful, with a few well-known shorthands or a vocabulary that could perhaps benefit searchability. It might even be useful to assess how upfront project descriptions are (e.g. pip itself would be 'pep8' compatible, but ignores '!xdg' and distributes no '!man' page on its own).
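
A minimal sketch of how such a field could be read by tooling, assuming the comma-separated syntax from the example above, where a leading `!` marks a standard the project knowingly does not follow. The field name "Compliancy" is made up in this thread, and `parse_compliancy` is a hypothetical helper, not part of any existing packaging tool.

```python
def parse_compliancy(value):
    """Split a 'Compliancy'-style value into (followed, not_followed) sets.

    Assumed syntax: comma-separated shorthands, with a leading '!'
    meaning "explicitly not followed".
    """
    followed, not_followed = set(), set()
    for token in value.split(","):
        token = token.strip().lower()
        if not token:
            continue  # tolerate stray commas
        if token.startswith("!"):
            name = token[1:].strip()
            if name:
                not_followed.add(name)
        else:
            followed.add(token)
    return followed, not_followed


yes, no = parse_compliancy("xdg, netrc, !pep8, !doap, mallard")
print(sorted(yes))  # ['mallard', 'netrc', 'xdg']
print(sorted(no))   # ['doap', 'pep8']
```

Keeping the two sets separate would let a search distinguish "does not mention xdg" from "states that it ignores xdg", which is the distinction the pip example draws.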

Describe the solution you'd like

This request might be more appropriate on distutils-sig@, but alas that's seemingly being phased out. Anyway, is there project-description table markdown like that residing on PyPI?

Can someone craft a list from the warehouse db with projects having table markup or some key:value or definition lists?

di commented 3 years ago

Hi @prof-milki, thanks for the feature request. It seems like what you're asking for is some sort of free-form metadata fields, but within the project description. Is that correct?

> Can someone craft a list from the warehouse db with projects having table markup or some key:value or definition lists?

This can probably be done with the public dataset of project metadata, which includes the description: https://warehouse.readthedocs.io/api-reference/bigquery-datasets.html#project-metadata-table

hugovk commented 3 years ago

> Fictional example:
>
> | meta           | info          |
> |:---------------|:--------------|
> | depends        | [pysim....... |
> | compat         | Py ≥3.6,..... |
> | compliancy     | xdg, !pep8,.. |
> | system usage   | opportune.... |
> | paths          | ~/.config/... |
> | testing        | few data-dr.. |
> | docs           | yelp, no manp |
> | dev activity   | burst, te.... |
> | state          | beta          |
> | support        | tickets       |
> | contrib        | mail, git.... |
> | announce       | [freshc...... |

For many of these you can use Trove classifiers and project_urls.

For example, this in a setup.py:

    project_urls={
        "Documentation": "https://tablib.readthedocs.io",
        "Source": "https://github.com/jazzband/tablib",
    },
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'Natural Language :: English',
        'License :: OSI Approved :: MIT License',
        'Programming Language :: Python',
        'Programming Language :: Python :: 3 :: Only',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
    ],
    python_requires='>=3.6',

This produces https://pypi.org/project/tablib/, where the project links and classifiers appear in the sidebar.

This links into https://github.com/pypa/warehouse/issues/5947 "Document recommended keys for project_urls".
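
For orientation, `project_urls`, `classifiers` and `python_requires` end up as `Project-URL`, `Classifier` and `Requires-Python` headers in the built package's METADATA file, which uses an email-style header format. The sketch below reads those headers back with the standard library. The sample text, its URLs and the `read_sidebar_fields` helper are made up for illustration and are not taken from tablib's real metadata.

```python
from email import message_from_string

# Hypothetical METADATA content; names and URLs are placeholders.
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: example-project
Version: 0.1
Requires-Python: >=3.6
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python :: 3 :: Only
Project-URL: Documentation, https://example.invalid/docs
Project-URL: Source, https://example.invalid/src

Long description goes here.
"""


def read_sidebar_fields(metadata_text):
    """Collect the header fields that correspond to the setup() arguments above."""
    msg = message_from_string(metadata_text)
    urls = {}
    # Each Project-URL value has the form "Label, URL".
    for entry in msg.get_all("Project-URL", []):
        label, _, url = entry.partition(",")
        urls[label.strip()] = url.strip()
    return {
        "requires_python": msg["Requires-Python"],
        "classifiers": msg.get_all("Classifier", []),
        "project_urls": urls,
    }


fields = read_sidebar_fields(SAMPLE_METADATA)
print(fields["project_urls"]["Source"])  # https://example.invalid/src
```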

prof-milki commented 3 years ago

Hey @di,

Yes, that's where my question was going. I was mainly wondering if there's an established description block for such notes.

> This can probably be done with the public dataset of project metadata, which includes the description: https://warehouse.readthedocs.io/api-reference/bigquery-datasets.html#project-metadata-table

I gave it a try, now that I've found a no-subscription BigQuery login: https://console.cloud.google.com/bigquery. I probably need to refine my search attempts, though:

    SELECT DISTINCT name
    FROM `the-psf.pypi.distribution_metadata`
    WHERE description IS NOT NULL
      AND REGEXP_CONTAINS(description, r'\n\|(\s*(compa|compli|depend|standard|test)|.*(xdg|pep)).*\|')
    LIMIT 100

This has largely just yielded RST image markup (badges, or badges in tables ;).
I should probably try to just scan for tables, and then look through a text export.
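
The "scan for tables first" idea could be sketched in Python roughly as follows, run over an exported set of descriptions. This is an assumption-laden sketch: it only recognises Markdown pipe tables with a `|---|---|` delimiter row (so RST tables are ignored), and `find_tables` and `split_row` are hypothetical helper names.

```python
import re

# A Markdown table delimiter row, e.g. "|:-----|:-----|" or "| --- | --- |".
# Requiring at least three dashes per column is a simplifying assumption.
DELIMITER_ROW = re.compile(r"^\s*\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)+\|?\s*$")


def split_row(line):
    """Split one pipe-table row into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def find_tables(description):
    """Return a list of (header, rows) tuples for each pipe table found."""
    lines = description.splitlines()
    tables = []
    i = 1
    while i < len(lines):
        # A table is a header line followed directly by a delimiter row.
        if DELIMITER_ROW.match(lines[i]) and "|" in lines[i - 1]:
            header = split_row(lines[i - 1])
            rows = []
            i += 1
            while i < len(lines) and "|" in lines[i]:
                rows.append(split_row(lines[i]))
                i += 1
            tables.append((header, rows))
        else:
            i += 1
    return tables


sample = """\
Some intro text.

| meta       | info        |
|:-----------|:------------|
| compliancy | xdg, !pep8  |
| state      | beta        |

More text.
"""

for header, rows in find_tables(sample):
    print(header, dict(rows))
```

Filtering the extracted first-column keys for terms like "compat" or "depend" afterwards keeps the keyword match away from badge image markup, which is what the regular expression in the query above mostly caught.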

prof-milki commented 3 years ago

Hi @hugovk,

Thanks. The example was a bit misleading, I guess. There's most certainly some overlap with regular project metadata, but what I was looking for is a more prosaic description scheme. Dependencies and stability are already part of the wheel metadata, of course, though it doesn't hurt to duplicate them and expand on them in the long_description, which is what I'm after here.

> For many of these you can use Trove classifiers and project_urls.

Personally I made a little setup() wrapper, which picks up on pluginspec fields, such as #state:. It's actually too greedy on trove tags right now. That wrapper also populates python_requires and install_requires et al.
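
The wrapper itself is not shown in this thread, so the following is only a guess at the general idea: read `# key: value` comment lines from the top of a source file and turn a `state` field into a Development Status classifier. The comment-header format, the `state` values and the helper names are all assumptions. Only the classifier strings are real Trove classifiers.

```python
import re

# Assumed header syntax: "# key: value" lines at the top of a source file.
FIELD = re.compile(r"^#\s*(\w[\w-]*)\s*:\s*(.*)$")

# Hypothetical mapping from a free-form "state" value to a Trove classifier.
STATE_CLASSIFIERS = {
    "alpha": "Development Status :: 3 - Alpha",
    "beta": "Development Status :: 4 - Beta",
    "stable": "Development Status :: 5 - Production/Stable",
}


def read_comment_fields(source):
    """Collect key/value pairs from the leading comment block of a file."""
    fields = {}
    for line in source.splitlines():
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        match = FIELD.match(line)
        if match:
            fields[match.group(1).lower()] = match.group(2).strip()
    return fields


def setup_kwargs(source):
    """Derive a few setup() keyword arguments from the comment fields."""
    fields = read_comment_fields(source)
    kwargs = {}
    if "version" in fields:
        kwargs["version"] = fields["version"]
    state = fields.get("state", "").lower()
    if state in STATE_CLASSIFIERS:
        kwargs["classifiers"] = [STATE_CLASSIFIERS[state]]
    return kwargs


EXAMPLE_SOURCE = "# title: demo\n# version: 0.2\n# state: beta\n\nimport os\n"
print(setup_kwargs(EXAMPLE_SOURCE))
```

Mapping only an explicit, small vocabulary of `state` values is one way to avoid being too greedy when matching trove tags.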

> This links into #5947 "Document recommended keys for project_urls".

Curiously my METADATA contains a Project-URL: Faq, https://… which warehouse seems not to pick up on at the moment. (No biggy, not essential anyhow.)

I think my super vague question might be better rephrased as: how to document features or project management in a discernible description. It's also about duplicating package details, but expanding on them quite a bit. The "Compliancy:" field (name completely made up), for instance, might document respected platform standards or supported file formats.
Those are project details which some libraries already provide in some form or other. I was merely wondering if there's a common scheme for such things.