pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0

How to update the "Description" in pypi.org ? #2170

Closed guolinke closed 6 years ago

guolinke commented 7 years ago

In the old pypi.python.org, we could update the description by editing it or by uploading a pkg-info file. But this is no longer supported:

Gone (This API has been deprecated and removed from legacy PyPI in favor of using the APIs available in the new PyPI.org implementation of PyPI (located at https://pypi.org/). For more information about migrating your use of this API to PyPI.org, please see https://packaging.python.org/guides/migrating-to-pypi-org/#uploading. For more information about the sunsetting of this API, please see https://mail.python.org/pipermail/distutils-sig/2017-June/030766.html)

But in pypi.org, I cannot find any function to add or change the description of the package : https://pypi.org/project/lightgbm/

I also tried to update it by using python setup.py register , but it failed: Server response (410): This API is no longer supported, instead simply upload the file.

So I tried to upload the PKG-INFO file, but still hit an error: ValueError: Unknown distribution format: 'PKG-INFO'

computron commented 7 years ago

I was able to get the description on PyPI to update for my package by:

  1. Editing the "description" and "long_description" fields in the setup() function of setup.py
  2. Releasing a new version of the code to PyPI (which still seems to work)

Without releasing a new version of the code, I also couldn't figure out a way to edit the description.

See also: https://docs.python.org/2/distutils/packageindex.html for more on the long_description field
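
A minimal sketch of what step 1 might look like, assuming a setuptools-based project that keeps its long description in a README file. The project name, version, and summary are placeholders, and `build_setup_kwargs` is a hypothetical helper, not something taken from this thread:

```python
def build_setup_kwargs(readme_text):
    """Return keyword arguments for setuptools.setup() (hypothetical helper)."""
    return {
        "name": "example-package",        # placeholder project name
        "version": "1.0.1",               # bumped: the new text only appears with a new release
        "description": "Short one-line summary of the project",
        "long_description": readme_text,  # rendered as the body of the project page
    }


# In setup.py this would be used roughly as:
#     from pathlib import Path
#     from setuptools import setup
#     setup(**build_setup_kwargs(Path("README.rst").read_text(encoding="utf-8")))
```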

jeremyarr commented 7 years ago

I would say it happens pretty frequently that you forgot to update a Trove Classifier or there was a typo in the long description field. It would be nice to be able to edit these fields for an existing release.

hroest commented 7 years ago

> I would say it happens pretty frequently that you forgot to update a Trove Classifier or there was a typo in the long description field. It would be nice to be able to edit these fields for an existing release.

I agree, it would already be helpful if one could update the description by re-uploading a new version of the same file

hroest commented 7 years ago

another idea that came up in the chat log that would still let authors have some power over this:

J08nY commented 7 years ago

Duplicate of #424. Still, no idea why the old API was shutdown, with the new one lacking any editing capability.

swistakm commented 7 years ago

I stumbled upon this today. I use Markdown or other markup for the READMEs in several of my projects (for various reasons). It is a very common convention to use the README as the basis of the long_description metadata field. Usually I use pypandoc or a similar tool to convert my README to rst during sdist or wheel creation, and this mostly works without any problems. Sometimes I even have to modify the README dynamically while creating the distribution, just before uploading it to PyPI.

In such cases, no matter how much automation you employ, it is still easy to miss something or make a mistake due to a broken local dependency or an incomplete environment. Until now I have used setup.py register or twine register to upload new or fixed metadata for a given version of the package.

@computron's approach (using new release) is an approach that works well for simple packages that can be distributed as universal wheels or sdists. But it is very problematic for projects that require packages to be actually compiled.

I have some projects (like pyimgui or pyrilla) that depend on C extensions, and their distributions need to be created on multiple independent CI systems for various platforms (3 different OSes in 2 architecture flavours, 4 versions of Python). A full distribution consists of 25 built wheels and one sdist. It takes a lot of time to prepare the whole set of distributions just to fix a README typo. It also takes a lot of space. Wheels for the pyimgui project are about 32 MB. But think about projects like numpy. It also has more than 20 wheels, and their size ranges from 6 to 15 MB each.

I understand that not requiring package pre-registration is a good step towards simplifying repository usage. Still, it would be useful to be able to modify or upload metadata independently of the actual package distributions. Note that metadata (mainly the long description) describes one version of an application that may consist of multiple distributions, and not every distribution may be built in the same homogeneous environment. This means that in more complex scenarios the metadata may not be consistent between different distributions (e.g. due to the lack of pandoc in a certain environment, in my case). Selecting the metadata of the first uploaded distribution as representative of a given package version is a reasonable approach that works in most situations, but it also creates problems for some projects.
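
One way to catch the inconsistency described above before uploading is to compare the long description embedded in each built wheel. A minimal sketch, assuming the wheels have already been built locally; `read_long_description` and `descriptions_differ` are hypothetical helper names, and only the standard library is used:

```python
import zipfile
from email.parser import Parser


def read_long_description(wheel_path):
    """Return the long description stored in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # A wheel keeps its metadata in <name>-<version>.dist-info/METADATA.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        message = Parser().parsestr(wheel.read(metadata_name).decode("utf-8"))
    # Older metadata uses a Description header; newer metadata uses the body.
    return message.get("Description") or message.get_payload()


def descriptions_differ(wheel_paths):
    """True if the given wheels do not all carry the same long description."""
    return len({read_long_description(path) for path in wheel_paths}) > 1
```

It could be called as `descriptions_differ(glob.glob("dist/*.whl"))` at the end of a build, so that a wheel built without pandoc is noticed before anything reaches PyPI.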

Also, I believe that having ability to easily modify metadata errors for packages (e.g. like using register endpoint) will have beneficial effect on the package repository content in the long term:

techtonik commented 7 years ago

It could store edits separately, like GitHub shows that message is edited.

jobec commented 7 years ago

Bumped into the same issue.

While generating a wheel, the DESCRIPTION.rst file got corrupted due to some strange bug (it added a > in front of a section header). I only noticed after uploading, and now the description of my package is messed up, with no way to change it other than releasing a new version and manually modifying the DESCRIPTION.rst file in the wheel...

The current situation is insane. Either you fully migrate, or you leave the old features enabled until you reach feature parity...

jobec commented 7 years ago

FYI: Issue https://github.com/pypa/wheel/issues/189 is what corrupted my package description.

dstufft commented 7 years ago

I realized I had not yet commented on this issue so here goes:

I'm not sure yet how we're going to treat this. On one hand it is fairly convenient to be able to do a quick bit of edit without making a new release of the project-- I've personally used that and I'm sure others have as well (As reflected by this issue!). On the other hand, allowing that means that the contents of the PyPI database differ from the contents of the packages themselves which is less than ideal given that many different downstream consumers of these projects want to consume the description of the projects too.

I don't believe either situation is more or less "right" or "sane", but rather represents a trade off. PyPI tends to have to make these kinds of trade offs between restricting what authors are allowed to do to provide downstream users with some sort of consistency in what they can expect and giving authors the power to manage their projects in the best way they personally see fit. I think that it is more common in other language packaging systems to not allow updates to this field than it is to allow updates (and the fact that PyPI ever supported it appears to me to be an accident) so I think it's certainly not unreasonable to not have this functionality (if that is the ultimate decision here).

Right now I'm leaning towards being fairly rigid in the guarantees of immutability and leaving this as it is (in addition likely fixing the data model to divide between truly file specific data and general data and guaranteeing the general data is the same for every file uploaded for a specific version).

I'm leaving this issue open so others can weigh in, and because I'm not fully settled on that being the right idea yet or not. One possible idea that I've been considering is the suggestions from above about allowing edits, but retaining a sort of history or adding an indicator that it has been edited from the "true" description inside of the package. My main concern with that is that if it's just history, then I'm not sure that people will generally find it useful-- but if it's some sort of a red warning box, then that is making editing the description painful for little benefit over just allowing it without warning and likely will introduce "warning fatigue".

So basically, I'm not entirely sure yet how the future looks for this feature.

jobec commented 7 years ago

Why not allow package versions to be "staged"?

You can then see if everything is correct without others seeing the package. In that status you can then make changes and even update the files.

The next step is then to "publish" the packages, after which it becomes immutable and visible for everyone.
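
The staged/published flow described above could be modelled roughly as follows. This is a toy sketch only; `StagedRelease` and its methods are hypothetical names and do not correspond to anything in Warehouse:

```python
class StagedRelease:
    """Toy model of a staged release (hypothetical; not Warehouse code)."""

    def __init__(self, version, description):
        self.version = version
        self.description = description
        self.published = False

    def update_description(self, description):
        # Edits are only allowed while the release is still hidden.
        if self.published:
            raise PermissionError("published releases are immutable")
        self.description = description

    def publish(self):
        # One-way transition: visible to everyone, no further edits.
        self.published = True
```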

xavfernandez commented 7 years ago

> Why not allow package versions to be "staged"?

cf #726 :)

dstufft commented 7 years ago

Yea, I am entirely on board with staged releases-- and it hadn't occurred to me until you mentioned it that they could act as a mechanism for this as well. I think that is likely going to be the long term answer here, that if you want to preview things before they go live, use the staging functionality (yet to be implemented) but once things are published they are immutable.

ryukinix commented 7 years ago

This is really annoying. :/ But ok, at least a wont-fix tag was not marked as some development teams do about crucial UX problems.

bskinn commented 7 years ago

I think immutability is an important feature of the final, published packages and should be kept.

But a major point of stress in uploading releases is not knowing how my README.rst is going to render on PyPI. Yes, there's testpypi, but I have to keep changing my version in setup.py, which is annoying (v1.2.1.post112....) AND I have to remember to change it back to the actual version number before uploading to "regular" PyPI.

I agree, the staged upload/release approach seems very appealing.

dstufft commented 7 years ago

You can also use readme_renderer to render your readme.

bskinn commented 7 years ago

@dstufft Where does the python setup.py check -r -s invocation put the rendered readme?

Nothing shows up with a dir /s *.htm*.

di commented 7 years ago

@bskinn That command doesn't output any HTML, it will show warnings if there are issues with the long_description, otherwise it will say nothing and return with an exit code of 0.

bskinn commented 7 years ago

<nod>, gotcha.

So -- if I read the readme_renderer source right, there's no script entry point to actually carry out the translation locally? To get HTML I need to do something like this?

>>> from readme_renderer import rst

>>> with open('README.rst', 'r') as f:
...     text = f.read()

>>> with open('README.html', 'w') as f:
...     f.write(rst.render(text))

di commented 7 years ago

@bskinn Nope, since that isn't really the purpose of readme_renderer (although there is an open PR to add it here: https://github.com/pypa/readme_renderer/pull/53)

If that's what you need, you can use what you wrote above, or there are other non-PyPA tools which do that and more, such as https://pypi.org/project/restview/

bskinn commented 7 years ago

@di Good to know. Thanks!

pjeby commented 6 years ago

Here's another use case for editing descriptions: I'm moving most of my packages to Github and closing the mailing list; almost every package description links to that mailing list and I need to remove it. But I cannot do so without releasing new versions of the packages.

Most of these packages haven't been changed in years (up to a decade or more!) and making a new release in order to say that I've moved the package to Github because the project is shelved and won't be getting updates makes little sense. :)

Is there any way to fix this without uploading new versions of these packages? I really don't want to release new versions of them, just to change the contents of their PyPI pages.

ryukinix commented 6 years ago

It's so painful uploading a new package only because I made a typo in the description or the formatting is wrong... I hope this changes over the years. Staged builds will help a lot.

techtonik commented 6 years ago

> But ok, at least a wont-fix tag was not marked as some development teams do about crucial UX problems.

PyCon needs more talks about UX, because too often it looks like people don't get it at all.

wmayner commented 6 years ago

Just got burned by this too. There should definitely be a way of fixing typos in READMEs without releasing a new version.

regebro commented 6 years ago

@dstufft "On the other hand, allowing that means that the contents of the PyPI database differ from the contents of the packages themselves"

They already can, since you can upload multiple packages with different metadata. A solution to this is to not take the metadata from the FIRST uploaded file, but take it from the LAST. As it is now, if you have a mistake in your long description you need to make a new release version. So I'm already up at my third release today. This is really, really annoying.

The immutability requirement is an illusion. I understand it for files, that if you delete a version you need to bump the version number. But for data, no.

If we keep immutability for data, then we need a way to first prepare releases and preview them online, and then make them immutable by publishing. In that case it should also be verified before publishing that all uploaded files have the same metadata. But I think that's silly. :-)

dstufft commented 6 years ago

I'm going to close this issue out. After thinking about it, I believe that keeping this immutable, while providing features to ensure the metadata is correct going forward, is the right way of handling this, rather than allowing modifications to the metadata in PyPI after the fact.

This can be followed up with on distutils-sig (see the discussion here) but I've reproduced the initial message below as well.


For those who are not aware, legacy PyPI would allow you to run twine register on a release that had already been created in order to modify the metadata that PyPI had recorded for that release (keyed by version number). This wasn’t a super widely used feature, but its primary use case was when folks would mistakenly release a project that had a broken description that wouldn’t correctly render on PyPI. With the move from the legacy PyPI code base to Warehouse for handling uploads, this feature had been somewhat inadvertently lost.

This issue was raised in https://github.com/pypa/warehouse/issues/2170. This email is to provide notice that after thinking about this issue for a while, we’re not going to restore the ability to modify the metadata associated with a release after it has been uploaded. The eventual goal is that it should ideally be possible to treat the files that are on PyPI as the source of truth for all metadata, and that the data stored in the database is simply an optimization for accessing and presenting that data on PyPI. Obviously if we allow modifications to the metadata as stored in the PyPI database, that would allow this metadata to “drift” from what is actually stored in the files, which would prevent that goal from being realized.

It is true that even with this change, there is no guarantee that the metadata in the database matches what is in the file(s) that have been uploaded to PyPI, even going into the future. Thus the decision to not restore this feature is not the only step on the way to being able to assert this guarantee, but it is one of them.

The most common reason for wanting to modify any metadata after the fact is to fix typos etc that made it into the description prior to publication. It is our opinion that the best way to handle these is to either cut a new release (it can be a post release if that’s all that has changed) or to validate the description field prior to publication (which can currently be done using readme_renderer or restview). Longer term we also plan to introduce the ability to “stage” releases so that releases and files can be uploaded to a temporary location within Warehouse, and visible with a special URL to allow people to manually validate the actual bits that were uploaded before pushing a “finalize” or “publish” button that would flip it from being a mutable, hidden release to an immutable, publicly viewable release.

If folks have other use cases where they’ve used the ability to modify release metadata after it had been released that they feel is an important enough use case that it needs to be supported in some way, please let us know either in this thread or by opening an issue on https://github.com/pypa/warehouse so we can figure out if it’s something we want to support, if one of the other mechanisms we’re planning on adding will support it, or if there is some new mechanism we can add that can support it better.

brainwane commented 6 years ago
pjeby commented 6 years ago

I have yet to see anyone explaining how to handle things like changing the support location, maintainer email, home page, etc. without releasing a new package version. These are not typos, they're things that have changed, while the package itself has not changed. I've got packages that have been unchanged for a decade on PyPI but which I need to change these links for. Releasing new versions of the packages would send the wrong message about how current they are.

jobec commented 6 years ago

I second that.

We're not talking about source code here but about metadata. Changing descriptions etc does not change the functionality of the package.

pjeby commented 6 years ago

Indeed. The framing of this discussion is all about warehouse and immutability and metadata, yet what we the users are asking for is the ability to edit what appears on the PyPI pages for human viewers, in fields that either:

  1. can't be used in an automated way anyway (e.g. the description), or
  2. have no reason to be immutable! (i.e., support contact info, human readable links, maintainer email, etc.)

There was no user benefit to making all this uneditable when them being editable on the original PyPI was a feature, not a bug.

dstufft commented 6 years ago

The flip side of that, is if you're not cutting a new release of them or communicating the changes using something like a redirect or a new landing page at the old locations, then downstream consumers of the packages who do not use the PyPI metadata but use the metadata that is internal to the package itself are not going to reflect those changes.

This could be as simple as someone running pip show <foo> to grab the home page link, or it could be done in tooling like is present in various Linux distributions. The only reliable mechanism there is for communicating a change like that is by releasing a new version or by updating the existing locations to point to the new location.
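
For illustration, this is roughly the kind of lookup a downstream tool performs when it reads metadata from the installed package itself rather than from PyPI. A sketch using the standard library's importlib.metadata (Python 3.8+); `installed_home_page` is a hypothetical helper name:

```python
from importlib.metadata import PackageNotFoundError, metadata


def installed_home_page(distribution_name):
    """Home-page recorded inside the installed distribution, or None."""
    try:
        # This reads the metadata shipped in the package, so an edit made
        # only in the PyPI database would never show up here.
        return metadata(distribution_name).get("Home-page")
    except PackageNotFoundError:
        return None
```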

This is also explicitly what the .postN release segment is for, as per PEP 440. If you see a post release as a consumer, you can assume that nothing has changed code-wise and that the only changes are to documentation or other things that do not affect functionality.
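
As a rough illustration of the .postN segment, the sketch below splits a version string into its base version and post-release number. It is deliberately simplified and only recognises the canonical form on a final release; `split_post_release` is a hypothetical helper, and a full PEP 440 parser such as the one in the `packaging` project handles many more spellings:

```python
import re

# Simplified: matches only "<digits.digits...>.postN", not every spelling
# that PEP 440 normalizes (e.g. "1.0-1" or "1.0.rev2").
_POST = re.compile(r"^(?P<base>\d+(?:\.\d+)*)\.post(?P<n>\d+)$")


def split_post_release(version):
    """Return (base_version, post_number); post_number is None if absent."""
    match = _POST.match(version)
    if match is None:
        return version, None
    return match.group("base"), int(match.group("n"))
```

Under this sketch, a description-only fix to 1.2.1 would be published as 1.2.1.post1, which splits back into the base version 1.2.1 and post number 1.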

pjeby commented 6 years ago

A critical distinction here: the metadata that needs to change is project metadata, not release metadata.

Who you contact for support with a project or where you go to find its source repository has nothing to do with the actual contents of an individual release. That PyPI mixes project and release metadata is a historical accident and arguably a design flaw. Until now, the flaw was relatively minor in nature because you could work around it by directly editing the human-readable metadata when you needed to.

As for using redirects, I expect at some point to lose access to the domain used on many of my projects. Redirecting will only work so long as the domain actually exists. Likewise, the email address is at that domain, and will only forward for so long as well.

The argument that somebody could have stale info in the package isn't really meaningful here, either: cutting a new release won't change the possibility of stale versions existing in the field. However, if the info somebody has doesn't work, their likely next step would be to check the PyPI page for the project.

Zac-HD commented 6 years ago

> If folks have other use cases where they’ve used the ability to modify release metadata after it had been released that they feel is an important enough use case that it needs to be supported in some way, please let us know.

One we don't have but could use: adding the python_requires tag on an existing release.

I added the tag in django/django#9413 before version 2.0.1 - but because 2.0.0 doesn't have the tag, pip install django fails on Python 2. Can we (ie @timgraham) ask for a manual update so this stops happening?

ewdurbin commented 6 years ago

@Zac-HD We ran through this for another project recently. Let me dig up my notes. Can you confirm that the python_requires metadata on 2.0.1 is precisely what needs to go into 2.0.0's metadata?

Zac-HD commented 6 years ago

Yes, I can - 2.0.1 has requires_python>=3.4, and this is precisely what should have been set on 2.0.0. Thanks!

ewdurbin commented 6 years ago

Proposed SQL for notes:

BEGIN;

SELECT
  name,
  version,
  packagetype,
  requires_python
FROM release_files
WHERE name='Django'
  AND version LIKE '2%';

UPDATE release_files
SET requires_python=(SELECT requires_python
                     FROM release_files
                     WHERE name='Django'
                       AND version='2.0.1'
                       AND packagetype='sdist')
WHERE name='Django'
  AND version IN (
      '2.0',
      '2.0a1',
      '2.0b1',
      '2.0rc1'
  );

SELECT
  name,
  version,
  packagetype,
  requires_python
FROM release_files
WHERE name='Django'
  AND version LIKE '2%';

SELECT * FROM journals WHERE name='Django' AND version LIKE '2%' ORDER BY submitted_date DESC;

INSERT INTO journals
  (name, version, action, submitted_by, submitted_from)
VALUES
  ('Django', '2.0', 'update requires_python', 'ewdurbin', '162.243.84.248'),
  ('Django', '2.0a1', 'update requires_python', 'ewdurbin', '162.243.84.248'),
  ('Django', '2.0b1', 'update requires_python', 'ewdurbin', '162.243.84.248'),
  ('Django', '2.0rc1', 'update requires_python', 'ewdurbin', '162.243.84.248');

SELECT * FROM journals WHERE name='Django' AND version LIKE '2%' ORDER BY submitted_date DESC;

---COMMIT;

ewdurbin commented 6 years ago

@Zac-HD @timgraham I note that the pre-releases of Django 2.0 do not have this metadata set either, should I include them as well?

=> select name, version, packagetype, requires_python from release_files where name='Django' and version like '2%';
  name  | version | packagetype | requires_python 
--------+---------+-------------+-----------------
 Django | 2.0     | sdist       | 
 Django | 2.0     | bdist_wheel | 
 Django | 2.0.1   | sdist       | >=3.4
 Django | 2.0.1   | bdist_wheel | >=3.4
 Django | 2.0a1   | bdist_wheel | 
 Django | 2.0b1   | bdist_wheel | 
 Django | 2.0rc1  | bdist_wheel | 
(7 rows)

timgraham commented 6 years ago

Sure

ewdurbin commented 6 years ago

@timgraham @Zac-HD complete.

=> SELECT
  name,
  version,
  packagetype,
  requires_python
FROM release_files
WHERE name='Django'
  AND version LIKE '2%';
  name  | version | packagetype | requires_python 
--------+---------+-------------+-----------------
 Django | 2.0     | sdist       | >=3.4
 Django | 2.0     | bdist_wheel | >=3.4
 Django | 2.0.1   | sdist       | >=3.4
 Django | 2.0.1   | bdist_wheel | >=3.4
 Django | 2.0a1   | bdist_wheel | >=3.4
 Django | 2.0b1   | bdist_wheel | >=3.4
 Django | 2.0rc1  | bdist_wheel | >=3.4

swistakm commented 6 years ago

This is ridiculous.

For months we have heard that making a new release is the only reasonable way to fix metadata and that everything about a single package should be immutable. Multiple requests from many developers prove that the ability to fix metadata after a release is a useful feature that a few of us would like to have back. I never agreed that sacrificing usefulness and convenience here for the sake of design elegance is a good direction, but I accepted the final decision and didn't come back to this topic.

Then, a month after the issue was officially closed, we see metadata being fixed manually, completely in public. I can see that a precedent was set even earlier (see: #2700). I understand that Django and Pytest are pretty big projects that are important for the community, but can creators of smaller projects expect the same level of support too?

ewdurbin commented 6 years ago

@swistakm I understand the frustration and recognize the dissonance here.

The specific reason why these two issues were addressed was that they affected the installability of the packages. Rather than being an issue of a typo fix or other "vanity" concern these projects were no longer installable for older Python versions.

In both cases pip install <project> executed from an environment with an up to date pip and an older Python would fail, and this could not have been resolved by having the maintainers release a new package.

As should be obvious by a bunch of sketchy SQL, we don't have a rock solid and safe way to address issues like this on a wider level without the intervention of one of three folks who have the necessary access to execute the database operations.

I'm happy to assist in resolving issues that create wide-scale installability issues for any project that can't be resolved by releasing a new package, requires_python situations being the most problematic ones.
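
The installability problem can be illustrated with a small sketch. It assumes a heavily simplified model of the check an installer makes against requires_python: only a single ">=" clause is understood, and `satisfies` and `candidate_versions` are hypothetical names. The release data mirrors the query output shown earlier in this thread:

```python
def satisfies(requires_python, python_version):
    """Check a '>=X.Y' specifier against a (major, minor) tuple.

    Simplified: handles only a single '>=' clause or a missing value.
    """
    if not requires_python:
        return True  # nothing recorded, so every interpreter qualifies
    if not requires_python.startswith(">="):
        raise ValueError("only '>=' specifiers are handled in this sketch")
    minimum = tuple(int(part) for part in requires_python[2:].split("."))
    return python_version >= minimum


def candidate_versions(releases, python_version):
    """Versions an installer would still consider for this interpreter."""
    return [v for v, spec in releases.items() if satisfies(spec, python_version)]


before = {"2.0": None, "2.0.1": ">=3.4"}      # 2.0 uploaded without the field
after = {"2.0": ">=3.4", "2.0.1": ">=3.4"}    # after the manual database update

print(candidate_versions(before, (2, 7)))  # ['2.0']
print(candidate_versions(after, (2, 7)))   # []
```

With the "before" data, a Python 2.7 environment still sees 2.0 as a candidate even though it cannot run there, and releasing 2.0.1 with the field set does not remove that candidate.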

wmayner commented 6 years ago

For the record, I mostly want to be able to update metadata for precisely that reason, rather than "vanity" concerns.

It seems to me that these examples expose a real design flaw with not allowing metadata updates. Metadata isn't data, especially when the metadata determines whether you can even get the data...

pjeby commented 6 years ago

And yet, somehow the previous version of PyPI was able to do such changes, because it was originally a package index for humans, and didn't even support uploading files when it was first created.

But somehow, the utility of being a place where humans can find out about Python projects has now been re-labeled as "vanity". And important functionality was dropped without notice to PyPI users. When other PyPI functionality was dropped, PEPs were needed, with notice periods communicated to maintainers, but removing a feature from the UI? Nah, who actually edits their package. Who cares. And we'll just break setup.py register while we're at it. Oh, people are complaining? Too bad, we already changed it.

Running manual updates for "installability" is just the icing on the cake. It demonstrates that the capability to edit release metadata is in fact needed, and all the platitudes about not needing to do it are just that: platitudes defending the purity of the system, when practicality beats purity.

In the pure land, nobody needs to edit metadata and should output new binaries. In the real world, stuff happens. (And a package listing isn't solely for automated consumption.)

ewdurbin commented 6 years ago

First I want to clarify that the word vanity was used not in a derogatory sense, but to try to indicate information which was not functional for installation purposes. Clearly my use of scare quotes around the word didn't help. I apologize for this word choice, it obviously struck a chord.

Second, it was obviously a mistake to provide this support to pytest and Django without a larger discussion. This issue in particular started as a place to track the concern of updating descriptions for projects and has continued well past that specific concern and into a wider one around arbitrary edits of package metadata.

I'm not sure where to go from here, but I am frustrated and feeling down about the whole thing now.

bskinn commented 6 years ago

The model of 'write-once per-release metadata' implicitly assumes that the metadata, and the knowledge underlying its composition, are both perfectly complete and perfectly accurate for each release at the time the release is made.

It's a stretch to think these assumptions hold for any one release of any distribution, much less all releases of all distributions.

I understand the impulse toward immutability, in terms of avoiding 'noncritical post-release updates' to metadata. But, as the pytest and Django examples show, strict immutability means that critical post-release updates are also locked out.

I see two (non-exhaustive) general paths forward, personally:

1) Unmanaged post-release updates. 'Noncritical' updates are accepted as an unavoidable cost of allowing 'critical' updates, with the benefit of reduced ongoing burden on PyPA staff due to not having to intermediate the updates

2) Managed post-release updates. Post-release updates are effected by petition to PyPA staff, with case-by-case approval. Disadvantage here is the ongoing burden on PyPA staff both as re the time involved, and as re being the arbiters of criticality. A further risk is that denied petitions may engender ill will.

In both cases, perhaps some of the metadata can be declared as universally disallowed for post-release updates. This would probably still be fraught with edge cases, though.

pjeby commented 6 years ago

Conversely, perhaps some of the metadata could be universally allowed to change independently, such as changes in maintainers, support contact, home page, etc.

After all, none of these attributes are really about the release in the first place. They're about the project, and can change independently of the project's releases or lack thereof. (For example, if a project's issue tracker or home page moves, visiting the old page is not useful even if you are working with an older version. Likewise, a URL such as this changing is not really a reason to issue a new release of a project.)

EmilStenstrom commented 6 years ago

Another use-case: I have a README.md in the project (on github), but I forgot to add it to long_description. So my use-case is not a typo, but making sure PyPI is in sync with GitHub.

bskinn commented 6 years ago

@EmilStenstrom Especially given the intentional behavior change from legacy PyPI to Warehouse disabling automatic introspection for README.rst, I suspect there are a lot of projects (some of mine included) that could benefit from being able to make this kind of change post facto.

jobec commented 6 years ago

Well, according to https://mail.python.org/pipermail/distutils-sig/2017-December/031826.html and despite the numerous reactions here this decision seems final 😞

(Unless your package is like Django, then all of a sudden direct database changes are possible...)