pypa / packaging-problems

An issue tracker for the problems in packaging

RECORD includes no package modules for editable installs #620

Open jaraco opened 1 year ago

jaraco commented 1 year ago

Problem description

In https://github.com/python/cpython/issues/96144, a user expressed their surprise when an editable-installed package had no importlib.metadata.files() pertaining to the project. The analysis there concluded that the issue is that the RECORD file for the dist-info did not include the files the user expected.
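The behavior can be reproduced without any build backend. This is a minimal sketch, assuming a hand-built dist-info whose RECORD lists only what landed in site-packages. The project name `demo`, the `.pth` file name, and `/path/to/src` are hypothetical stand-ins, not what any particular tool writes.

```python
import tempfile
from importlib.metadata import Distribution
from pathlib import Path


def build_fake_editable_install(site_packages: Path) -> Path:
    """Create a hypothetical dist-info resembling an editable install."""
    dist_info = site_packages / "demo-1.0.dist-info"
    dist_info.mkdir(parents=True)
    (dist_info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"
    )
    # Stand-in for whatever mechanism points the interpreter at the source tree.
    (site_packages / "__editable__.demo-1.0.pth").write_text("/path/to/src\n")
    # RECORD lists only what was written to site-packages: no package modules.
    (dist_info / "RECORD").write_text(
        "__editable__.demo-1.0.pth,,\n"
        "demo-1.0.dist-info/METADATA,,\n"
        "demo-1.0.dist-info/RECORD,,\n"
    )
    return dist_info


def recorded_modules(dist_info: Path) -> list[str]:
    """Return the .py files that the distribution's metadata reports."""
    dist = Distribution.at(dist_info)
    return [str(p) for p in (dist.files or []) if p.suffix == ".py"]


with tempfile.TemporaryDirectory() as tmp:
    dist_info = build_fake_editable_install(Path(tmp))
    all_files = sorted(str(p) for p in (Distribution.at(dist_info).files or []))
    modules = recorded_modules(dist_info)
```

Here `modules` ends up empty even though the project's source tree may contain many modules, because `files` is derived purely from RECORD.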

On the one hand, it's infeasible to reflect the files of a package that's editable, as one can always add or delete a file from the package, at which point the metadata would be stale.

On the other hand, it may be practical to reflect the state of the files at the time the package was editable-installed, which would provide users with a better experience and closer feature parity with a standard-installed package.

Should the packaging tools attempt to capture the "files that would be installed" and present those in the RECORD? Or should these files instead be recorded in a separate metadata file, thereby leaving it up to importlib metadata to determine if/how to expose editable-installed "installed files"?

pfmoore commented 1 year ago

RECORD is used for uninstalls, so we should be cautious about adding other entries in there (especially entries in the user's working directory). I think we should leave RECORD clearly scoped as only containing the files that were added to site-packages. Note that the spec for the RECORD file states:

To completely uninstall a package, a tool needs to remove all files listed in RECORD, all .pyc files (of all optimization levels) corresponding to removed .py files, and any directories emptied by the uninstallation.

While we can of course change the spec, we should not do so if doing so could result in tools that predate the spec change breaking users' systems...
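To make the risk concrete, here is a rough sketch of an uninstaller that trusts RECORD. It is a simplification for illustration only and does not describe how pip or any other tool actually uninstalls (it also ignores the .pyc and empty-directory handling from the spec text). The helper names and both paths are hypothetical.

```python
import csv
from pathlib import Path


def removal_targets(record_text: str, site_packages: Path) -> list[Path]:
    """Return the paths a naive RECORD-driven uninstaller would delete."""
    rows = csv.reader(record_text.splitlines())
    # The first column of each RECORD row is the file path.
    return [site_packages / row[0] for row in rows if row]


def outside_site_packages(targets: list[Path], site_packages: Path) -> list[Path]:
    """Return the targets that resolve to somewhere outside site-packages."""
    root = site_packages.resolve()
    return [t for t in targets if not t.resolve().is_relative_to(root)]


# Hypothetical RECORD that also lists a module from the user's working tree.
site_packages = Path("/venv/lib/site-packages")
record_text = (
    "demo-1.0.dist-info/RECORD,,\n"
    "/home/user/project/demo/__init__.py,,\n"
)

targets = removal_targets(record_text, site_packages)
risky = outside_site_packages(targets, site_packages)
```

With this input, `risky` contains the working-tree module: a tool that simply deletes everything in RECORD would remove the user's source file.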

IMO, the only truly "unsurprising" result for importlib.metadata.files() would be to report the files actually in the package directory at the time of the call. In other words, a "live" result. I'm quite happy to accept that doing this is impractical, but I'm pretty sure that anything else would continue to trigger user reports along the lines of "I added my data file but it's not getting recognised".

We can document the limitations of whatever approach we take, but I think this is just skirting round the real problem here, which is that no-one is able (or willing) to precisely define what an "editable install" should actually contain.

pfmoore commented 1 year ago

Having just read the linked bug report, I agree with the conclusion there, that to do this properly needs a decision at the PyPA level. And frankly, that almost certainly needs us to standardise "Editable installs" - which is going to be a long and difficult discussion, I suspect, and quite probably something that people have no appetite for. (Standardising just the API was incredibly divisive, and that deliberately avoided asking "what actually is an editable install" as being too hard a question to tackle!)

If we do want to have this discussion, it'll need to happen on Discourse, rather than here, and it'll almost certainly need to end up in the form of a PEP.

jaraco commented 1 month ago

It occurs to me that even if editable packages were to expose some form of the "top level" packages they export, the importlib.metadata.files() method probably would still not work, at least without modification, as the current implementation assumes files() are relative to the metadata file.
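The assumption about relative paths can be observed directly. The following sketch uses a hand-built, hypothetical `demo` distribution to show that a RECORD entry is resolved against the directory containing the dist-info directory, so an entry describing a file in a separate source tree would not resolve there without changes.

```python
import tempfile
from importlib.metadata import Distribution
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    site_packages = Path(tmp) / "site-packages"
    dist_info = site_packages / "demo-1.0.dist-info"
    dist_info.mkdir(parents=True)
    (dist_info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"
    )
    # A regular (non-editable) layout: the module sits next to the dist-info.
    (site_packages / "demo.py").write_text("")
    (dist_info / "RECORD").write_text(
        "demo.py,,\n"
        "demo-1.0.dist-info/RECORD,,\n"
    )

    dist = Distribution.at(dist_info)
    module = next(p for p in dist.files if p.name == "demo.py")

    # locate() joins the RECORD entry onto the parent of the dist-info directory.
    located = Path(module.locate())
    resolved_next_to_metadata = located == site_packages / "demo.py"
```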

However, for packages_distributions, it would be useful to have a list of top-level packages provided by the distribution.

jaraco commented 1 week ago

See https://github.com/python/importlib_metadata/issues/112#issuecomment-2343576673, where I touch on what might enable a tool like importlib metadata to resolve the files for an editable-installed package.