python / importlib_metadata

Library to access metadata for Python packages
https://importlib-metadata.readthedocs.io
Apache License 2.0

distributions() finds two packages for editable installs #481

Open Jacob-Stevens-Haas opened 6 months ago

Jacob-Stevens-Haas commented 6 months ago

Cross-posting from https://github.com/pypa/setuptools/issues/4170, since I'm not sure which is the right repo to fix this in, or whether both repos have something to consider here:

tl;dr: When setuptools does an editable install, importlib.metadata.distributions() finds two PathDistribution objects: one for the dist-info directory in site-packages and one for the local egg-info directory. See the above issue for MWE repo. The current consequences for importlib.metadata are:

So regardless of anything setuptools needs to do differently in creating editable installs, what is the expected behavior when importlib metadata finds multiple packages of the same name? I can think of a few options...

  1. Do nothing, because there are legitimate reasons for two distribution packages with the same name that provide different/partially-overlapping/fully-conflicting import package names.
  2. Try to resolve whether they actually mean the same thing, such as comparing associated import packages and following some hierarchical rule (e.g. dist-info is better than egg-info)
  3. Raise a warning/exception because importlib.metadata can't tell which metadata will be associated with the package that gets imported

I can't think of a legitimate reason for two distribution package names to be installed simultaneously with the exact same import package names, so I'm not sure 1 is correct. But IIUC and I probably don't, 2 or 3 would change the expected behavior of a DistributionFinder, and I'm not sure what the performance cost of disambiguation would be.
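For reference, the duplicate discovery described above can be reproduced without an editable install. This is a minimal sketch, not the MWE from the linked issue: the two temporary directories, the `demo` project name, and the `make_metadata` and `find_duplicates` helpers are all hypothetical stand-ins for site-packages and a local source tree.

```python
import tempfile
from collections import defaultdict
from importlib.metadata import distributions
from pathlib import Path


def make_metadata(root, dirname, filename, name, version):
    """Create a minimal metadata directory (hypothetical helper for this demo)."""
    meta_dir = Path(root) / dirname
    meta_dir.mkdir(parents=True)
    (meta_dir / filename).write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    )
    return meta_dir


def find_duplicates(path):
    """Group the distributions found on `path` by name; keep names seen twice or more."""
    by_name = defaultdict(list)
    for dist in distributions(path=path):
        by_name[dist.metadata["Name"].lower()].append(dist)
    return {name: dists for name, dists in by_name.items() if len(dists) > 1}


# Stand-ins for site-packages (dist-info) and the project source tree (egg-info).
site_packages = tempfile.mkdtemp()
source_tree = tempfile.mkdtemp()
make_metadata(site_packages, "demo-1.0.dist-info", "METADATA", "demo", "1.0")
make_metadata(source_tree, "demo.egg-info", "PKG-INFO", "demo", "1.0")

duplicates = find_duplicates([site_packages, source_tree])
for name, dists in duplicates.items():
    print(name, [type(d).__name__ for d in dists])
```

Calling `find_duplicates(sys.path)` instead would run the same check against a real environment, though real metadata may be less tidy than this demo assumes (for example, a missing `Name` field).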

Jacob-Stevens-Haas commented 6 months ago

Interestingly... there's a difference in behavior between importlib.metadata in Python 3.10.12 and importlib_metadata 7.0.1. The latter behaves slightly better for src-style layouts, showing the second distribution package as having no files. But also, once I install importlib_metadata, pip show importlib.metadata (with the dot) shows that it's in site-packages, not a builtin? That seems intentional, but it's a bit confusing as to what I'm importing now with importlib.metadata once importlib_metadata is installed.

jaraco commented 4 months ago

> I can think of a few options

The intended design is for importlib metadata (both versions) to reflect and honor the actual configuration. That is, if there are two sets of metadata for the same package, they'll both be returned, but some functions, like distribution(), will return the first one (similar to how import foo would return the first module if there are two on sys.path). That's essentially option 2, except the precedence is given not by the newness of the metadata standard used but by the sys.path resolution.

That is to say, it is more-or-less working as intended. Ideally, this issue will be addressed by refining the packaging tools to only produce one copy of the metadata or to ensure that the preferred metadata is given precedence in sys.path.
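The path-order precedence can be illustrated with a small sketch. It assumes two throwaway directories in place of real sys.path entries, and `make_metadata` and `first_match` are hypothetical helpers; `first_match` only approximates what distribution() does by taking the first result of a name-filtered discovery.

```python
import tempfile
from importlib.metadata import distributions
from pathlib import Path


def make_metadata(root, dirname, filename, name, version):
    """Create a minimal metadata directory (hypothetical helper for this demo)."""
    meta_dir = Path(root) / dirname
    meta_dir.mkdir(parents=True)
    (meta_dir / filename).write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    )


def first_match(name, path):
    """Return the first distribution called `name` on `path`, or None."""
    return next(iter(distributions(name=name, path=path)), None)


site_packages = tempfile.mkdtemp()  # stand-in for site-packages
source_tree = tempfile.mkdtemp()    # stand-in for the editable source tree
make_metadata(site_packages, "demo-1.0.dist-info", "METADATA", "demo", "1.0")
make_metadata(source_tree, "demo.egg-info", "PKG-INFO", "demo", "2.0.dev0")

# The same two copies of metadata, searched in opposite orders.
site_first = first_match("demo", [site_packages, source_tree])
source_first = first_match("demo", [source_tree, site_packages])
print(site_first.version, source_first.version)
```

The differing versions are there only to make it visible which copy was picked; whichever directory comes first on the search path wins, regardless of whether it holds dist-info or egg-info.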

> pip show importlib.metadata (with the dot) shows that it's in site-packages

That's because pip performs the specified name normalization before resolving the package name, so importlib.metadata, importlib_metadata, and importlib-metadata are all the same thing to pip. So even though you're typing importlib.metadata, you're getting the third-party package importlib_metadata. Standard library packages like importlib.metadata don't get any metadata.
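The normalization in question can be sketched in a couple of lines. This assumes the rule from the Python packaging name normalization specification (PEP 503), where runs of `-`, `_`, and `.` collapse to a single `-` and the result is lowercased; `normalize` is a hypothetical helper name, not pip's internal function.

```python
import re


def normalize(name):
    """Normalize a project name following the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


spellings = ["importlib.metadata", "importlib_metadata", "importlib-metadata"]
normalized = {normalize(spelling) for spelling in spellings}
print(normalized)  # all three spellings collapse to one name
```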