Open Jacob-Stevens-Haas opened 11 months ago
Hi @Jacob-Stevens-Haas, thank you very much for opening this discussion.
For the time being this is a limitation for the combo setuptools and importlib-metadata interoperating together...
The current design of setuptools requires the .egg-info folders as part of the building process and intentionally places them at the root of the repository for flat-layout projects.
We do have a milestone for removing the egg-info, https://github.com/pypa/setuptools/milestone/3, but I don't think that is a goal that can be achieved in the short term.
If this turns out to be problematic for you, please consider the following workarounds while the long-term implementation is not ready:
- Consider using a src-layout (if I am not mistaken with src-layout, the .egg-info folder will be placed inside the src folder and then not picked up by importlib-metadata).
- Filter out the .egg-info paths from the output of importlib-metadata.
If any member of the community is interested in contributing towards the goal of removing the reliance on .egg-info directories, contributions are always welcomed.
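A minimal sketch of filtering egg-info entries out of the importlib.metadata results could look like the following. It is not part of setuptools or importlib.metadata; the helper names are hypothetical, and it assumes that a metadata directory containing a WHEEL file is a dist-info directory while one without it is an egg-info directory.

```python
import re
from collections import defaultdict
from importlib import metadata


def normalized(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def looks_wheel_installed(dist):
    """Heuristic (an assumption): dist-info dirs carry a WHEEL file, egg-info dirs do not."""
    return dist.read_text("WHEEL") is not None


def preferred_distributions(**kwargs):
    """Return one distribution per project name, preferring dist-info entries.

    Keyword arguments are passed on to importlib.metadata.distributions(),
    e.g. path=[...] to search somewhere other than sys.path.
    """
    by_name = defaultdict(list)
    for dist in metadata.distributions(**kwargs):
        name = dist.metadata.get("Name")
        if name:  # skip metadata directories that lack a Name field
            by_name[normalized(name)].append(dist)
    chosen = {}
    for name, dists in by_name.items():
        wheel_based = [d for d in dists if looks_wheel_installed(d)]
        # Fall back to whatever was found if no dist-info entry exists.
        chosen[name] = (wheel_based or dists)[0]
    return chosen
```

Calling preferred_distributions() with no arguments searches sys.path, so a project that shows up twice (once as egg-info in the working directory, once as dist-info in site-packages) is reported only once.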
Thanks for the quick reply! And yeah, that would be a fine workaround. Given that removing egg-info might take a while... would implementing the workaround (ignoring egg-info distributions if there's a same-name dist-info distribution) inside importlib.metadata be reasonable?
- Consider using a src-layout (if I am not mistaken with src-layout, the .egg-info folder will be placed inside the src folder and then not picked up by importlib-metadata).
I have a branch in the above repo with an src layout, and starting everything from scratch with that layout gives similar results:
'foo'
'pip'
'setuptools'
'foo'
[PackagePath('__editable__.foo-0.1.0.pth'),
PackagePath('foo-0.1.0.dist-info/INSTALLER'),
PackagePath('foo-0.1.0.dist-info/METADATA'),
PackagePath('foo-0.1.0.dist-info/RECORD'),
PackagePath('foo-0.1.0.dist-info/REQUESTED'),
PackagePath('foo-0.1.0.dist-info/WHEEL'),
PackagePath('foo-0.1.0.dist-info/direct_url.json'),
PackagePath('foo-0.1.0.dist-info/top_level.txt')]
[PackagePath('pyproject.toml'),
PackagePath('src/foo/__init__.py'),
PackagePath('src/foo.egg-info/PKG-INFO'),
PackagePath('src/foo.egg-info/SOURCES.txt'),
PackagePath('src/foo.egg-info/dependency_links.txt'),
PackagePath('src/foo.egg-info/top_level.txt')]
{'_distutils_hack': ['setuptools'],
'debian': ['setuptools'],
'foo': ['foo', 'foo'],
'pip': ['pip'],
'pkg_resources': ['setuptools'],
'setuptools': ['setuptools']}
Is this because during python startup, importing site reads __editable__.foo-0.1.0.pth and adds the src directory to sys.path? Interestingly, this changes the order of the packages, as the egg-info is no longer found in "". It thus means that importlib.metadata.distribution("foo") finds the correct package... which is a win for me, but I don't know if this behavior is reliable.
If any member of the community is interested in contributing towards the goal of removing the reliance on .egg-info directories, contributions are always welcomed.
I'd love to, but realistically I'll probably just learn more about setuptools and why removing reliance on egg-info is so daunting... the classic "know enough to be dangerous... but not to be useful" stage.
Thanks for the quick reply! And yeah, that would be a fine workaround. Given that removing egg-info might take a while... would implementing the workaround (ignoring egg-info distributions if there's a same-name dist-info distribution) inside importlib.metadata be reasonable?
That is something to be discussed in the importlib.metadata repo, but that would break setuptools 😅 (because the existing design relies on that).
I have a branch in the above repo with an src layout, and starting everything from scratch with that layout gives similar results
Is this because during python startup, importing site reads __editable__.foo-0.1.0.pth and adds the src directory to sys.path?
I see... Yeap, that is correct. The src layout will add the src directory as a new entry to sys.path, and then importlib.metadata will catch it. That makes sense, sorry I didn't think about that.
Interestingly, this changes the order of the packages, as the egg-info is no longer found in "". It thus means that importlib.metadata.distribution("foo") finds the correct package... which is a win for me, but IDk if this behavior is reliable.
That is probably 90% reliable 😅. The "" directory (which corresponds to the current working directory) is added by default as the first entry in sys.path automatically, depending on how you run a Python script, module or REPL. This is the reference (https://docs.python.org/3/using/cmdline.html):
-c <command>
If this option is given, the first element of sys.argv will be "-c" and the current directory will be added to the start of sys.path (allowing modules in that directory to be imported as top level modules).
-m <module-name>
... As with the -c option, the current directory will be added to the start of sys.path.
<script>
If the script name refers directly to a Python file, the directory containing that file is added to the start of sys.path, and the file is executed as the __main__ module. If the script name refers to a directory or zipfile, the script name is added to the start of sys.path and the __main__.py file in that location is executed as the __main__ module.
-I option can be used to run the script in isolated mode where sys.path contains neither the current directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too.
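To see why the order of sys.path matters here, the sketch below lists, for each path entry, which metadata directories match a given project name. The function name is hypothetical, and the "WHEEL file present means dist-info" check is a heuristic assumption rather than something importlib.metadata exposes directly.

```python
import sys
from importlib import metadata


def find_by_path_entry(name, search_path=None):
    """Return (path entry, kind) pairs for every match of `name`, in search order."""
    entries = sys.path if search_path is None else search_path
    hits = []
    for entry in entries:
        # Restrict the search to a single entry so we know where each hit lives.
        for dist in metadata.distributions(name=name, path=[entry]):
            # Heuristic (an assumption): only dist-info directories contain WHEEL.
            kind = "dist-info" if dist.read_text("WHEEL") is not None else "egg-info"
            hits.append((entry, kind))
    return hits
```

Running find_by_path_entry("foo") from the project root and again with python -I should make it visible whether the "" entry (and the egg-info directory it exposes) comes before site-packages.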
And this is the reference for the .pth file mechanism we use for adding entries to sys.path in the editable install for the src-layout: https://docs.python.org/3/library/site.html.
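As a rough illustration of that .pth mechanism, the sketch below writes a .pth file into a throwaway directory and asks the site module to process it. The directory layout and the function name are made up for the example; only site.addsitedir is the real standard-library call, and the snippet does modify sys.path of the running interpreter.

```python
import os
import site
import sys
import tempfile


def demo_pth_entry():
    """Create a throwaway 'site' directory with a .pth file and let site process it."""
    site_dir = tempfile.mkdtemp()
    src_dir = os.path.join(tempfile.mkdtemp(), "src")
    os.makedirs(src_dir)
    # Hypothetical file name, modelled on the one shown in this thread.
    pth_file = os.path.join(site_dir, "__editable__.foo-0.1.0.pth")
    with open(pth_file, "w", encoding="utf-8") as fh:
        # Each line naming an existing directory is appended to sys.path.
        fh.write(src_dir + "\n")
    # addsitedir adds site_dir itself and processes every .pth file inside it.
    site.addsitedir(site_dir)
    return src_dir
```

After calling demo_pth_entry(), the returned src directory appears in sys.path, which is the same effect the editable install relies on at interpreter startup.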
Ah, thanks for all that! After a cursory reading, does setuptools create an editable install as a PEP660 editable wheel? Or does the presence of an egg-info directory locally imply otherwise?
Also, and this might not be the ideal solution, would it be possible to add direct_url.json to the egg-info directory?
Ah, thanks for all that! After a cursory reading, does setuptools create an editable install as a PEP660 editable wheel?
Ideally yes. But that will depend on how pip calls setuptools. pip has its own heuristics to decide when and how to call setuptools, and in some edge cases it will rely on setuptools' deprecated code paths.
Or does the presence of an egg-info directory locally imply otherwise?
The presence of the .egg-info directory is NOT a direct/unequivocal indicator of the installation method that was used. It may be found even when the process described in PEP 660 is employed.
would it be possible to add direct_url.json to the egg-info directory?
The direct_url.json file is an installer's concern. It is not something covered in the setuptools codebase/scope. Instead, pip is the tool producing it.
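For reference, checking direct_url.json through importlib.metadata could be sketched as below. The helper names are hypothetical; the "dir_info" and "editable" keys follow the direct URL origin format that pip writes (PEP 610). A distribution backed by an egg-info directory simply has no such file, so it is reported as not editable.

```python
import json
from importlib import metadata


def is_editable(dist):
    """Return True if the distribution's direct_url.json marks it as editable."""
    raw = dist.read_text("direct_url.json")
    if raw is None:  # no such file, e.g. an egg-info directory or an index install
        return False
    info = json.loads(raw)
    return bool(info.get("dir_info", {}).get("editable", False))


def editable_names(**kwargs):
    """Sorted names of editable distributions; kwargs go to distributions()."""
    names = set()
    for dist in metadata.distributions(**kwargs):
        name = dist.metadata.get("Name")
        if name and is_editable(dist):
            names.add(name)
    return sorted(names)
```

Because this iterates over distributions() instead of calling distribution(name), it still finds the dist-info entry even when an egg-info directory with the same name is earlier on sys.path.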
setuptools version
setuptools 69.0.3
Python version
3.10.12
OS
Ubuntu
Additional environment information
Applies to both src/ and flat layout
Description
I was trying to identify editable packages installed in my current environment by looking at direct_url.json for a package given by importlib.metadata.distribution(name). It was showing that file didn't exist. Upon further investigation, importlib.metadata.distributions() had two entries for my package: one PathDistribution whose files contain dist-info in site-packages, and another PathDistribution whose files contain egg-info, built by setuptools in the local directory. distribution(name) only finds the local version. Interestingly, importlib.metadata.packages_distributions() shows that the distribution package foo has two import packages associated, both with the same names.
Expected behavior
I would've expected just one distribution package for an editable install, in this case with a single import package associated. At a lower level, I'm not sure it really makes sense to ever have two distributions of the same name installed, and therefore perhaps setuptools should have internally raised an error when distributions finds two of the same name or two import packages with the same name in the same distribution.
How to Reproduce
I've got an example distribution package, foo, with one import package, also named foo:
pip install -e .
python show_dists.py
This will print the results of distributions(), showing two named "foo", the files in the two matching distributions, and then the packages_distributions() results.
- The src layout (src branch) has the same result.
- pip freeze shows just a single distribution package
Output