Closed benoit-pierre closed 2 years ago
Hi @benoit-pierre could this be related to https://github.com/pypa/setuptools/pull/3395?
It's the switch to importlib_metadata / nspektr, my experience trying to switch from pkg_resources, apart from all the (to be expected) issues due to loosing the accumulated experience of pkg_resources, is that performance is bad.
Not to mention more work compared to using pkg_resources.
In #3395, I identified that dist._install_dependencies
(which is the function that calls nspektr
) is the second biggest time spent when building packages. I am proposing in that PR a simple check that can avoid runing nspektr
at all.
I am not very sure about importlib_metadata
(it should be faster than pkg_resources
right?).
I am not very sure about
importlib_metadata
(it should be faster thanpkg_resources
right?).
Only if you have a very limited number of calls to the API. And because importlib_metadata
does not do half the stuff that's provided by pkg_resources
, it's more work and slower... In pkg_resources
, because most functions will hit the working set, all the work is already done, so it's slower for example if you don't need metadata access to most distributions, but if you do, then the working set will act as a cache, and it's faster.
Honestly, I don't even understand why _install_dependencies
was added: I thought setuptools was out of the business of installing build requirements (fetch_build_egg
is deprecated).
Thank you very much @benoit-pierre for the explanation. I think the majority of places importlib.metadata
is currently being used (at least directly) in setuptools is to access the entry-points, so it should not cause that much of a problem...
Honestly, I don't even understand why
_install_dependencies
was added: I thought setuptools was out of the business of installing build requirements (fetch_build_egg
is deprecated).
I guess backwards compatibility would be the answer here. But I think we can fairly skip it in most of the cases which would provide a significant performance boost (no need to analyse the working set...) 😅
But that's my point, it does not look like backward compatibility, more like a new feature. Assuming PEP 518 is in charge of ensuring the build requirements are installed, why is there a need to check each entry points dependencies? Even if PEP 518 support was not called, isn't the (deprecated) code path that handles setup_requires
(calling fetch_build_egg
) in charge of making sure that all requirements are indeed present?
pip
(and by extension build
) does not honour the "extras" part of the entry-points (module:obj [extras]
).
When setup_requires
was a thing, it would make sure that the extras
was honoured when calling the entry-points (including distutils.commands
). This would be a fundamental difference between setup_requires
and PEP 517, and what motivates the effort for backward compatibility (although I am not 100% sure).
OK, so the old code would use ep.require(installer=self.fetch_build_egg)
. That's from 2005 (v0.6a0
!), just before adding support for setup_requires
in v0.6a1
. So again, I'm not even sure it's needed.
I agree with you. I think it is a fair assumption that entry-points nowadays are not defining extras
.
pip
(and by extensionbuild
) does not honour the "extras" part of the entry-points (module:obj [extras]
).
Because that's not pip's job. If you want to use an entry point that depend on an extra, you should make sure you're specifying the extra at install.
So in that case, you would make sure the corresponding setup_requires
requirement has the necessary extra.
I agree with you. I think it is a fair assumption that entry-points nowadays are not defining
extras
.
That's not what I'm saying.
Sorry, let me correct myself: "I also think that entry-points nowadays are not defining extras
".
Your line of thought makes absolute sense. However, we cannot deny that old setuptools code would fix missing extras
, so indeed we can think about the dist._install_dependencies
as backwards compatibility.
Do I think we can get away without it? Absolutely.
I use entry points with extras, there's nothing wrong with that.
I think the check for an entrypoint missing dependency is not strictly necessary, and can be dropped: it's the package developer responsibility to ensure that setup_requires
is consistent.
It looks like we don't have a test for this behavior.
Here's a test:
setuptools/tests/test_easy_install.py | 66 +++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git i/setuptools/tests/test_easy_install.py w/setuptools/tests/test_easy_install.py
index 246d634f2..286af292d 100644
--- i/setuptools/tests/test_easy_install.py
+++ w/setuptools/tests/test_easy_install.py
@@ -890,6 +890,72 @@ class TestSetupRequires:
monkeypatch.setenv(str('PIP_TIMEOUT'), str('0'))
run_setup(test_setup_py, [str('--version')])
+ def test_setup_requires_with_distutils_command_dep(self, monkeypatch):
+
+ with contexts.save_pkg_resources_state():
+ with contexts.tempdir() as temp_dir:
+ # Create source distribution for `extra_dep`.
+ make_sdist(os.path.join(temp_dir, 'extra_dep-1.0.tar.gz'), [
+ ('setup.py',
+ DALS("""
+ import setuptools
+ setuptools.setup(
+ name='extra_dep',
+ version='1.0',
+ py_modules=['extra_dep'],
+ )
+ """)),
+ ('setup.cfg', ''),
+ ('extra_dep.py', ''),
+ ])
+ # Create source tree for `epdep`.
+ dep_pkg = os.path.join(temp_dir, 'epdep')
+ os.mkdir(dep_pkg)
+ path.build({
+ 'setup.py':
+ DALS("""
+ import setuptools
+ setuptools.setup(
+ name='dep', version='2.0',
+ py_modules=['epcmd'],
+ extras_require={'extra': ['extra_dep']},
+ entry_points='''
+ [distutils.commands]
+ epcmd = epcmd:epcmd [extra]
+ ''',
+ )
+ """),
+ 'setup.cfg': '',
+ 'epcmd.py': DALS("""
+ from distutils.command.build_py import build_py
+
+ import extra_dep
+
+ class epcmd(build_py):
+ pass
+ """),
+ }, prefix=dep_pkg)
+ # "Install" dep.
+ run_setup(
+ os.path.join(dep_pkg, 'setup.py'), [str('dist_info')])
+ working_set.add_entry(dep_pkg)
+ # Create source tree for test package.
+ test_pkg = os.path.join(temp_dir, 'test_pkg')
+ test_setup_py = os.path.join(test_pkg, 'setup.py')
+ os.mkdir(test_pkg)
+ with open(test_setup_py, 'w') as fp:
+ fp.write(DALS(
+ '''
+ from setuptools import installer, setup
+ setup(setup_requires='dep')
+ '''))
+ # Check...
+ monkeypatch.setenv(str('PIP_FIND_LINKS'), str(temp_dir))
+ monkeypatch.setenv(str('PIP_NO_INDEX'), str('1'))
+ monkeypatch.setenv(str('PIP_RETRIES'), str('0'))
+ monkeypatch.setenv(str('PIP_TIMEOUT'), str('0'))
+ run_setup(test_setup_py, ['epcmd'])
+
def make_trivial_sdist(dist_path, distname, version):
"""
Which does not pass in v62.6.0
because the code forgets to activate the egg:
setuptools/dist.py | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git i/setuptools/dist.py w/setuptools/dist.py
index c1ad30080..79905ed89 100644
--- i/setuptools/dist.py
+++ w/setuptools/dist.py
@@ -926,9 +926,12 @@ class Distribution(_Distribution):
Given an entry point, ensure that any declared extras for
its distribution are installed.
"""
+ if not ep.extras:
+ return
for req in nspektr.missing(ep):
# fetch_build_egg expects pkg_resources.Requirement
- self.fetch_build_egg(pkg_resources.Requirement(str(req)))
+ dist = self.fetch_build_egg(pkg_resources.Requirement(str(req)))
+ sys.path.insert(0, dist.location)
def get_egg_cache_dir(self):
egg_cache_dir = os.path.join(os.curdir, '.eggs')
Going back to v60.9.3
, the code is also buggy:
setuptools/dist.py | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git i/setuptools/dist.py w/setuptools/dist.py
index e825785e2..3a02a61cc 100644
--- i/setuptools/dist.py
+++ w/setuptools/dist.py
@@ -876,23 +876,28 @@ class Distribution(_Distribution):
Given an entry point, ensure that any declared extras for
its distribution are installed.
"""
+ if not ep.extras:
+ return
reqs = {
req
for req in map(requirements.Requirement, always_iterable(ep.dist.requires))
- for extra in ep.extras
- if extra in req.extras
+ if req.marker is not None and any(
+ req.marker.evaluate({'extra': extra})
+ for extra in ep.extras
+ )
}
missing = itertools.filterfalse(self._is_installed, reqs)
for req in missing:
# fetch_build_egg expects pkg_resources.Requirement
- self.fetch_build_egg(pkg_resources.Requirement(str(req)))
+ dist = self.fetch_build_egg(pkg_resources.Requirement(str(req)))
+ sys.path.insert(0, dist.location)
def _is_installed(self, req):
try:
dist = metadata.distribution(req.name)
except metadata.PackageNotFoundError:
return False
- found_ver = packaging.version.Version(dist.version())
+ found_ver = packaging.version.Version(dist.version)
return found_ver in req.specifier
def get_egg_cache_dir(self):
v60.7.1
works as expected. But again, I think the onus should be on the package developer to ensure setup_requires
is consistent. The example should be amended:
- setup(setup_requires='dep')
+ setup(setup_requires='dep[extra]')
(which works as expected in v60.7.1
, v60.9.3
, and v62.6.0
)
And _install_requires
' code and corresponding calls can just be removed.
Thank you very much @benoit-pierre for the deep investigation. Since _install_requires
does not seem to be having any effect, it could be removed right now... (if someone was relying on the feature, we would have noticed the issues right?).
Thank you very much @benoit-pierre for the deep investigation. Since
_install_requires
does not seem to be having any effect, it could be removed right now... (if someone was relying on the feature, we would have noticed the issues right?).
Agreed, PR here: #3421.
I've previously been mostly unconcerned about runtime performance in Setuptools as most use-cases run in human time, so correctness and compatibility beats performance in most cases. But I agree that any degradation of more than 1s is noticeable and probably impactful in some cases.
Thanks everyone for the investigation. I haven't delved deep into it, and I'm happy to move forward with simplifying the codebase.
setuptools version
setuptools>=60.8.0
Python version
3.10.5
OS
Arch Linux
Additional environment information
No response
Description
It's slow...
Expected behavior
It's not slow.
How to Reproduce
Output
…