pypa / setuptools

Official project repository for the Setuptools build system
https://pypi.org/project/setuptools/
MIT License
2.51k stars 1.19k forks source link

[BUG] The `_distutils_hack` pth file causes significant startup slowdown #3006

Closed pablogsal closed 2 years ago

pablogsal commented 2 years ago

setuptools version

60.0.0

Python version

ALL

OS

ALL

Additional environment information

No response

Description

Since setuptools 60.0.0 has been released, Python startup has been impacted noticeably:

https://speed.python.org/timeline/#/?exe=12&ben=python_startup&env=1&revs=50&equid=off&quarts=on&extr=on

This is due to the new pth file that setuptools installs for injecting the distutils hack. The pth imports the _distutils_hack module and the ramifications of this import has a noticeable impact on startup time as you can see in the benchmark server.

Here is an estimation of the cost using -X importtime on my machine:

import time:       278 |        278 |         types
import time:       112 |        112 |           _operator
import time:       419 |        531 |         operator
import time:       129 |        129 |             itertools
import time:       325 |        325 |             keyword
import time:       468 |        468 |             reprlib
import time:       258 |        258 |             _collections
import time:       978 |       2156 |           collections
import time:        78 |         78 |           _functools
import time:       835 |       3068 |         functools
import time:      1359 |       5235 |       enum
import time:       138 |        138 |         _sre
import time:       497 |        497 |           sre_constants
import time:       528 |       1025 |         sre_parse
import time:       512 |       1674 |       sre_compile
import time:       109 |        109 |       _locale
import time:       886 |        886 |       copyreg
import time:       671 |       8574 |     re
import time:       471 |        471 |       warnings
import time:       330 |        801 |     importlib
import time:       906 |      10279 |   _distutils_hack

This almost doubles the startup time (from 13501 us to 24093 us) and unfortunately, as almost all Python installations will have the pth file installed by default, all Python installations will double its startup time.

We have been working very hard to reduce startup time with things like frozen modules (https://bugs.python.org/issue45020) and other optimizations and is unfortunate that this gains are being lost due to this.

See: https://bugs.python.org/issue46110 for more information and discussion

Expected behavior

The distutils hack doesn't impact startup time.

How to Reproduce

Measure the time spent in python -c pass. You can use python -Ximporttime -c pass to generate the import time table for convenience and debugging.

Output

Relevant output of python -X importtime -c pass:

...
import time:       278 |        278 |         types
import time:       112 |        112 |           _operator
import time:       419 |        531 |         operator
import time:       129 |        129 |             itertools
import time:       325 |        325 |             keyword
import time:       468 |        468 |             reprlib
import time:       258 |        258 |             _collections
import time:       978 |       2156 |           collections
import time:        78 |         78 |           _functools
import time:       835 |       3068 |         functools
import time:      1359 |       5235 |       enum
import time:       138 |        138 |         _sre
import time:       497 |        497 |           sre_constants
import time:       528 |       1025 |         sre_parse
import time:       512 |       1674 |       sre_compile
import time:       109 |        109 |       _locale
import time:       886 |        886 |       copyreg
import time:       671 |       8574 |     re
import time:       471 |        471 |       warnings
import time:       330 |        801 |     importlib
import time:       906 |      10279 |   _distutils_hack
...

Code of Conduct

pablogsal commented 2 years ago

CC: @brandtbucher @ericsnowcurrently @lpereira @markshannon @iritkatriel

jaraco commented 2 years ago

Thanks for reporting. I was aware this was a risk.

I do believe the riskslowdown is justifiable, even at the reported level of 2x (11ms), given that:

That said, there may be some performance optimizations that can be made, and probably should be made if they don't encumber the implementation too much.

jaraco commented 2 years ago

For importlib_metadata, I developed pytest-perf to exercise performance tests and prevent regressions. That could be employed here to (a) measure the performance and (b) prevent regressions as performance is improved.

tiran commented 2 years ago

You could reduce the impact on startup performance by not using contextlib and moving the import of re into the function body.

gvanrossum commented 2 years ago

In order to avoid importing re you would also have to remove the warnings.filterwarnings() call -- that function uses re. What does that call do anyway?

How temporary is the hack? Will it be gone by the time 3.11 final is released (coming October)?

gvanrossum commented 2 years ago

If you wanted to make us really happy, you could get rid of the filterwarnings call, and delay importing of warnings, re, importlib and contextlib. That avoids any imports in the default python -c pass case that aren't already frozen.

gvanrossum commented 2 years ago

In order to avoid importing re you would also have to remove the warnings.filterwarnings() call -- that function uses re. What does that call do anyway?

Answering my own question, this seems to get rid of the warning printed by importing distutils whenever _distutils_hack is imported. This could probably also be postponed, e.g. put this line in spec_for_distutils().

(@jaraco, would it be helpful if I submitted a PR?)

tiran commented 2 years ago

The warnings filter is problematic for two reasons. It not only slows down startup. It also hides Python core's deprecation warning from all users that have a recent version of setuptools. Since the distutils-precedence.pth file loads the _distutils_hack package always, the filter gets installed every time, too. Virtually everybody has setuptools installed, too. This makes core's deprecation warning in the stdlib pointless.

gvanrossum commented 2 years ago

The warnings filter is problematic for two reasons. It not only slows down startup. It also hides Python core's deprecation warning from all users that have a recent version of setuptools. Since the distutils-precedence.pth file loads the _distutils_hack package always, the filter gets installed every time, too. Virtually everybody has setuptools installed, too. This makes core's deprecation warning in the stdlib pointless.

Not quite. If you set SETUPTOOLS_USE_DISTUTILS=stdlib, _distutils_hack is not imported, and you will see the warning. However, I think your point remains -- a user who has old code that imports distutils will (likely) not get the warning when they have setuptools installed, even if they're not using setuptools. This seems to defeat the point of having the warning.

tiran commented 2 years ago

setuptools main branch

$ venv/bin/python3 -X importtime -c pass
import time: self [us] | cumulative | imported package
import time:       652 |        652 |   _io
import time:       142 |        142 |   marshal
import time:      1390 |       1390 |   posix
import time:      2513 |       4695 | _frozen_importlib_external
import time:       541 |        541 |   time
import time:       824 |       1364 | zipimport
import time:       612 |        612 |     _codecs
import time:      1788 |       2400 |   codecs
import time:      1677 |       1677 |   encodings.aliases
import time:      2176 |       6252 | encodings
import time:       925 |        925 | encodings.utf_8
import time:       402 |        402 | _signal
import time:       222 |        222 |     _abc
import time:       955 |       1177 |   abc
import time:      1119 |       2296 | io
import time:       213 |        213 |       _stat
import time:       824 |       1036 |     stat
import time:      2834 |       2834 |     _collections_abc
import time:       381 |        381 |       genericpath
import time:       576 |        957 |     posixpath
import time:      2629 |       7454 |   os
import time:       445 |        445 |   _sitebuiltins
import time:       479 |        479 |         types
import time:      1361 |       1839 |       enum
import time:       243 |        243 |         _sre
import time:       498 |        498 |           sre_constants
import time:       617 |       1115 |         sre_parse
import time:       501 |       1858 |       sre_compile
import time:       157 |        157 |           itertools
import time:       235 |        235 |           keyword
import time:       103 |        103 |             _operator
import time:       527 |        630 |           operator
import time:       278 |        278 |           reprlib
import time:       101 |        101 |           _collections
import time:      1841 |       3239 |         collections
import time:       197 |        197 |         _functools
import time:       804 |       4239 |       functools
import time:       109 |        109 |       _locale
import time:       204 |        204 |       copyreg
import time:       977 |       9225 |     re
import time:       352 |        352 |       warnings
import time:       223 |        575 |     importlib
import time:       679 |        679 |     contextlib
import time:       977 |      11455 |   _distutils_hack
import time:       167 |        167 |   sitecustomize
import time:      2255 |      21774 | site

with my fix

$ venv/bin/python3 -X importtime -c pass
import time: self [us] | cumulative | imported package
import time:       627 |        627 |   _io
import time:       156 |        156 |   marshal
import time:      1573 |       1573 |   posix
import time:      2355 |       4709 | _frozen_importlib_external
import time:       418 |        418 |   time
import time:       876 |       1293 | zipimport
import time:       520 |        520 |     _codecs
import time:      1872 |       2392 |   codecs
import time:      1831 |       1831 |   encodings.aliases
import time:      2339 |       6560 | encodings
import time:       704 |        704 | encodings.utf_8
import time:       289 |        289 | _signal
import time:       113 |        113 |     _abc
import time:       709 |        821 |   abc
import time:       730 |       1551 | io
import time:       142 |        142 |       _stat
import time:       472 |        614 |     stat
import time:      1925 |       1925 |     _collections_abc
import time:       282 |        282 |       genericpath
import time:       408 |        690 |     posixpath
import time:      1777 |       5005 |   os
import time:       358 |        358 |   _sitebuiltins
import time:       913 |        913 |   _distutils_hack
import time:       191 |        191 |   sitecustomize
import time:      2005 |       8471 | site

import of site went from 21774 to 8471.