pradyunsg opened 7 years ago
I did notice one can save about 80ms by using the option to disable the version check - it'd probably make sense to skip the version check entirely for completions by default. Still, a lot remains to be improved.
I wonder how much of this time is taken up by importing stuff.
$ time .tox/py36/bin/python -c "from pip._internal import main"
A rudimentary test of running the above 5 times gives me an average of 0.377s.
Importing pip._vendor.pkg_resources is what's taking most of the time on my machine. There are a few changes in setuptools 36.4 and above that will help a little: https://github.com/pypa/setuptools/blob/master/CHANGES.rst#v3640
master is vendoring setuptools 36.4.0 currently...
Yes, and it's faster than pip 9.0, but there are some further changes in setuptools 36.5 that might help: https://github.com/pypa/setuptools/blob/master/CHANGES.rst#v3650.
python -c 'import pip._vendor.pkg_resources' 0.35s user 0.03s system 100% cpu 0.385 total
python -c "from pip._vendor import pkg_resources" 0.21s user 0.02s system 99% cpu 0.233 total
python -c "from pip._vendor import pkg_resources" 0.19s user 0.00s system 99% cpu 0.192 total
Oh, nice. That means the next round of vendor updates would bring some speedup. :)
I, personally, am waiting on a new distlib release before giving the vendored libraries another round of updates.
I fired up the profiler and ran pip completion --fish. Here's what I got (all percentages in terms of total time):
- pip._internal.__init__: 85%
- pip._internal.cmdoptions.__init__: 79%
- pip._internal.index: 78% (basically all of the above time is spent in this import)
- pip._vendor.html5lib: 15.8%
- pip._vendor.requests: 9.0%
- pip._vendor.distlib: 5.2%
- pip._vendor.packaging: 7.1%
- pip._internal.download: 35.4%
- pip._internal.utils.logging: 30.4%
- pip._internal.utils.misc: 29.6%
- pip._vendor.pkg_resources: 28.8%
- pip._internal.__init__.main(): 15%
- parseopts(): 0.19%
- command.main(): 14.7%

PS: Need better profiling tools.
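For reference, a rough breakdown like the one above can be pulled from the stdlib profiler; this is a sketch, not the tool used for those numbers, and `json` stands in for a heavy module like pip._internal:

```python
import cProfile
import io
import pstats

# Profile an import and report the top entries by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
import json  # stand-in for a heavy module such as pip._internal
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # top 5 entries by cumulative time
print(stream.getvalue())
```

Note that if the module is already cached in sys.modules, the profile only shows the cheap cache lookup, so this is most useful in a fresh interpreter.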
Lazy importing will probably solve some of those.
There are plans to (hopefully) make lazy importing easy to switch on for CLI apps like pip in Python 3.7. There is also now a -X importtime argument to CPython 3.7, as well as dtrace/systemtap support, to help track where import time goes when profiling this sort of thing.
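For context, the stdlib already offers an opt-in lazy-import mechanism via importlib.util.LazyLoader (this is the recipe from the importlib docs, not what pip ships; `lazy_import` is just a local helper name):

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose actual execution is deferred to first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the deferral; no module code runs yet
    return module

json_mod = lazy_import("json")           # cheap: nothing executed yet
print(json_mod.dumps({"lazy": True}))    # attribute access triggers the real import
```

The catch for a CLI app is that errors from a broken module surface at first use rather than at import time, which complicates debugging.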
@benoit-pierre
master+updated pkg_resources: python -c "from pip._vendor import pkg_resources" 0.19s user 0.00s system 99% cpu 0.192 total
What does "updated pkg_resources" mean? On my machine the initial import is still quite slow even though I have setuptools 38.1.0, so your comment seems very interesting to me! :P
@boxed: it means with pip's vendored version of setuptools updated.
Aha. I tried copying the pkg_resources from my main install over the one inside pip/_vendor, but I didn't see any difference in speed :/
Using CPython 3.7.0's -X importtime:
import time: self [us] | cumulative | imported package
[snip]
import time: 654 | 506881 | pip._internal
[snip]
Just in case y'all are unaware, the pkg_resources slowness is being tracked in https://github.com/pypa/setuptools/issues/510; I didn't see it linked in this issue yet.
Could you not move the imports from the top of the file into the functions that need them?
@csdummi I have a PR that does this. It helps somewhat.
Could I have a link?
Here is a PR that improves the import situation for the vcs imports: https://github.com/pypa/pip/pull/6545 It removes the pip._internal.vcs imports from pip/_internal/__init__.py. This will make it easy to remove vcs imports from the common case, if desired, which can be done in a subsequent commit.
Is an external CLI framework an option?
@CSDUMMI What do you mean by external CLI framework?
@csdummi I'd say no, not at this time at least. But I do have some ideas for that at https://github.com/boxed/p but it's way too early for that.
@pradyunsg I did such a thing a few weeks ago: https://github.com/CSDUMMI/cli_framework, this is not optimized of course. But I think of something like this: https://oclif.io/
@boxed Why is it not yet appropriate?
@CSDUMMI I think both oclif and your code have the wrong idea, where git has the right idea (for this purpose at least). OCLIF is the more egregious one to me when they say "it's open!" but then it's JS only :( My project is a general dispatcher/configger thingie that is language agnostic.
I don't think it's appropriate yet because no solid system of that kind exists today. We should absolutely build one, but we should build it separately and use it to wrap pip, pytest, venv, etc. to make a nice Python development workflow, and when it's ready we can convince people to use it.
And how do you think should we build such a thing? What is the right idea git has?
@boxed I have to say your idea in p is very good, but your code needs more comments and documentation in general.
It needs to be used and polished most of all I think. But yea docs is also super important.
@boxed @CSDUMMI Could you please move this conversation elsewhere? It's unrelated to this issue's topic.
But what is wrong with my project? What is the right idea git has?
@pradyunsg I think so too. That's why I want to know why we shouldn't move the implementation of the CLI optimization to another place, too!
Let's move the discussion to https://github.com/boxed/p/issues/4
FYI, PR #6694 ("Only import a Command class when needed") was recently merged, which helps with this.
I posted PR #6835 to help with this.
I just posted PR #6843 to continue the work in PR #6835. The PR trims unneeded imports by making it so that commands not requiring downloading / PackageFinder will no longer import that machinery.
I noticed a pretty significant slowdown in the latest released version -- I've tracked it down to here -- might be worth bumping pyparsing once that gets resolved: https://github.com/pyparsing/pyparsing/issues/362
I also looked into pip startup time a bit. The first thing I noticed is tenacity's import of asyncio, but it seems like @ichard26 already took care of it (thanks!).
Another one I noticed is the chardet import. requests supports either chardet or charset_normalizer. From a quick experiment replacing the vendored chardet import with a non-vendored charset_normalizer import, I get ~27ms for chardet vs. ~7ms for charset_normalizer, using -X importtime on my admittedly ~10-year-old laptop.
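Per-module comparisons like that can be scripted around -X importtime, which writes a per-module breakdown to the child interpreter's stderr; a sketch, with `import_cost_us` as a hypothetical helper and json/decimal as arbitrary stand-ins:

```python
import re
import subprocess
import sys

def import_cost_us(module):
    """Cumulative import time of `module` in a fresh interpreter, in microseconds."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    # -X importtime emits lines to stderr such as:
    #   import time:       654 |     506881 | pip._internal
    # The entry naming the module itself carries its cumulative cost.
    for line in reversed(result.stderr.splitlines()):
        match = re.match(
            r"import time:\s+\d+\s+\|\s+(\d+)\s+\|\s+" + re.escape(module) + r"\s*$",
            line,
        )
        if match:
            return int(match.group(1))
    return None

print(import_cost_us("json"), "vs", import_cost_us("decimal"))
```

Running each candidate in its own fresh interpreter avoids the sys.modules cache skewing the numbers.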
If there is interest I can try to prepare a PR to replace the chardet vendor with a charset_normalizer vendor.
Note that we can only vendor pure-Python libraries. charset_normalizer would at the very least be tricky to vendor because we'd need to find a way to make our vendoring tools ignore the platform-specific wheels. Also, are your benchmarks using the pure-Python version? If not, they would need to be re-done to be meaningful.
I don't have any strong opinions on whether we should switch, I'm just noting these points as things to consider if we do.
Hmm, the charset_normalizer GitHub repo tagline says "in pure python" but I guess it's not :)
I just tried again with charset_normalizer-3.3.2-py3-none-any.whl and still get ~7ms so the pure python version looks good as well.
charset_normalizer was added as an optional dependency to requests when the apache-airflow team found the license for chardet wasn't suitable for them.
The developer did a lot to make it more acceptable to the requests maintainers, such as significantly reducing the number of dependencies. My understanding is charset_normalizer only has binaries based on compiling pure Python code with mypyc, and these aren't shipped by default.
If the developer is still as accommodating, I'd imagine pip would benefit in performance and ease of maintenance with charset_normalizer, but the first thing I would do is check with the developer that they are happy being vendored by pip.
OK, submitted PR #12638.
BTW, another big import-time hit is the packaging -> pyparsing import. I see that packaging has already replaced pyparsing with a hand-rolled parser, and there is a PR #12300 to update pip to use it. I checked the import time with #12300 and it's indeed a nice improvement.
With pyparsing, asyncio, and chardet gone it will be a decent improvement to startup time. The remaining big ones are rich and requests/urllib3, but I don't think there is much to do about these for pip install.
There's a lot to gain from speeding up pip's startup time.
For one, pip takes around 600ms just to print the completion text, which is laggy (as mentioned in #4755). Further, faster startup time might help with the test-suite situation too.
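End-to-end startup numbers like that 600ms figure can be reproduced with a small timing harness; a sketch, where `measure_startup_ms` is a hypothetical helper and the baseline command is just a bare interpreter:

```python
import subprocess
import sys
import time

def measure_startup_ms(cmd, runs=3):
    """Average wall-clock time, in ms, for a command to run to completion."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, capture_output=True)
        samples.append(time.perf_counter() - start)
    return 1000 * sum(samples) / len(samples)

# Bare-interpreter baseline; swap in e.g. ["pip", "completion", "--fish"]
# to see how much pip adds on top of interpreter startup itself.
baseline = measure_startup_ms([sys.executable, "-c", "pass"])
print(f"baseline: {baseline:.1f} ms")
```

Subtracting the bare-interpreter baseline separates CPython's own startup cost from the time pip spends importing its machinery.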