pypa / pipenv

Python Development Workflow for Humans.
https://pipenv.pypa.io
MIT License
24.88k stars 1.87k forks source link

Pipenv lock still takes way too long to complete #4430

Closed mjpieters closed 4 years ago

mjpieters commented 4 years ago

pipenv lock is still taking up to 30 minutes to complete for my medium-sized project.

While #4403 was closed as fixed and indeed no full tree copies are made any more, pipenv lock is still unworkably slow. What used to take 50 seconds under 2018.11.26, now takes 30 minutes to figure out that nothing changed:

$ export PIPENV_INSTALL_TIMEOUT=10000
$ cp Pipfile.lock Pipfile.lock.before
$ time pipenv lock
Locking [dev-packages] dependencies…
Building requirements...
Resolving dependencies...
✔ Success!
Locking [packages] dependencies…
Building requirements...
Resolving dependencies...
✔ Success!
Updated Pipfile.lock (0e1198)!

real    29m19.701s
user    28m39.300s
sys 4m8.222s
$ diff -su Pipfile.lock.old Pipfile.lock
Files Pipfile.lock.old and Pipfile.lock are identical

This is making it untenable to keep using Pipenv.

Full pipenv lock -v output in a gist.

Pipfile:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]
flake8 = "*"
flake8-bugbear = "*"
black = "==19.10b0"
pre-commit = "*"
pytest = "*"
pytest-cov = "*"

[packages]
project-name = {editable = true,path = "."}
"zc.buildout" = "*"

[requires]
python_version = "3.6"

where project-name (redacted name) lists the following install-requires lines:

install_requires =
    apache-airflow[aws,celery,postgres,redis] >= 1.10.11
    airflow_multi_dagrun
    airflow-prometheus-exporter

$ pipenv --support Pipenv version: `'2020.8.13'` Pipenv location: `'/usr/local/Cellar/pipenv/2020.8.13/libexec/lib/python3.8/site-packages/pipenv'` Python location: `'/usr/local/Cellar/pipenv/2020.8.13/libexec/bin/python3.8'` Python installations found: - `3.9.0`: `/Users/mj/bin/python3.9` - `3.8.5`: `/usr/local/bin/python3` - `3.8.5`: `/usr/local/bin/python3.8` - `3.8.3`: `/Users/mj/bin/python3.8` - `3.7.7`: `/Users/mj/bin/python3.7` - `3.7.6`: `/opt/miniconda3/bin/python3` - `3.7.6`: `/opt/miniconda3/bin/python3.7m` - `3.7.6`: `/opt/miniconda3/bin/python3.7` - `3.7.3`: `/Users/mj/.pyenv/versions/3.7.3/bin/python3` - `3.6.10`: `/Users/mj/bin/python3.6` - `3.6.1`: `/Users/mj/bin/pypy3` - `3.5.9`: `/Users/mj/bin/python3.5` - `3.4.10`: `/Users/mj/bin/python3.4` - `3.3.7`: `/Users/mj/bin/python3.3` - `3.2.6`: `/Users/mj/bin/python3.2` - `2.7.18`: `/Users/mj/bin/python2.7` - `2.7.16`: `/usr/bin/python2.7` - `2.6.9`: `/Users/mj/bin/python2.6` - `2.5.6`: `/Users/mj/bin/python2.5` - `2.4.6`: `/Users/mj/bin/python2.4` PEP 508 Information: ``` {'implementation_name': 'cpython', 'implementation_version': '3.8.5', 'os_name': 'posix', 'platform_machine': 'x86_64', 'platform_python_implementation': 'CPython', 'platform_release': '17.7.0', 'platform_system': 'Darwin', 'platform_version': 'Darwin Kernel Version 17.7.0: Wed May 27 17:00:02 PDT ' '2020; root:xnu-4570.71.80.1~1/RELEASE_X86_64', 'python_full_version': '3.8.5', 'python_version': '3.8', 'sys_platform': 'darwin'} ``` System environment variables: - `PATH` - `AUTOJUMP_ERROR_PATH` - `NVM_INC` - `TERM_PROGRAM` - `GIT_PROMPT_END` - `rvm_bin_path` - `AUTOJUMP_SOURCED` - `GEM_HOME` - `NVM_CD_FLAGS` - `SHELL` - `TERM` - `HISTSIZE` - `TMPDIR` - `COPYFILE_DISABLE` - `IRBRC` - `Apple_PubSub_Socket_Render` - `CONDA_SHLVL` - `PERL5LIB` - `TERM_PROGRAM_VERSION` - `CONDA_PROMPT_MODIFIER` - `TERM_SESSION_ID` - `MY_RUBY_HOME` - `PERL_MB_OPT` - `LC_ALL` - `USER` - `HISTFILESIZE` - `NVM_DIR` - `COMMAND_MODE` - `CONDA_EXE` - `rvm_path` - `SSH_AUTH_SOCK` - `__CF_USER_TEXT_ENCODING` - `PTPIMG_API_KEY` - `_CE_CONDA` - `rvm_prefix` - `_` - `CONDA_PREFIX` - `PWD` - `HOMEBREW_GITHUB_API_TOKEN` - `EDITOR` - `LANG` - `ITERM_PROFILE` - `PYTHONSTARTUP` - `XPC_FLAGS` - `ITERM_ORIG_PS1` - `XPC_SERVICE_NAME` - `HISTCONTROL` - `_CE_M` - `FIGNORE` - `rvm_version` - `COLORFGBG` - `HOME` - `SHLVL` - `LC_TERMINAL_VERSION` - `ITERM_PREV_PS1` - `ITERM_SESSION_ID` - `PERL_LOCAL_LIB_ROOT` - `LOGNAME` - `CONDA_PYTHON_EXE` - `LC_CTYPE` - `GEM_PATH` - `GIT_PROMPT_ONLY_IN_REPO` - `CONDA_DEFAULT_ENV` - `NVM_BIN` - `DISPLAY` - `LC_TERMINAL` - `HGSRC` - `SECURITYSESSIONID` - `RUBY_VERSION` - `PERL_MM_OPT` - `COLORTERM` - `PIP_DISABLE_PIP_VERSION_CHECK` - `PYTHONDONTWRITEBYTECODE` - `PIP_SHIMS_BASE_MODULE` - `PIP_PYTHON_PATH` - `PYTHONFINDER_IGNORE_UNSUPPORTED` Pipenv–specific environment variables: Debug–specific environment variables: - `PATH`: `/usr/local/Cellar/pipenv/2020.8.13/libexec/tools:/usr/local/google-sdks/google-cloud-sdk/bin:/Users/mj/perl5/bin:/Users/mj/.nvm/versions/node/v12.14.1/bin:/Users/mj/.rvm/gems/ruby-2.3.0/bin:/Users/mj/.rvm/gems/ruby-2.3.0@global/bin:/Users/mj/.rvm/rubies/ruby-2.3.0/bin:/opt/miniconda3/bin:/opt/miniconda3/condabin:/Users/mj/bin:/Users/mj/Development/Library/zopatista_setup/bin:/Users/mj/Development/Library/zopatista_setup/audio:/usr/local/Cellar/ruby/2.7.1_2/bin:/Users/mj/.cargo/bin:/Users/mj/.poetry/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/MacGPG2/bin:/usr/local/sbin:/opt/X11/bin:/Applications/Postgres.app/Contents/Versions/latest/bin:/Users/mj/.local/bin:/Users/mj/.rvm/bin:/Applications/Visual Studio Code.app/Contents/Resources/app/bin` - `SHELL`: `/bin/bash` - `EDITOR`: `vim` - `LANG`: `en_GB.UTF-8` - `PWD`: `/Users/mj` ---------------------------
conradg commented 4 years ago

Thanks for raising this @mjpieters, thought I was going crazy. Downgrading to 2018.11.26 speeds things up massively for me too

witsch commented 4 years ago

Thanks for raising this @mjpieters, thought I was going crazy. Downgrading to 2018.11.26 speeds things up massively for me too

Yes, +1! Thanks for looking into this, Martijn! We had exactly the same issue (lots of big tmp directories), but sorta gave up for the time being… I couldn't really argue much when pipenv lock took 10+ minutes for simply adding another package.

(and hi, btw 👋)

mgmarino commented 4 years ago

A "workaround" that may work (and what we have used from time to time) is to remove the lock file and rerun the lock. This may also hint at the underlying problem, which may have something to do with how pipenv is using the existing lock file during the lock process.

Can-Sahin commented 4 years ago

Deleting the Pipfile.lock speeds things up for sure. I just install all the packages from scratch and run pipenv lock without changing anything and it gets stuck eventhough there is no change at all

frostming commented 4 years ago

Can you have a try on the master branch? I recently disabled the copying of the local source tree.

mjpieters commented 4 years ago

Can you have a try on the master branch? I recently disabled the copying of the local source tree.

Do you mean #4421? That's been part of the 2020.8.13 release that I tested with for this issue. I did note that the tree copying no longer takes place.

I also don't see any changes on master since 2020.8.13 was released that look relevant. Did I miss something?

frostming commented 4 years ago

If I remember correctly, 2018.11.26 didn't resolve transient dependencies of an editable package. You can compare the lock files generated by both versions.

mjpieters commented 4 years ago

If I remember correctly, 2018.11.26 didn't resolve transient dependencies of an editable package. You can compare the lock files generated by both versions.

I see no differences that look problematic. With 2020.8.13.dev0 (master branch) I see markers entries added. The dependencies I see added are ones that were due to other dependencies having been upgraded (I deleted the lock file to see if there was much speed difference).

With the lock file deleted and using 2020.8.13.dev0, a pipenv lock took 10 minutes.

frostming commented 4 years ago

I removed all other dependencies except the editable one. It turns out no significant difference is seen in the time costs of 2018.11.26 and 2020.8.13. It will be great if you can narrow down which package brings the performance degrading.

I created a demo repo to showcase the issue, which you can clone and adjust to reproduce the performance issue: https://github.com/frostming/pipenv-lock-performance

$ docker build . --build-arg PIPENV_VER=2020.8.13 -t pipenv-bug
$ docker build . --build-arg PIPENV_VER=2018.11.26 -t pipenv-bug
mjpieters commented 4 years ago

Sorry, I meant to get back to this but current project work is keeping me over-utilised. I do still find Pipenv to be excruciatingly slow, and regularly have to delete the lock file just to get a solution, but that's not really very actionable feedback.

alexandrerays commented 3 years ago

I'm having this problem in February 2021. Even deleting lock file, it is still very slow.

tombohub commented 3 years ago

Same here, deleting lock file doesn't help

GiuseppeLaurenza commented 3 years ago

same here, there are any news?

Eortvald commented 3 years ago

same here

Real-Gecko commented 3 years ago

Version 2020.11.15, does not take half an hour for me, but couple of minutes is too much for creating 40 KiB JSON file.

chialunwu commented 3 years ago

Why pipenv why so slow? Why?

Nytelife26 commented 3 years ago

This issue is still present in April of 2021. I have yet to look into why this is, but I know it doesn't have to be this way because there are other managers that work better and faster.

For instance, I recommend using Poetry until this is fixed. I know it seems counter-intuitive and disrespectful to recommend another project, but as of the present moment, the speed of pipenv makes it unusable, and I'd rather put an alternative on the table than leave people to wait.

Real-Gecko commented 3 years ago

I did not investigate the code, but I think it happens because on every run Pipenv checks for updates even though no one asks for this, thus it makes a lot of queries, calculates a lot of hashes for Pipfile.lock etc. So the bigger your .venv is the worse things get. For example right now I installed one package, but instead of just installing it and it's dependency Pipenv also updated three packages. As I already had 101 packages installed you can imagine the process was quite time consuming.

image

CharString commented 3 years ago

On 22 April 2021 02:53:29 CEST, Tyler J Russell @.***> wrote:

This issue is still present in April of 2021. I have yet to look into why this is, but I know it doesn't have to be this way because there are other managers that work better and faster.

For instance, I recommend using Poetry until this is fixed.

Or have a look at pdm. You can make it use venvs instead of pypackages, if you want. And it can import your Pipfile.

Nytelife26 commented 3 years ago

Or have a look at pdm. You can make it use venvs instead of pypackages, if you want. And it can import your Pipfile.

Interesting. I'll take a look later. I like Poetry, though, because it addresses another issue I noticed with pipenv - it uses a standard from PEP 518 that's likely to work with other toolchains (pyproject.toml as opposed to Pipfile).

Nytelife26 commented 3 years ago

image

Interestingly enough, that's actually the same with package managers from other ecosystems, npm, yarn, etc, all update the lockfile when you install the project unless you specify otherwise. They shouldn't, but it proves a point - it doesn't have to be this painfully slow at all.

Real-Gecko commented 3 years ago

The biggest problem IMO is this spinner: image It's not really informative, one cannot tell exactly what's exactly going on right now. What if I have network problems and Pipenv experiences request timeouts? I wouldn't know, because all progress is hidden behind spinner.

felix-ht commented 3 years ago

just move to poetry - faster locks overall better performance

Nytelife26 commented 3 years ago

just move to poetry

That's not really constructive at all. It's better to work on helping people here. I only suggested it as a temporary solution so people don't have to deal with the hassle while they wait for a fix.

You wouldn't walk into Google's headquarters and tell everyone to "just use DuckDuckGo", would you? It's basic respect.

Nytelife26 commented 3 years ago

I'll be running tests later. I think the best way to resolve this is to time each stage of the lock and see which ones take the longest so we can determine where the problem comes from and attempt to fix it.

Real-Gecko commented 3 years ago

I bet it'll be network, sending requests to check for updates, checking dependencies for updates, downloading updates etc. Maybe if output would be more verbose we would pinpoint problem quicker.

mgmarino commented 3 years ago

I'll be running tests later. I think the best way to resolve this is to time each stage of the lock and see which ones take the longest so we can determine where the problem comes from and attempt to fix it.

@Nytelife26 A thanks-in-advance for this. As this issue is closed, I think it'd be helpful to perhaps post the results of your efforts in one or more new issues. Here's hoping you'll find something actionable that the maintainers (or we in the community) can do to improve the situation.

Nytelife26 commented 3 years ago

I bet it'll be network

I do highly doubt that. Other package management software for other ecosystems do the same thing, but lightyears faster. Unless pipenv uses a dinosaur of a networking system, it seems unlikely.

if output would be more verbose we would pinpoint problem quicker The output alone isn't useful - we need to know the timing. We do already have a sample of verbose output, though, from earlier on in this issue.

Nytelife26 commented 3 years ago

A thanks-in-advance for this.

That's absolutely fine. Always happy to help.

post the results [...] in one or more new issues.

Good idea, unless we can attract maintainer attention to get this reopened.

CharString commented 3 years ago

but lightyears faster.

When we report the timing in our verbose output, we should use units of time; lightyears are a unit of distance (as are parsecs Han Solo)

Real-Gecko commented 3 years ago

but lightyears faster.

When we report the timing in our verbose output, we should use units of time; lightyears are a unit of distance (as are parsecs Han Solo)

If you have nothing constructive to say, it's better to stay quiet.

Nytelife26 commented 3 years ago

If you have nothing constructive to say, it's better to stay quite.

No need to be rude. I appreciated the humour honestly - we're engineers, not lawyers, we're allowed to be funny. You'd love some of the original UNIX design documents, developers have a real history of messing around.

Anyways, yes, I'll be reporting the timings here in seconds, not lightyears.

requirements.txt found, instead of Pipfile! Converting...
✘ Failed... 
Warning: Your Pipfile now contains pinned versions, if your
requirements.txt did. 
We recommend updating your Pipfile to specify the "*" version, instead.
Pipfile.lock not found, creating...
start - 7.152557373046875e-07
keep outdated - 8.58306884765625e-05
cleanup lockfile - 0.0009396076202392578
inherit dev - 0.009583711624145508
Locking [dev-packages] dependencies...
Locking [packages] dependencies...
Building requirements...
Resolving dependencies...
✔ Success! 
resolve deps - 13.933054447174072
Updated Pipfile.lock (ac73fc)!
overwrite dev - 13.93443512916565
Installing dependencies from Pipfile.lock (ac73fc)...
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 38/38 — 00:00:15

For some strange reason it isn't as slow on my machine anymore, only around 30 seconds. However, the bulk of the time is most definitely spent resolving and installing dependencies (in my testing, anyone else should feel free to verify this on a machine where performance is still poor) - and I'm not sure how easy it'll be to improve that.

Real-Gecko commented 3 years ago

Here's the Pipfile from my current project:

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
boto3 = "*"
celery = "*"
django = "*"
django-allauth = "*"
django-bootstrap4 = "*"
django-environ = "*"
django-filter = "*"
django-ical = "*"
django-icons = "*"
django-import-export = "*"
django-logentry-admin = "*"
django-menu-generator = "*"
django-modeltranslation = "*"
django-money = "*"
django-pandas = "*"
django-phonenumber-field = "*"
django-redis = "*"
django-rest-auth = "*"
django-reverse-admin = "*"
django-storages = "*"
django-suit = {editable = true, git = "https://github.com/darklow/django-suit.git"}
django-tables2 = "*"
django-vies = "*"
django-widget-tweaks = "*"
djangorestframework = "*"
drf-yasg = "*"
fcm-django = "*"
gunicorn = "*"
monthdelta = "*"
pandas = "*"
phonenumbers = "*"
pillow = "*"
psycopg2-binary = "*"
pymsteams = "*"
redis = "*"
rules = "*"
slack-sdk = "*"
whitenoise = "*"

[dev-packages]
argh = "*"
bpython = "*"
coverage = "*"
django-debug-toolbar = "*"
django-extensions = "*"
flower = {editable = true, git = "https://github.com/mher/flower.git"}
watchdog = "*"

[requires]
python_version = "3.8"

In the beginning it was relatively fast too, but as the number of packages grew locking became slower. Currently it's around ~4 minutes.

tombohub commented 3 years ago

I am using WSL on windows and it was taking literally half an hour or so to finish with locking

Real-Gecko commented 3 years ago

How can I run Pipenv with timings?

Nytelife26 commented 3 years ago

How can I run Pipenv with timings?

You have to modify the do_lock function in core.py in the root package directory to print a time.time() between calls.

It's a bit of a hacky solution, but as far as I'm aware pipenv has no built in profiling.

mgmarino commented 3 years ago

pipenv spawns a number of subprocesses during the lock, does this take any of that into account?

edit I have found this is also why a lot of the output is hidden during verbose running; the stdout/stderr of the worker processes are not propagated up.

Nytelife26 commented 3 years ago

pipenv spawns a number of subprocesses during the lock, does this take any of that into account?

It'll time exactly as long as each step takes to complete. Does it run the do_lock in worker threads, or spawn the worker threads from do_lock? I couldn't tell when I looked.

Real-Gecko commented 3 years ago

I've found the way to profile pipenv install, take a look at this repo for instructions. The result is *.calltree file which can be opened kcachegrind. I totally do not understand how to interpret this results, but I'm sure someone can figure it out.

image

Nytelife26 commented 3 years ago

I've found the way to profile pipenv install

Marvellous. Thank you.

I'll investigate later - although, at a glance, the presence of time.sleep is interesting.

mjpieters commented 3 years ago

I'll investigate later - although, at a glance, the presence of time.sleep is interesting.

I'm pretty sure that those are inside pexpect, used to monitor each pip process spawned. It just means that there is a lot of waiting around for those processes.

mjpieters commented 3 years ago

Yup, it's the vendored pexpect wait loop. expect_loop() function definition:

https://github.com/pypa/pipenv/blob/275f7e151eb0aa17702215165a371df7da9ad476/pipenv/vendor/pexpect/expect.py#L153-L184

and self.spawn.delayafterread is set to 0.0001 to release the GIL:

https://github.com/pypa/pipenv/blob/275f7e151eb0aa17702215165a371df7da9ad476/pipenv/vendor/pexpect/spawnbase.py#L72-L77

Note that there's no point in setting it to None here, as it is not pexpect that is the problem. It's the pip subprocess, which hasn't been profiled.

gshpychka commented 3 years ago

Help me understand - is this acknowledged and are there plans to address this?

gaetangr commented 3 years ago

I have implemented Pipenv with my team recently, but the time it take to install even one package is crazy, I tried with panda, it took between 5mn to 8mn.

41 packages takes around 20mn to install.

I have tried the following :

Nothing work so far

Luca-Terrazzan commented 3 years ago

Same problem for me; locking is so slow that we are currently within the very dangerous situation of avoiding to update the lockfile whenever possible to avoid losing hours of waiting times

trallnag commented 3 years ago

To all the people thinking that Poetry is the answer and is perfectly fast in all cases: You are wrong. Both Pipenv and Poetry are having these issues partly due to the nature how dependency management works / does not work in Python.

Real-Gecko commented 3 years ago

To all the people thinking that Poetry is the answer and is perfectly fast in all cases: You are wrong. Both Pipenv and Poetry are having these issues partly due to the nature how dependency management works / does not work in Python.

Totally agree, with lot's of dependencies it's slow too.

khimaros commented 3 years ago

anecdotally, switching my medium sized project to poetry solved all of my performance issues.

FlavourDave92 commented 3 years ago

Same problem for me; locking is so slow that we are currently within the very dangerous situation of avoiding to update the lockfile whenever possible to avoid losing hours of waiting times

Same, this simply feels wrong but often locking just takes ages…

sommelon commented 3 years ago

I am using WSL on windows and it was taking literally half an hour or so to finish with locking

I'm also trying to use pipenv lock in the WSL and it takes much much longer compared to locking in Windows. Don't know if it's a problem with, WSL, ubuntu or pipenv or something else.