indygreg / PyOxidizer

A modern Python application packaging and distribution tool
Mozilla Public License 2.0

Consider rebuilding distributions when `pip` v21.3 comes out; "in-tree-build" notably faster #416

Open EndilWayfare opened 3 years ago

EndilWayfare commented 3 years ago

TL;DR - Passing `--use-feature=in-tree-build` to pip (via `export PIP_USE_FEATURE=in-tree-build` before `pyoxidizer build`) speeds up builds, especially if the directory containing `setup.py` has many small files, because pip doesn't have to copy your local package to a temporary directory. This behavior will become the default in pip 21.3, so it would be nice to have that version bundled with the official pyoxidizer Python distributions.
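For anyone who drives builds from a script, the environment-variable route can be sketched in Python as below. This is only a minimal sketch: it assumes `pyoxidizer` is on `PATH`, and the helper names `build_env` and `run_build` are made up for illustration. pip reads `PIP_USE_FEATURE` as the environment-variable form of `--use-feature`.

```python
import os
import subprocess


def build_env(base=None):
    """Return a copy of the environment with pip's in-tree-build preview enabled."""
    env = dict(os.environ if base is None else base)
    # pip maps the PIP_USE_FEATURE variable to its --use-feature option.
    env["PIP_USE_FEATURE"] = "in-tree-build"
    return env


def run_build(project_dir="."):
    """Run `pyoxidizer build` in project_dir with the feature flag set.

    Assumes the pyoxidizer executable is on PATH (hypothetical setup).
    """
    return subprocess.run(
        ["pyoxidizer", "build"],
        cwd=project_dir,
        env=build_env(),
        check=True,
    )
```

Calling `run_build()` from the project root should behave like the `export` plus `pyoxidizer build` pair above, without changing the environment of the calling shell.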

While iterating on a `pyoxidizer.bzl` config and building frequently, each build took ~2m20s (using system Rust and already-cached PyPI packages), which felt like a pretty big speed bump. I noticed that it would often pause for a considerable amount of time on "Processing {path-to-local-package}". I already had my dependencies specified in a `setup.py` before adding pyoxidizer, so `exe.add_python_resources(exe.pip_install([CWD]))` felt most natural. I tried listing the dependencies explicitly in the `.bzl` instead, followed by an `exe.read_package_root` for the local package, and that dropped the build time to ~1m30s. Maybe I should switch to a `requirements.txt` or something if there's significant savings?

I noticed, though, that there was a deprecation notice right before the big "Processing" pause. It said that a future version of pip will build local packages in place without copying to a temporary directory, and that you can preview that behavior with `--use-feature=in-tree-build`. I tried that, with the original `exe.pip_install([CWD])` form, and it dropped the build time further to ~1m15s. (The local package is very simple, so it's possible that setuptools can make assumptions that the general-purpose `read_package_root` can't?) I suppose that the standard pip behavior wanted to copy the entire repo to a temporary folder, since `setup.py` is at the root, and that includes all my Rust, all my test data and other artifacts, and the entire virtualenv I use for non-oxidized development and testing! That's silly, especially because there are a lot of small files, so I'm glad there's already a way to work around this.
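A rough way to see how much an out-of-tree build would have to copy is to count the files under the directory that holds `setup.py`. The sketch below uses a hypothetical `tree_stats` helper and only approximates the cost, since it does not try to reproduce whatever files pip itself skips when copying.

```python
import os


def tree_stats(root):
    """Return (file_count, total_bytes) for every regular file under root."""
    file_count = 0
    total_bytes = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinked files so targets outside the tree are not counted.
            if os.path.islink(path):
                continue
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes
```

Comparing the result for the repository root with the result for just the Python package directory gives a feel for how much of the copy is unrelated content such as a virtualenv, build artifacts, or test data.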

I don't know how characteristic my setup is of pyoxidizer users, but I can imagine that it's one of the more ergonomic ways to oxidize an existing Python project that you control the source of.

I also don't know what the release cadence of the pyoxidizer Python distributions is, but I figured I should point this out on the off-chance you feel like updating them. Regardless, I hope this info can help some other people out!

indygreg commented 3 years ago

I didn't know about this in-tree-build feature: thanks for letting me know!

I update the Python distributions in response to new releases. Pip is updated routinely, so the distributions will get pip 21.3 when it is released. (The repository currently has 21.1.3 but the builds haven't been synchronized to PyOxidizer yet.)

Using `exe.pip_install([CWD])` to install/collect resources for your local Python package is the preferred mechanism for doing so, as it goes through Python's packaging tools. Contrast with `read_package_root` or `read_virtualenv`, which simply walk the filesystem and may not account for architecture/platform differences. See https://pyoxidizer.readthedocs.io/en/stable/pyoxidizer_packaging_python_files.html#choosing-which-packaging-method-to-call for more on this topic.

Given the performance overhead until `in-tree-build` is pip's default, I'm wondering if PyOxidizer should enable it by default in the meantime. I just don't know if it is stable enough yet. With a lot of these features, bugs aren't found until the feature ships enabled by default, so it is probably safer to leave it disabled for now.

Note that the Starlark `pip_install` function takes arguments to pass to `pip install`. So you could do something like `exe.pip_install(["--use-feature", "in-tree-build", CWD])` to enable the feature and get speed gains.
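In a `pyoxidizer.bzl`, that could look roughly like the excerpt below. Treat it as a sketch rather than a drop-in config: the target name `myapp` is a placeholder, only `make_exe()` is shown, and the calls around `pip_install` should be checked against the PyOxidizer version in use.

```python
# Excerpt of a pyoxidizer.bzl (Starlark). Only make_exe() is shown; the
# target registration in the rest of the file stays as it was.
def make_exe():
    dist = default_python_distribution()

    exe = dist.to_python_executable(
        name = "myapp",  # placeholder application name
    )

    # Every item in the list is passed through to `pip install`, so the
    # feature flag goes in front of the path to the local package.
    exe.add_python_resources(
        exe.pip_install(["--use-feature", "in-tree-build", CWD])
    )

    return exe
```

Keeping the flag in the config file means everyone building the project gets the same behavior, whereas the `PIP_USE_FEATURE` environment variable only affects the shell it is set in.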

EndilWayfare commented 3 years ago

> Note that the Starlark `pip_install` function takes arguments to pass to `pip install`. So you could do something like `exe.pip_install(["--use-feature", "in-tree-build", CWD])`

👍