adang1345 / delvewheel

Self-contained Python wheels for Windows
MIT License


delvewheel

delvewheel is a command-line tool for creating self-contained Python wheel packages for Windows that have DLL dependencies that may not be present on the target system. It is functionally similar to auditwheel (for Linux) and delocate (for macOS).

Suppose that you have built a Python wheel for Windows containing an extension module, and the wheel depends on DLLs that are present in the build environment but may not be present on the end user's machine. This tool determines which DLLs a wheel depends on (aside from system libraries) and copies those DLLs into the wheel. This tool also takes extra steps to avoid DLL hell and to ensure that the DLLs are properly loaded at runtime.

Installation

delvewheel can be installed using pip.

pip install delvewheel

You can also install from the source code by opening a command-line shell at the repository root and running

pip install .

Supported Platforms

delvewheel can be run using Python 3.8+ on any platform.

delvewheel can repair wheels targeting Python 2.6+ for win32, win_amd64, or win_arm64.

The environment used to run delvewheel does not need to match the target environment of the wheel being repaired. For example, you can run delvewheel using 32-bit Python 3.8 to repair a wheel for 64-bit Python 2.6. You can even run delvewheel with PyPy3.6 on 32-bit x86 Linux to repair a wheel whose target environment is CPython 3.11 on Windows arm64.

Usage

delvewheel show: show external DLLs that the wheel depends on

delvewheel repair: copy external DLL dependencies into the wheel and patch the wheel so that these libraries are loaded at runtime

delvewheel needed: list the direct DLL dependencies of a single executable

delvewheel uses the PATH environment variable to search for DLL dependencies. To specify an additional directory to search for DLLs, add the location of the DLL to the PATH environment variable or use the --add-path option.
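As a rough illustration of a PATH-based lookup, here is a minimal Python sketch. It is not delvewheel's actual implementation: the function name find_dll and the extra_dirs parameter are hypothetical, and searching the extra directories before PATH is an assumption made only for this example.

```python
import os
from pathlib import Path
from typing import Iterable, Optional


def find_dll(name: str, extra_dirs: Iterable[str] = ()) -> Optional[Path]:
    """Return the first file called `name` found in extra_dirs or on PATH.

    Hypothetical helper for illustration only. The search order used here
    (extra directories first, then PATH) is an assumption.
    """
    path_dirs = os.environ.get("PATH", "").split(os.pathsep)
    for directory in [*extra_dirs, *path_dirs]:
        if not directory:
            continue  # skip empty PATH entries
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

Note that the match is only as case-sensitive as the underlying filesystem, so a lookup run on Linux may miss a DLL whose filename differs in case from the name recorded in the wheel.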

For a summary of additional command-line options, use the -h option (delvewheel -h, delvewheel show -h, delvewheel repair -h, delvewheel needed -h).

Additional Options

The path separator to use in the following options is ';' on Windows and ':' on Unix-like platforms.

delvewheel show

delvewheel repair

Version Scheme

Semantic versioning is used.

Name Mangling

This section describes in detail how and why delvewheel mangles the vendored DLL filenames by default. It is fairly technical, so feel free to skip it if it's not relevant to you.

Suppose you install two Python extension modules A.pyd and B.pyd into a single Python environment, where the modules come from separate projects. Each module depends on a DLL named C.dll, so each project ships its own C.dll. Because of how the Windows DLL loader works, if A.pyd is loaded before B.pyd, then both modules end up using A.pyd's version of C.dll. Windows does not allow two DLLs with the same name to be loaded in a single process (unless you have a private SxS assembly, but that's a complicated topic that's best avoided in my opinion). This is a problem if B.pyd is not compatible with A.pyd's version of C.dll. Maybe B.pyd requires a newer version of C.dll than A.pyd. Or maybe the two C.dlls are completely unrelated, and the two project authors by chance chose the same DLL name. This situation is known as DLL hell.

To avoid this issue, delvewheel renames the vendored DLLs. For each DLL, delvewheel computes a hash based on the DLL contents and the wheel distribution name and appends the hash to the DLL name. For example, if the authors of A.pyd and B.pyd both decided to use delvewheel as part of their projects, then A.pyd's version of C.dll could be renamed to C-a55e90393a19a36b45c623ef23fe3f4a.dll, while B.pyd's version of C.dll could be renamed to C-b7f2aeead421653280728b792642e14f.dll. Now that the two DLLs have different names, they can both be loaded into a single Python process. Even if only one of the two projects used delvewheel, the two DLLs would still have different names, and DLL hell would be avoided.
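The renaming scheme can be sketched in a few lines of Python. This is an assumption-laden illustration, not delvewheel's code: the function name mangled_name is hypothetical, and the choice of SHA-256 truncated to 32 hex characters is made only to match the length of the example names above. The hash function delvewheel actually uses is not stated here.

```python
import hashlib
from pathlib import PurePath


def mangled_name(dll_name: str, dll_bytes: bytes, distribution: str) -> str:
    """Append a hash of the DLL contents and distribution name to the DLL name.

    Hypothetical sketch: the hash algorithm and truncation length are
    assumptions chosen for illustration.
    """
    digest = hashlib.sha256()
    digest.update(dll_bytes)
    digest.update(distribution.encode("utf-8"))
    tag = digest.hexdigest()[:32]
    path = PurePath(dll_name)
    return f"{path.stem}-{tag}{path.suffix}"
```

Because the distribution name feeds into the hash, two projects that ship byte-identical copies of C.dll still end up with different mangled names.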

Simply renaming the DLLs is not enough, though, because A.pyd is still looking for C.dll. To fix this, delvewheel goes into A.pyd and finds its import directory table, which tells the Windows loader the names of the DLL dependencies. This table contains an entry with a pointer to the string "C.dll", which is embedded somewhere in A.pyd. delvewheel then finds a suitable location in A.pyd to write the string "C-a55e90393a19a36b45c623ef23fe3f4a.dll" and edits the import directory table entry to point to this string. Now, when A.pyd is loaded, it knows to look for C-a55e90393a19a36b45c623ef23fe3f4a.dll.
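The idea of writing the new name elsewhere and repointing the table entry can be shown with a toy model in Python. This is deliberately not a PE parser: the "image" below is just a 4-byte little-endian pointer followed by a null-terminated string, the helper names repoint_name and read_name are hypothetical, and appending to the end of the buffer stands in for whatever "suitable location" delvewheel actually selects. Real import directory entries hold relative virtual addresses rather than raw file offsets.

```python
import struct


def repoint_name(image: bytearray, entry_offset: int, new_name: str) -> int:
    """Write new_name at the end of image and point the entry at it.

    Toy model only: the entry is a 4-byte little-endian offset into image.
    """
    new_offset = len(image)
    image.extend(new_name.encode("ascii") + b"\x00")
    struct.pack_into("<I", image, entry_offset, new_offset)
    return new_offset


def read_name(image: bytes, entry_offset: int) -> str:
    """Follow the pointer at entry_offset and read the null-terminated name."""
    (start,) = struct.unpack_from("<I", image, entry_offset)
    end = image.index(b"\x00", start)
    return image[start:end].decode("ascii")


# Toy image: one pointer at offset 0 that refers to "C.dll" at offset 4.
image = bytearray(struct.pack("<I", 4) + b"C.dll\x00")
repoint_name(image, 0, "C-a55e90393a19a36b45c623ef23fe3f4a.dll")
```

In this model the old string "C.dll" is left in place and simply no longer referenced. The longer mangled name could not be written over it without running into the bytes that follow, which is why a separate location is needed.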

So far, we have described the simplest possible example where there exists one Python extension module with one DLL dependency. In the real world, DLL dependency relationships are often more complicated, and delvewheel can handle them as well. For example, suppose a project has the following properties.

delvewheel would execute the following when name-mangling.

Limitations