Open shubhbapna opened 2 days ago
It might look like a micro-optimization, but we run this command for almost every dependency, so the gains add up.
wheel unpack does more than just unzip the wheel. It also validates the integrity of the wheel by comparing the files against the recorded checksums, and it corrects file permissions.
If you are worried about process spawn overhead, then it would be better to reimplement unpacking and packing on top of wheel.wheelfile.WheelFile. It has a stable API.
To check the integrity of the wheel file, let's add it to _download_wheel_check, which currently only checks whether the wheel file is a valid zip file. This handles the case where we download a wheel from a cache server. I will open a separate ticket for that. #512
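For reference, here is a minimal sketch of what such an integrity check could look like using only the standard library. It assumes the wheel has exactly one top-level *.dist-info/RECORD file whose entries use the usual algorithm=urlsafe-base64 hash form. verify_wheel_record is a hypothetical helper name, not existing fromager code, and this is not how wheel unpack itself implements its validation.

```python
import base64
import csv
import hashlib
import io
import zipfile


def verify_wheel_record(wheel_path):
    """Hypothetical helper: return a list of problems; empty means all hashes matched."""
    problems = []
    with zipfile.ZipFile(wheel_path) as zf:
        # Assumption: RECORD lives in a single top-level *.dist-info directory.
        record_names = [
            name
            for name in zf.namelist()
            if name.endswith(".dist-info/RECORD") and name.count("/") == 1
        ]
        if len(record_names) != 1:
            return [f"expected exactly one RECORD file, found {len(record_names)}"]

        record_text = zf.read(record_names[0]).decode("utf-8")
        for row in csv.reader(io.StringIO(record_text)):
            if not row:
                continue
            path = row[0]
            hash_spec = row[1] if len(row) > 1 else ""
            if not hash_spec:
                # RECORD itself is listed without a hash.
                continue
            algo, _, expected = hash_spec.partition("=")
            try:
                data = zf.read(path)
            except KeyError:
                problems.append(f"{path}: listed in RECORD but missing")
                continue
            # RECORD stores the digest as unpadded URL-safe base64.
            digest = hashlib.new(algo, data).digest()
            actual = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
            if actual != expected:
                problems.append(f"{path}: hash mismatch")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a cached wheel before deciding to rebuild it.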
For the wheel unpack command that we run right after building a wheel to add extra metadata, it might be safe to assume that the wheel is valid, since fromager just built it. For packing we are still using the wheel pack command.
That makes sense, thanks!
This leaves the issue with extractall not setting file permissions like the executable bit. That's a simple fix.
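A minimal sketch of that kind of fix, using only the standard library: extract each member individually and then reapply the Unix mode bits stored in the archive entry, since zipfile does not restore them on its own. extract_with_permissions is a hypothetical helper name, and skipping directories and entries with no recorded mode is an assumption of this sketch rather than something taken from the wheel project.

```python
import os
import zipfile


def extract_with_permissions(archive_path, dest_dir):
    """Hypothetical helper: extract all members and restore stored Unix mode bits."""
    with zipfile.ZipFile(archive_path) as zf:
        for zinfo in zf.infolist():
            # extract() returns the path that was actually created on disk.
            extracted = zf.extract(zinfo, dest_dir)
            # The upper 16 bits of external_attr hold the Unix mode when the
            # archive was written on a Unix-like system; 0 means none recorded.
            mode = (zinfo.external_attr >> 16) & 0o777
            if mode and not zinfo.is_dir():
                os.chmod(extracted, mode)
```

Calling extract_with_permissions(wheel_path, unpack_dir) in place of ZipFile.extractall should keep scripts executable after unpacking, at least on POSIX systems.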
Ahh, I see. This would be the fix, right? https://github.com/pypa/wheel/blob/main/src/wheel/cli/unpack.py#L21
Opened #513.
Currently the cache wheel feature in bootstrap uses the wheel command to unpack the wheel and extract the build requirements. Using the wheel command requires spawning a child process, which is expensive compared to using the zipfile library. There is almost a 90% difference when running this script: