Closed PiotrCzapla closed 1 month ago
I've asked guys at repairwheel for advice if this approach is ok or not. https://github.com/jvolkman/repairwheel/issues/42
Windows build is complaining about wrong version of runtime libraries but I hope it is only an issue if we would redistribute the wheels.
fixed wheel written to C:\Users\RUNNER~1\AppData\Local\Temp\repairwheelhb83b115\pywhispercpp-1.2.0-cp38-cp38-win_amd64.whl
C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-72a_zsse\overlay\Lib\site-packages\delvewheel\_dll_utils.py:460: UserWarning: _pywhispercpp.cp38-win_amd64.pyd was built with a newer platform toolset (14.41) than the discovered msvcp140.dll (14.15). This may cause compatibility issues.
warnings.warn(f'{os.path.basename(lib_path)} was built with a newer platform toolset ({linker_version}) than the discovered {os.path.basename(dll_info[0])} ({vc_redist_linker_version}). This may cause compatibility issues.')
C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-72a_zsse\overlay\Lib\site-packages\delvewheel\_dll_utils.py:460: UserWarning: _pywhispercpp.cp38-win_amd64.pyd was built with a newer platform toolset (14.41) than the discovered vcruntime140_1.dll (14.28). This may cause compatibility issues.
warnings.warn(f'{os.path.basename(lib_path)} was built with a newer platform toolset ({linker_version}) than the discovered {os.path.basename(dll_info[0])} ({vc_redist_linker_version}). This may cause compatibility issues.')
C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-72a_zsse\overlay\Lib\site-packages\delvewheel\_dll_utils.py:460: UserWarning: whisper.dll was built with a newer platform toolset (14.41) than the discovered msvcp140.dll (14.15). This may cause compatibility issues.
warnings.warn(f'{os.path.basename(lib_path)} was built with a newer platform toolset ({linker_version}) than the discovered {os.path.basename(dll_info[0])} ({vc_redist_linker_version}). This may cause compatibility issues.')
C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-72a_zsse\overlay\Lib\site-packages\delvewheel\_dll_utils.py:460: UserWarning: whisper.dll was built with a newer platform toolset (14.41) than the discovered vcruntime140_1.dll (14.28). This may cause compatibility issues.
warnings.warn(f'{os.path.basename(lib_path)} was built with a newer platform toolset ({linker_version}) than the discovered {os.path.basename(dll_info[0])} ({vc_redist_linker_version}). This may cause compatibility issues.')
C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-72a_zsse\overlay\Lib\site-packages\delvewheel\_dll_utils.py:460: UserWarning: ggml.dll was built with a newer platform toolset (14.41) than the discovered msvcp140.dll (14.15). This may cause compatibility issues.
warnings.warn(f'{os.path.basename(lib_path)} was built with a newer platform toolset ({linker_version}) than the discovered {os.path.basename(dll_info[0])} ({vc_redist_linker_version}). This may cause compatibility issues.')
C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-72a_zsse\overlay\Lib\site-packages\delvewheel\_dll_utils.py:[460](https://github.com/abdeladim-s/pywhispercpp/actions/runs/11386200589/job/31677837180?pr=75#step:6:461): UserWarning: ggml.dll was built with a newer platform toolset (14.41) than the discovered vcruntime140_1.dll (14.28). This may cause compatibility issues.
warnings.warn(f'{os.path.basename(lib_path)} was built with a newer platform toolset ({linker_version}) than the discovered {os.path.basename(dll_info[0])} ({vc_redist_linker_version}). This may cause compatibility issues.')
Wrote C:\Users\runneradmin\AppData\Local\Temp\repaired_wheel_h7c34kmw\pywhispercpp-1.2.0-cp38-cp38-win_amd64.whl
Can you check if the test are working on windows?
Thanks @PiotrCzapla for the time you put into this. It took me a substantial amount of time as well to find a solution to this issue, but unfortunately I couldn't find any documentation how to do it properly, also as you noticed I use cbuildhweel to repair the wheel in CI but didn't work unfortunately. I decided at the end to load the libs manually as many Python packages are doing already.
I don't have Windows to give this a test but I will try to test on my Linux machine, even though I can see that CI failed on Linux.
But I have a question first, what's the issue with loading the libs manually during the import ? Did you find any issues with that ?
pip install .
should work because the libs are moved to the libs folder and are loaded during runtime, even if you delete the build folder it should still find the libs!
Hey @abdeladim-s! Thanks for running the build on cbuildwheel. It looks like there’s a little hiccup with repairwheel since it’s using the wrong platform value for auditwheel. We can fix this in cibuild by providing a custom command for this specific platform. Do you have a way to run the CI locally? I’d love to experiment with it.
This isn’t the first time I’ve run into rpath issues on Mac, and this time I decided to dig deeper and tackle it in a more standard way instead of just adding paths to DYLD_LIBRARY_PATH. It’s been quite a journey, but I hope this proper fix will help others facing similar problems down the line. While the approach might not necessarily be better, manually loading libraries can make the code fragile (like when whisper.cpp adds another library, such as coreml), and it can be confusing for those who aren’t library maintainers but understand how dynamic libraries work.
I genuinely hope that once we properly test this pr and resolve the issues, everything will run smoothly without needing to revisit it in the future.
To give you an idea of what it’s like trying to build the library for the first time:
pip install .
, I got a bunch of warnings about library collisions in with build temporary directory. After cleaning the build, it started crashing because it couldn’t load whisper due to the missing coreml library. :/python tests/test_model.py
, which was the only thing that worked predictibly.In the PR, I’m trying to make it either work or break with clear compilation failure like you have on CI in Linux, which should be easier to fix with a quick Google search.
After you decide on whether to repair the wheels, I’d like to add a couple of things:
pip install -e .
, using DYLD_LIBRARY_PATH or links to the build folder, so that changes to c++ would be quicker.It looks like the build is failing because the CI is attempting to create a wheel for the musllinux platform, like Alpine, while using Ubuntu, which only supports glibc builds, specifically manylinux platforms.
Have a look at this message: --plat has an invalid choice: 'musllinux_1_1_x86_64' from the environment variable 'AUDITWHEEL_PLAT'. The valid options are 'linux_x86_64', 'manylinux_2_5_x86_64', 'manylinux_2_12_x86_64', 'manylinux_2_17_x86_64', 'manylinux_2_24_x86_64', 'manylinux_2_27_x86_64', 'manylinux_2_28_x86_64', 'manylinux_2_31_x86_64', 'manylinux_2_34_x86_64', 'manylinux_2_35_x86_64', 'manylinux1_x86_64', 'manylinux2010_x86_64', and 'manylinux2014_x86_64'.
Auditwheel only supports musllinux when it is run on alpine linux have a look on the output of this command on python:3.11-alpine on mac M1 (aarch64): $ AUDITWHEEL_PLAT=trigger_error auditwheel repair something ... argparse.ArgumentError: argument --plat: invalid choice: 'trigger_error' from environment variable 'AUDITWHEEL_PLAT' (choose from 'linux_aarch64', 'musllinux_1_1_aarch64')
It appears that pywhisper didn't work on Alpine Linux before, even though it seemed like it did. Releaed version of pywhispercpp-1.2.0 doesn't support Alpine Linux as attempting to install it there leads to a build and then an error. The build fails because whisper.cpp source code are not attached.
Here's how it looks:
✗ docker run -it -v "$(pwd)":/project -w /project python:3.11-alpine /bin/sh
/project # apk add --no-cache gcc g++ cmake make ninja
/project # pip install pywhispercpp
Collecting pywhispercpp
Downloading pywhispercpp-1.2.0.tar.gz (234 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 234.3/234.3 kB 4.6 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting numpy (from pywhispercpp)
Downloading numpy-2.1.2-cp311-cp311-musllinux_1_2_aarch64.whl.metadata (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.0/62.0 kB 8.8 MB/s eta 0:00:00
Collecting pydub (from pywhispercpp)
Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting requests (from pywhispercpp)
Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting tqdm (from pywhispercpp)
Downloading tqdm-4.66.5-py3-none-any.whl.metadata (57 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.6/57.6 kB 18.8 MB/s eta 0:00:00
Collecting platformdirs (from pywhispercpp)
Downloading platformdirs-4.3.6-py3-none-any.whl.metadata (11 kB)
Collecting charset-normalizer<4,>=2 (from requests->pywhispercpp)
Downloading charset_normalizer-3.4.0-cp311-cp311-musllinux_1_2_aarch64.whl.metadata (34 kB)
Collecting idna<4,>=2.5 (from requests->pywhispercpp)
Downloading idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests->pywhispercpp)
Downloading urllib3-2.2.3-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests->pywhispercpp)
Downloading certifi-2024.8.30-py3-none-any.whl.metadata (2.2 kB)
Downloading numpy-2.1.2-cp311-cp311-musllinux_1_2_aarch64.whl (14.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.4/14.4 MB 58.7 MB/s eta 0:00:00
Downloading platformdirs-4.3.6-py3-none-any.whl (18 kB)
Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 kB 20.3 MB/s eta 0:00:00
Downloading tqdm-4.66.5-py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.4/78.4 kB 25.2 MB/s eta 0:00:00
Downloading certifi-2024.8.30-py3-none-any.whl (167 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.3/167.3 kB 44.8 MB/s eta 0:00:00
Downloading charset_normalizer-3.4.0-cp311-cp311-musllinux_1_2_aarch64.whl (139 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.1/139.1 kB 37.4 MB/s eta 0:00:00
Downloading idna-3.10-py3-none-any.whl (70 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 kB 22.0 MB/s eta 0:00:00
Downloading urllib3-2.2.3-py3-none-any.whl (126 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.3/126.3 kB 39.2 MB/s eta 0:00:00
Building wheels for collected packages: pywhispercpp
Building wheel for pywhispercpp (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for pywhispercpp (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [97 lines of output]
running bdist_wheel
it fails because it can't find ggml.h in the wheel source file.
Can we instruct the cibuild to skip musl platform all together for the time being? It will speed up CI. Building on alpine fails due to some other issues, but whisper.cpp compiles fine though.
I managed to build it on alpine, the issue was lack of build tools, or mess in the build directory from the previous builds (i'm sharing the same volume between dockers).
It builds on python:3.11-alpine, but fails on quay.io/pypa/musllinux_1_1_aarch64:2024 :(.
I’ve upgraded to the latest cibuild, and I hope that it will fix the issue since the newest cibuildwheel is now targeting musllinux 1_2 instead of the outdated 1_1. I also updated wheel.yml to use the latest version of the artifact commands, as the old ones will soon be phased out.
I've run it on my CI and found the same issue,but I had a closer look what docker images are being used and how to run them and I managed to isolate the issue.
Repairwheel ships with it's own version of auditwheel, that have hardcoded values for the AUDITWHEEL_PLAT variable, hence it fails on musllinux. I've reported that back to the repairwheel https://github.com/jvolkman/repairwheel/issues/43
I think the way forward is to run auditool directly on muslinux via cibuildwheel, and keep using repairwheel only during development when pip is being run without cibuildwheel.
It is working now, the cibuildwheel is taking care of repair step, for linux and macos. Wheelrepair is fixing dependencies on windows, and during regular build. I've learned the hard way that you can't repair wheel twice on linux as auditwheel repair changes the names of the libs.
I've found out how to get in to the failing build of cibuildwheel while it is running. The easy way is to add sleep to CIBW_BEFORE_BUILD, and then exec bash in to the container to inspect and run the build manually.
The test on linux are passing after cibuildwheel.
I haven't changed the gh actions/upload-artifact to version 4 as it fails complaining that we overwrite existing files during upload. So maybe it would make sense to add some code to change the version from 1.2 to a 1.3-
Let me know what you think. I will try this on windows later on.
@PiotrCzapla, I agree .. If we can make this work seamlessly across all platforms, it would be great. However, it's quite challenging to find the proper way to do it cross-platform.
So, first, I checked out your branch and tested the code on my Linux machine locally, but unfortunately, it doesn't work if I delete the build folder. It throws an ImportError: libwhisper.so.1: cannot open shared object file: No such file or directory
, which is the original issue.
I also tried checking the bdist_wheel
command, but it doesn't exist. Although the build process doesn’t fail, it's still odd. I attempted to update setuptools
with no success.
Regarding CI, take a look at the CI artifacts , Even though it finished without issues, I couldn't find any linux_x86_64 wheels. This wasn’t a problem when using the cbuildwheel action [check previous builds from GitHub Actions, for example]. You can also check the generated wheels. Taking manylinux as an example, I noticed the libs are now generated inside the pywhispercpp.libs folder with temporary names, I assume. Additionally, some extra shared libraries like libgoml are being included in the wheel (the same issue occurs with Windows wheels), which shouldn’t be the case. So, I doubt the Windows wheels will work either, but if you have access to a Windows machine, please give it a try. The macOS builds seem good, though.
I think we need to add a step to run the test suite after the build to ensure the builds are working properly.
And of course, once we finish this, feel free to submit any other PRs. Your contributions are always welcome :)
I've added tests to the workflow after the build, Please check out this branch.
the macos and linux builds seem great, though it still failing locally on my linux machine (My bad, I just noticed that the wheel for linux_x86_64 was actually generated, ) So maybe something wrong with my installation.
The only issue now is windows wheels! Take a look at the workflow results.
The Windows wheels were failing because of the manual loading mechanism. I deleted it and all tests are working now. Nevertheless, If you have access to a Windows machine please give it a try as well, If everything goes well I'll go ahead and merge this one.
Really brilliant contribution @PiotrCzapla :rocket:
Hi thank you for adding the tests and fixing the remaining issue, I have an app that uses pywhispercpp and should work on windows, I wasn't able to do so last week but I'm going to give it a try soon and report back. Thank you for providing the build files.
Btw. What would you say to bumping the version? If we include the commit check sum we will be able to change upload action to v4.
I think all builds are good .. I'll just go ahead and merge this one. I have one final question though, do you have any idea why the shared library names are good on macos wheels but not on Linux and windows ?
Btw. What would you say to bumping the version? If we include the commit check sum we will be able to change upload action to v4.
Are you talking about bumping the version of the upload Github action, right ? I don't think it has any value now since everything is working as expected, do you think it's worth it ?
@abdeladim-s thank you for merging. It is absolute hell using windows and python, but I managed to get it working on windows, after pip install . both test run fine, and I'm able to use my script too.
one remaining issue on windows is that it does not build from source when using pip install 'git+http ...' see: #78
I have one final question though, do you have any idea why the shared library names are good on macos wheels but not on Linux and windows ?
The values at the end of libs names looks like checksums, so I guess it is to ensure that a correct lib is being loaded, each time and not a lib from linux env, I wonder though why mac os does not have it though.
But I'm pretty sure it isn't an error as Torch has the same:
unzip -t torch-2.5.0-cp312-cp312-manylinux2014_aarch64.whl |grep -v -E '\.py|\.h|\.cmake|\.cpp\|.cuh' | grep -v '/ '
Archive: torch-2.5.0-cp312-cp312-manylinux2014_aarch64.whl
testing: torch.libs/libgfortran-daac5196.so.5.0.0 OK
testing: torch.libs/libopenblasp-r0-2e4c6a40.3.25.so OK
testing: torch.libs/libgomp-ee2ec4d2.so.1.0.0 OK
testing: torch.libs/libarm_compute_graph-742b7619.so OK
testing: torch.libs/libarm_compute-32c725fd.so OK
testing: torch-2.5.0.dist-info/METADATA OK
Are you talking about bumping the version of the upload Github action, right ? I don't think it has any value now since everything is working as expected, do you think it's worth it ?
I was referring to both: if we upgrade the version of pywhisercpp and include the commit ID, we can raise the update-artifacts to v4. This version prevents file overwriting during the build process, likely across different builds, so having different wheels names each build should solve the issue.
@abdeladim-s thank you for merging. It is absolute hell using windows and python, but I managed to get it working on windows, after pip install . both test run fine, and I'm able to use my script too. one remaining issue on windows is that it does not build from source when using pip install 'git+http ...' see: #78
Yes, windows is absolute hell, especially if you wanted to build a python C extension .. glad you made it through and it's working :) Thanks!
I have one final question though, do you have any idea why the shared library names are good on macos wheels but not on Linux and windows ?
The values at the end of libs names looks like checksums, so I guess it is to ensure that a correct lib is being loaded, each time and not a lib from linux env, I wonder though why mac os does not have it though.
But I'm pretty sure it isn't an error as Torch has the same:
unzip -t torch-2.5.0-cp312-cp312-manylinux2014_aarch64.whl |grep -v -E '\.py|\.h|\.cmake|\.cpp\|.cuh' | grep -v '/ ' Archive: torch-2.5.0-cp312-cp312-manylinux2014_aarch64.whl testing: torch.libs/libgfortran-daac5196.so.5.0.0 OK testing: torch.libs/libopenblasp-r0-2e4c6a40.3.25.so OK testing: torch.libs/libgomp-ee2ec4d2.so.1.0.0 OK testing: torch.libs/libarm_compute_graph-742b7619.so OK testing: torch.libs/libarm_compute-32c725fd.so OK testing: torch-2.5.0.dist-info/METADATA OK
Oh, so it's the checksum. That makes sense. Thanks for the clarification.
Are you talking about bumping the version of the upload Github action, right ? I don't think it has any value now since everything is working as expected, do you think it's worth it ?
I was referring to both: if we upgrade the version of pywhisercpp and include the commit ID, we can raise the update-artifacts to v4. This version prevents file overwriting during the build process, likely across different builds, so having different wheels names each build should solve the issue.
Yes, for pywhisercpp I was planning to bump a new version and push it to PYPI, especially the wheels are working now. I am just still trying to build wheels for other architectures like arm and aarch64, the build process takes so long and always fail with weird errors.
I'm still unsure if this is the best approach but I wanted
pip install .
to work, normally there is an option in cibuildwheel that let us run repairwheel, but this leaves the pip install broken.It is quite difficult to find information how to do it properly, so I'm still unsure that it is right solution as we are still ending up with libs in two places in the wheel, but at least they reference them selfs correctly so which ever is picked first the rest is loaded.
I've tested this only on my mac so I'm not sure that it is working correctly in windows and linux but it should as repairwheel is multiplatform.
After repair the libs are in pywhispercpp-1.2.0/.dylib and in pywhispercpp-1.2.0/pywhispercpp/.dylibs/.dylib, but they reference each other have a look at the output of otool -L
This pr is based on #74, as I'm assuming it won't be merged before it, as it needs test. Only the last commit matters. I've tested this on macos, I'm not sure if it will work on other platforms though. Could you check on yours?