gaby opened this issue 6 months ago
Hey @gaby thank you I'll add support for that soon, do you mind giving me a hand testing when the PR is ready?
Regarding the CPU wheels, I'm conflicted, because it really does blow up the power set of builds that have to be run for each release. My thought process for the wheels is to build something that works okay for most people, but if you want it to run quickly you should build from source.
My current position is that I'm willing to expand the number of builds if we also implement some optimizations each time to mitigate the combinatorial explosion.
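To make the combinatorial explosion concrete, the number of jobs is the product of every matrix axis. The axes below are hypothetical, not the project's actual workflow matrix:

```python
from itertools import product

# Hypothetical build matrix; the project's real axes and values may differ.
python_versions = ["3.8", "3.9", "3.10", "3.11", "3.12"]
platforms = ["linux/x86_64", "linux/arm64", "macosx/arm64", "windows/amd64"]
cpu_variants = ["basic", "avx", "avx2", "avx512"]

jobs = list(product(python_versions, platforms, cpu_variants))
print(len(jobs))  # 5 * 4 * 4 = 80 wheel builds per release
```

Adding a single new axis value multiplies, rather than adds to, the total job count, which is why each expansion is costly.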
Some thoughts I have for long term solutions
Yeah, I can definitely test the arm64/linux wheels on a Raspberry Pi.
I was using the wheels from https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels for the longest time, but yes, it does blow up the number of CI builds/jobs that get created. I do like your idea of having ggml check the CPU flags to determine which features to use.
@gaby sorry, I was just re-reviewing this. So currently, the wheels that end in `_arm64.whl` don't work inside of Docker on macOS, and we should replace them with the wheels built using the cibuildwheel process from your repo?
@abetlen If you install the package on macOS directly, the platform is darwin/arm64, which you already have wheels for. If you install the package on macOS with Docker, the platform inside the container is linux/arm64. This is due to macOS using QEMU when running containers.
The linux/arm64 platform would also benefit Raspberry Pi users, especially the Pi 4/Pi 5.
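The host/container distinction can be illustrated with a small helper. This is a sketch for explanation only; `docker_platform` is not a real API in llama-cpp-python or Docker:

```python
def docker_platform(system: str, machine: str) -> str:
    """Normalize (platform.system(), platform.machine()) to Docker's
    os/arch naming, e.g. 'darwin/arm64' on the host vs 'linux/arm64'
    inside a container running under QEMU on an Apple Silicon Mac."""
    machine = machine.lower()
    # Docker calls these arches 'arm64' and 'amd64'
    machine = {"aarch64": "arm64", "x86_64": "amd64"}.get(machine, machine)
    return f"{system.lower()}/{machine}"

# On the macOS host: Darwin/arm64 -> darwin/arm64
print(docker_platform("Darwin", "arm64"))    # -> darwin/arm64
# Inside the container: Linux/aarch64 -> linux/arm64
print(docker_platform("Linux", "aarch64"))   # -> linux/arm64
```

Because pip inside the container sees a linux/arm64 platform, the darwin/arm64 wheels never match, and it falls back to building from source.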
@gaby thank you, I've added your provided code to the release workflow for python versions 3.8-3.12, can you let me know if it works correctly?
@abetlen I don't see any arm64 wheels here: https://abetlen.github.io/llama-cpp-python/whl/cpu/llama-cpp-python/
Running pip install confirms it can't find wheels:
```
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [24 lines of output]
      *** scikit-build-core 0.9.2 using CMake 3.29.2 (wheel)
      *** Configuring CMake...
      loading initial cache file /tmp/tmpwa68rmyp/build/CMakeInit.txt
      -- The C compiler identification is unknown
      -- The CXX compiler identification is unknown
      CMake Error at CMakeLists.txt:3 (project):
        No CMAKE_C_COMPILER could be found.

        Tell CMake where to find the compiler by setting either the environment
        variable "CC" or the CMake cache entry CMAKE_C_COMPILER to the full path to
        the compiler, or to the compiler name if it is in the PATH.

      CMake Error at CMakeLists.txt:3 (project):
        No CMAKE_CXX_COMPILER could be found.

        Tell CMake where to find the compiler by setting either the environment
        variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
        to the compiler, or to the compiler name if it is in the PATH.

      -- Configuring incomplete, errors occurred!

      *** CMake configuration failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

[notice] A new release of pip is available: 23.0.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
```

```
root@eb7b81f37400:/# uname -m
aarch64
```
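The CMake errors above mean the container has no C/C++ toolchain, so the source-build fallback cannot succeed. Until a matching linux/arm64 wheel is published, one workaround is to install a compiler and CMake in the image first. This is an illustrative Dockerfile sketch, not an official image:

```dockerfile
# Workaround sketch: provide a toolchain so the source build succeeds
# while no linux/arm64 wheel is available. Base image choice is arbitrary.
FROM python:3.11-slim

RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential cmake \
    && rm -rf /var/lib/apt/lists/*

RUN pip install llama-cpp-python
```

Note that building from source under QEMU emulation on a macOS host is slow; a published linux/arm64 wheel remains the better fix.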
I think it's related to this line in the CI https://github.com/abetlen/llama-cpp-python/blob/main/.github/workflows/build-and-release.yaml#L70
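For reference, a cibuildwheel-based job for linux/arm64 wheels on a GitHub-hosted x86 runner typically needs QEMU set up and `CIBW_ARCHS` pointed at `aarch64`. This is a hypothetical sketch, not the project's actual workflow file:

```yaml
# Hypothetical sketch; action versions and build selectors are illustrative.
build-wheels-linux-arm64:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
      with:
        submodules: recursive
    - uses: docker/setup-qemu-action@v3   # emulate aarch64 on the x86 runner
    - name: Build aarch64 wheels
      uses: pypa/cibuildwheel@v2.17
      env:
        CIBW_ARCHS: aarch64
        CIBW_BUILD: "cp38-* cp39-* cp310-* cp311-* cp312-*"
    - uses: actions/upload-artifact@v4
      with:
        path: wheelhouse/*.whl
```

The step that uploads artifacts only stages the wheels; a separate release step still has to attach them to the GitHub release, which matches the symptom described below.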
@gaby looks like they were built and uploaded as artifacts but not added to the release(?)
I'll take a look later but this is the last workflow run file if you can spot what I'm doing wrong.
@gaby @abetlen Fixed here: https://github.com/abetlen/llama-cpp-python/pull/1392/files
@Smartappli wow thank you so much!
Would be nice to have updated arm64 builds, as the last conda package has no support for many model types.
@abetlen Thank you for the new efforts to start publishing wheels for CUDA, etc.
I noticed that the Metal wheels only work on the darwin platform; when using Docker on macOS, the platform is linux/arm64, not darwin.
I have a repo where I was building arm64 wheels that could probably be integrated into your workflows: https://github.com/gaby/arm64-wheels
TL;DR: this would need to be expanded to support other Python versions/PyPy.
I also notice the CPU wheels don't have variants for AVX, AVX2, or AVX-512; are there plans to add support for those?
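One way runtime selection could work, instead of publishing a wheel per instruction set, is to inspect the CPU flags at install or load time and pick the most capable variant. This is a sketch assuming Linux's `/proc/cpuinfo` flag format; none of these helpers exist in llama-cpp-python:

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Extract the feature flags from /proc/cpuinfo text (Linux x86 format)."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

def best_variant(flags: set[str]) -> str:
    """Pick the most capable build variant the CPU supports."""
    # 'avx512f' is the foundation flag implied by all AVX-512 subsets
    for feature, variant in (("avx512f", "avx512"), ("avx2", "avx2"), ("avx", "avx")):
        if feature in flags:
            return variant
    return "basic"

# In real use this would read open("/proc/cpuinfo").read(); sample shown here.
sample = "processor : 0\nflags\t\t: fpu sse sse2 avx avx2\n"
print(best_variant(cpu_flags(sample)))  # -> avx2
```

This mirrors the "have ggml check the CPU flags" idea discussed above: one generic wheel plus runtime dispatch, rather than a separate build per AVX level.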