Status: Closed. ChristianWeyer closed this issue 1 month ago.
BTW: I installed with:
CMAKE_ARGS="-DCMAKE_OSX_ARCHITECTURES=arm64 -DCMAKE_APPLE_SILICON_PROCESSOR=arm64 -DLLAMA_METAL=on" pip install --upgrade --verbose --force-reinstall --no-cache-dir llama-cpp-python
Got it: the server needs the extra argument
--n_gpu_layers -1
to offload all layers to the GPU.
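For reference, the full launch command with the flag appended would look like this (a sketch based on the command above, assuming the empower_functions server forwards --n_gpu_layers to llama-cpp-python):

```shell
# -1 offloads all model layers to the GPU (Metal on Apple Silicon)
python -m empower_functions.server \
  --model ggml-model-f16.gguf \
  --chat_format empower-functions \
  --n_gpu_layers -1
```

With verbose logging enabled, llama.cpp should report Metal buffer allocations at startup when offloading is active.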
When running
python -m empower_functions.server --model ggml-model-f16.gguf --chat_format empower-functions
I see that the GPU is not used.
Do we need an extra argument to run llama-cpp-python with Metal?