Prerequisites

Please answer the following questions for yourself before submitting an issue.

[ ] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
I expect llama-cpp-python to run on Intel GPUs just as llama.cpp does.
Current Behavior
llama-cpp-python fails to run on Intel GPUs, while the llama.cpp SYCL backend runs normally.
Environment and Context
I tested SYCL support on an Intel Arc A770 GPU running Ubuntu 22.04, with oneAPI version 2024.0. I have verified that the llama.cpp SYCL backend works normally on this machine.
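For completeness, the exact wheel version under test can be recorded from Python; a minimal sketch, assuming the __version__ attribute that recent llama-cpp-python releases expose:

import llama_cpp

# Print the installed llama-cpp-python version for the report.
print(llama_cpp.__version__)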
Steps to Reproduce

Run the following test.py:

from llama_cpp import Llama

llm = Llama(
    model_path="~/llama.cpp/models/7B/ggml-model-q4_0-pure.gguf",
    n_gpu_layers=33,  # Offload all 33 layers to the GPU
    seed=1337,        # Set a specific seed for reproducibility
    # n_ctx=2048,     # Uncomment to increase the context window
)
output = llm(
    "Q: Name the planets in the solar system? A: ",  # Prompt
    max_tokens=32,       # Generate up to 32 tokens; set to None to generate up to the end of the context window
    stop=["Q:", "\n"],   # Stop generating just before the model would generate a new question
    echo=True,           # Echo the prompt back in the output
)  # Generate a completion; create_completion can also be called directly
print(output)
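One incidental note: the model_path above uses "~", which Python does not expand automatically, so the path may need expanding before being handed to Llama; a minimal sketch using only the standard library (no llama-cpp-python API involved):

import os

# Expand "~" to the user's home directory before constructing Llama.
model_path = os.path.expanduser("~/llama.cpp/models/7B/ggml-model-q4_0-pure.gguf")

This should not be related to the SYCL failure below, which appears to occur while the backend enumerates devices.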
Failure Logs
ggml_init_sycl: GGML_SYCL_DEBUG: 0
ggml_init_sycl: GGML_SYCL_F16: no
found 2 SYCL devices:
|ID| Name |compute capability|Max compute units|Max work group|Max sub group|Global mem size|
|--|---------------------------------------------|------------------|-----------------|--------------|-------------|---------------|
| 0| 13th Gen Intel(R) Core(TM) i9-13900K| 3.0| 32| 8192| 64| 67181625344|
| 1| Intel(R) FPGA Emulation Device| 1.2| 32| 67108864| 64| 67181625344|
DeviceList is empty. -30 (PI_ERROR_INVALID_VALUE)Exception caught at file:/tmp/pip-install-31terybs/llama-cpp-python_2e42ff812a094f19b998956fddc30615/vendor/llama.cpp/ggml-sycl.cpp, line:13341
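Note that the device table above lists only the CPU and the FPGA emulation device; the Arc A770 is not detected at all, which is presumably why the backend's GPU device list ends up empty. To double-check what the oneAPI runtime exposes to Python, here is a minimal sketch assuming Intel's dpctl package (pip install dpctl; not part of llama-cpp-python) is available:

import dpctl

# Enumerate every SYCL device the runtime can see and report its
# backend (level_zero, opencl, ...), type (cpu/gpu/accelerator), and name.
for dev in dpctl.get_devices():
    print(dev.backend, dev.device_type, dev.name)

If the A770 shows up here (typically via the level_zero backend) but not in llama-cpp-python's output, that would point at how the pip-installed wheel was built rather than at the driver stack.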