abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io

igpu #1709

Open ayttop opened 2 weeks ago

ayttop commented 2 weeks ago

from llama_cpp import Llama

llm = Llama(
    # raw string so the backslashes in the Windows path are not treated as escapes
    model_path=r"C:\Users\ArabTech\Desktop\4\phi-3.5-mini-instruct-q4_k_m.gguf",
    n_gpu_layers=-1,  # offload all layers to the GPU
    verbose=True,
)
output = llm(
    "Q: Who is Napoleon Bonaparte? A: ",
    max_tokens=1024,
    stop=["\n"],  # stop sequence to end generation at a newline
)
print(output)

I tried both n_gpu_layers=-1 and n_gpu_layers=32.

Neither works on an Intel iGPU.

How can I offload the model onto an Intel iGPU?

abetlen commented 2 weeks ago

@ayttop Maybe someone else knows better, but for integrated graphics, compiling with the Vulkan backend may be your only option, though it may not be faster than a CPU-only installation.
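For reference, the llama-cpp-python README documents enabling the Vulkan backend by passing CMake flags when the package is built from source. A minimal sketch, assuming the Vulkan SDK and a compiler toolchain are already installed:

    # Linux/macOS shell: rebuild llama-cpp-python against the Vulkan backend
    CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python --force-reinstall --no-cache-dir

    # Windows PowerShell equivalent
    $env:CMAKE_ARGS = "-DGGML_VULKAN=on"
    pip install llama-cpp-python --force-reinstall --no-cache-dir

After reinstalling, running the snippet above with verbose=True should mention a Vulkan device at model-load time if the iGPU is being picked up; if the load log still looks CPU-only, the wheel was likely reused from cache rather than rebuilt.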