intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.
Apache License 2.0

Cannot find dGPU when installing ollama on Windows #11340

YunLiu1 commented 5 months ago

When "pip install ipex-llm[cpp]", then "init-ollama.bat", it runs on CPU: " ... msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="31.6 GiB" ... "

But when "pip install ipex-llm[xpu]", it can run on my A770 dGPU.

When I install both with `pip install ipex-llm[cpp,xpu]`, I get this error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. bigdl-core-cpp 2.5.0b20240616 requires torch==2.2.0, but you have torch 2.1.0a0+cxx11.abi which is incompatible.
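
A common workaround for this kind of conflict (not from this thread; a minimal sketch assuming conda is available, with hypothetical environment names) is to keep the two extras in separate environments, since each pins a different torch build:

```cmd
:: ipex-llm[cpp] and ipex-llm[xpu] pin different torch builds (torch 2.2.0
:: vs. 2.1.0a0+cxx11.abi in the error above), so keep them in separate
:: environments rather than installing ipex-llm[cpp,xpu] into one.
conda create -n llm-cpp python=3.11 -y
conda activate llm-cpp
pip install --pre --upgrade ipex-llm[cpp]

conda create -n llm-xpu python=3.11 -y
conda activate llm-xpu
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

(The `--extra-index-url` above follows the ipex-llm GPU install document and may change between releases; check the current docs.)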

sgwhat commented 5 months ago

Hi @YunLiu1,

  1. `msg="inference compute" id=0 library=cpu` is a misleading runtime log, and it does not mean that ollama is running on the CPU. To confirm that it is running on the dGPU, you may follow the steps below (see also the sketch after this list):
     - Check the output from the ollama server. When running successfully on the dGPU, ollama will produce output similar to the sample output.
     - Check the memory usage of your dGPU while running model inference.
  2. To run ipex-llm ollama on your dGPU, you only need to install `ipex-llm[cpp]`. For more details, please see our ollama document.
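
For reference, the ipex-llm ollama quickstart starts the server on Windows roughly as below; this is a sketch, and the exact environment variables may differ between releases, so check the linked document:

```cmd
:: Sketch based on the ipex-llm ollama quickstart; variables may change
:: between releases. "llm-cpp" is a hypothetical environment name.
call conda activate llm-cpp
init-ollama.bat

:: Offload all model layers to the GPU and enable SYCL system management
:: so GPU devices can be queried.
set OLLAMA_NUM_GPU=999
set ZES_ENABLE_SYSMAN=1
set SYCL_CACHE_PERSISTENT=1

ollama serve
```

While a model is generating, the A770's dedicated-memory and compute usage in Task Manager should rise noticeably if offload is working.
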
samamiller commented 1 month ago

> To run ipex-llm ollama on your dGPU, you only need to install `ipex-llm[cpp]`

Is there documentation that explains these options? What does installing `ipex-llm[xpu]` do that `ipex-llm[cpp]` doesn't? Why do the examples in the quickstart folder all seem to use different options and versions, none of which seem to be compatible with each other?

sgwhat commented 1 month ago

> Is there documentation that explains these options? What does installing `ipex-llm[xpu]` do that `ipex-llm[cpp]` doesn't? Why do the examples in the quickstart folder all seem to use different options and versions, none of which seem to be compatible with each other?

Hi @samamiller, please see our official documentation for more details.

jason-dai commented 1 month ago

> To run ipex-llm ollama on your dGPU, you only need to install `ipex-llm[cpp]`

> Is there documentation that explains these options? What does installing `ipex-llm[xpu]` do that `ipex-llm[cpp]` doesn't? Why do the examples in the quickstart folder all seem to use different options and versions, none of which seem to be compatible with each other?

@samamiller It depends on your use case (e.g., llama.cpp vs. PyTorch vs. other LLM frameworks); see https://github.com/intel-analytics/ipex-llm#use
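
For example, a quick (hypothetical) way to check which stack a given environment carries, based on the package names visible in the error earlier in this thread; `intel-extension-for-pytorch` is assumed here to come with the `[xpu]` extra:

```cmd
:: bigdl-core-cpp is pulled in by ipex-llm[cpp] (the llama.cpp/ollama path):
pip show bigdl-core-cpp
:: intel-extension-for-pytorch is assumed to be pulled in by ipex-llm[xpu]
:: (the PyTorch path):
pip show intel-extension-for-pytorch
:: The torch build also differs between the two paths:
python -c "import torch; print(torch.__version__)"
```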