intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

XPU devices not found inside container for host kernel > 6.8 #11594

Open · rahulunair opened 1 month ago

rahulunair commented 1 month ago

Folks,

If you are not aware, there is an issue with the old compute runtime drivers (the ones shipped in the prebuilt ipex-llm containers) not seeing the XPU devices on host systems with kernel > 6.8: sycl-ls and torch.xpu.is_available() will not show the GPU devices. The fix is to update the compute runtime inside the container, or to add these two env vars to the container:

export NEOReadDebugKeys=1
export OverrideGpuAddressSpace=48
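
For reference, here is a minimal sketch of the env-var workaround applied at container start, plus a quick check that the GPU is visible again. The image tag is illustrative only; substitute the ipex-llm image you actually use:

docker run -it \
  --device /dev/dri \
  -e NEOReadDebugKeys=1 \
  -e OverrideGpuAddressSpace=48 \
  intelanalytics/ipex-llm-xpu:latest \
  bash

# inside the container; on older torch builds, torch.xpu needs
# intel_extension_for_pytorch imported first
sycl-ls
python -c "import torch, intel_extension_for_pytorch; print(torch.xpu.is_available())"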

I would recommend upgrading the compute runtime drivers inside the container. I spent a few hours trying to fix this issue on the latest Ubuntu 24.04 as well as an Arch system with a 6.9.3 kernel.
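
If you go the upgrade route, the rough procedure is to install newer compute-runtime packages inside the container. A sketch assuming an Ubuntu-based image; the release tag and .deb file names are placeholders, so grab the current asset names from the compute-runtime releases page (each release also lists the matching intel-igc-* graphics compiler .debs to install alongside):

# download the current release assets (placeholders, check the releases page)
wget https://github.com/intel/compute-runtime/releases/download/<release>/intel-opencl-icd_<version>_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/<release>/intel-level-zero-gpu_<version>_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/<release>/libigdgmm12_<version>_amd64.deb
dpkg -i *.deb
sycl-ls   # the GPU devices should show up again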

Here is the issue on the compute-runtime GitHub: https://github.com/intel/compute-runtime/issues/710#issuecomment-2002646557

liu-shaojun commented 1 month ago

Thank you for letting us know about this issue. We have currently verified against kernel versions 6.2 and 6.5. Upgrading the compute runtime to a newer version may cause issues on kernels 6.2 and 6.5, as well as with different driver versions. We will consider your suggestion when we move to kernel 6.8 in the future.