intel-analytics / ipex-llm


Kernel NULL pointer dereference in i915 driver #12435

Open luhuaei opened 4 days ago

luhuaei commented 4 days ago

A kernel NULL pointer dereference has been observed in the i915 driver, causing a kernel oops. Details are as follows:

Error message:

Nov 24 09:15:40 jammy kernel: BUG: kernel NULL pointer dereference, address: 00000000000000c8
Nov 24 09:15:40 jammy kernel: #PF: supervisor read access in kernel mode
Nov 24 09:15:40 jammy kernel: #PF: error_code(0x0000) - not-present page
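For anyone triaging similar reports, the three log lines above can be decoded mechanically. The following is a minimal sketch, not part of the original report: the helper names `decode_pf_error_code` and `summarize_oops` are hypothetical, and the bit meanings assumed are the standard low three bits of the x86 page-fault error code (bit 0: present, bit 1: write, bit 2: user mode). The small fault address (0xc8) is consistent with reading a field at that offset through a NULL base pointer, though only the attached call stack can confirm where that happens.

```python
import re

# Log lines copied from the report above.
LOG = """\
Nov 24 09:15:40 jammy kernel: BUG: kernel NULL pointer dereference, address: 00000000000000c8
Nov 24 09:15:40 jammy kernel: #PF: supervisor read access in kernel mode
Nov 24 09:15:40 jammy kernel: #PF: error_code(0x0000) - not-present page
"""


def decode_pf_error_code(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "not-present page",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "supervisor",
    }


def summarize_oops(log: str) -> dict:
    """Extract the fault address and error code from kernel oops lines."""
    addr = re.search(r"address: ([0-9a-fA-F]+)", log)
    code = re.search(r"error_code\((0x[0-9a-fA-F]+)\)", log)
    if not addr or not code:
        raise ValueError("no page-fault lines found in log")
    summary = decode_pf_error_code(int(code.group(1), 16))
    summary["fault_address"] = int(addr.group(1), 16)
    return summary


summary = summarize_oops(LOG)
print(summary)
```

The decoded fields agree with the kernel's own text ("supervisor read access", "not-present page"), which is a quick sanity check that the pasted lines were not truncated.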

call stack: 2024-11-23-2024-11-25.txt

cpu info:

vendor_id       : GenuineIntel
cpu family      : 6
model           : 140
model name      : 11th Gen Intel(R) Core(TM) i5-1155G7 @ 2.50GHz
stepping        : 2
microcode       : 0x38
cpu MHz         : 400.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 27
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l2 invpcid_single cdp_l2 ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves split_lock_detect dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect md_clear ibt flush_l1d arch_capabilities
vmx flags       : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple pml ept_mode_based_exec tsc_scaling
bugs            : apic_c1e spectre_v1 spectre_v2 spec_store_bypass swapgs eibrs_pbrsb gds bhi
bogomips        : 4992.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

uname -a:

Linux jammy 6.5.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May  7 09:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

os release:

PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

dpkg --list: dpkg-list.txt

free -h

               total        used        free      shared  buff/cache   available
Mem:            15Gi       810Mi       6.3Gi       1.0Mi       8.3Gi        14Gi
Swap:          4.0Gi       0.0Ki       4.0Gi

conda list

_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
accelerate                0.23.0                   pypi_0    pypi
aiofiles                  23.2.1                   pypi_0    pypi
aiohappyeyeballs          2.4.3                    pypi_0    pypi
aiohttp                   3.11.7                   pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
annotated-types           0.7.0                    pypi_0    pypi
anyio                     4.6.2.post1              pypi_0    pypi
arpeggio                  2.0.2                    pypi_0    pypi
attrs                     24.2.0                   pypi_0    pypi
bigdl-core-xe-21          2.1.0b2                  pypi_0    pypi
bigdl-core-xe-addons-21   2.1.0b2                  pypi_0    pypi
bigdl-core-xe-batch-21    2.1.0b2                  pypi_0    pypi
bzip2                     1.0.8                h4bc722e_7    conda-forge
ca-certificates           2024.8.30            hbcca054_0    conda-forge
caliper-reader            0.4.1                    pypi_0    pypi
certifi                   2024.8.30                pypi_0    pypi
charset-normalizer        3.4.0                    pypi_0    pypi
click                     8.1.7                    pypi_0    pypi
cloudpickle               3.1.0                    pypi_0    pypi
cmake                     3.31.1                   pypi_0    pypi
contourpy                 1.3.1                    pypi_0    pypi
cycler                    0.12.1                   pypi_0    pypi
datasets                  3.1.0                    pypi_0    pypi
dill                      0.3.8                    pypi_0    pypi
diskcache                 5.6.3                    pypi_0    pypi
distro                    1.9.0                    pypi_0    pypi
fastapi                   0.112.4                  pypi_0    pypi
ffmpy                     0.4.0                    pypi_0    pypi
filelock                  3.16.1                   pypi_0    pypi
fonttools                 4.55.0                   pypi_0    pypi
frozenlist                1.5.0                    pypi_0    pypi
fsspec                    2024.9.0                 pypi_0    pypi
gradio                    4.43.0                   pypi_0    pypi
gradio-client             1.3.0                    pypi_0    pypi
h11                       0.14.0                   pypi_0    pypi
httpcore                  1.0.7                    pypi_0    pypi
httptools                 0.6.4                    pypi_0    pypi
httpx                     0.27.2                   pypi_0    pypi
huggingface-hub           0.26.2                   pypi_0    pypi
idna                      3.10                     pypi_0    pypi
importlib-resources       6.4.5                    pypi_0    pypi
intel-cmplr-lib-ur        2025.0.2                 pypi_0    pypi
intel-extension-for-pytorch 2.1.10+xpu               pypi_0    pypi
intel-openmp              2025.0.2                 pypi_0    pypi
interegular               0.3.3                    pypi_0    pypi
ipex-llm                  2.1.0b2                  pypi_0    pypi
jinja2                    3.1.4                    pypi_0    pypi
jiter                     0.7.1                    pypi_0    pypi
jsonschema                4.23.0                   pypi_0    pypi
jsonschema-specifications 2024.10.1                pypi_0    pypi
kiwisolver                1.4.7                    pypi_0    pypi
lark                      1.2.2                    pypi_0    pypi
ld_impl_linux-64          2.43                 h712a8e2_2    conda-forge
libexpat                  2.6.4                h5888daf_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc                    14.2.0               h77fa898_1    conda-forge
libgcc-ng                 14.2.0               h69a702a_1    conda-forge
libgomp                   14.2.0               h77fa898_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libsqlite                 3.47.0               hadc24fc_1    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libzlib                   1.3.1                hb9d3cd8_2    conda-forge
llnl-hatchet              2024.1.3                 pypi_0    pypi
llvmlite                  0.43.0                   pypi_0    pypi
lm-format-enforcer        0.10.3                   pypi_0    pypi
markdown-it-py            3.0.0                    pypi_0    pypi
markupsafe                2.1.5                    pypi_0    pypi
matplotlib                3.9.2                    pypi_0    pypi
mdurl                     0.1.2                    pypi_0    pypi
mpi4py                    4.0.1                    pypi_0    pypi
mpmath                    1.3.0                    pypi_0    pypi
msgpack                   1.1.0                    pypi_0    pypi
multidict                 6.1.0                    pypi_0    pypi
multiprocess              0.70.16                  pypi_0    pypi
ncurses                   6.5                  he02047a_1    conda-forge
nest-asyncio              1.6.0                    pypi_0    pypi
networkx                  3.4.2                    pypi_0    pypi
ninja                     1.11.1.2                 pypi_0    pypi
numba                     0.60.0                   pypi_0    pypi
numpy                     1.26.4                   pypi_0    pypi
oneccl-bind-pt            2.1.300+xpu              pypi_0    pypi
openai                    1.55.0                   pypi_0    pypi
openssl                   3.4.0                hb9d3cd8_0    conda-forge
orjson                    3.10.12                  pypi_0    pypi
outlines                  0.0.46                   pypi_0    pypi
packaging                 24.2                     pypi_0    pypi
pandas                    2.2.3                    pypi_0    pypi
pillow                    10.4.0                   pypi_0    pypi
pip                       24.3.1             pyh8b19718_0    conda-forge
prometheus-client         0.21.0                   pypi_0    pypi
prometheus-fastapi-instrumentator 7.0.0                    pypi_0    pypi
propcache                 0.2.0                    pypi_0    pypi
protobuf                  5.29.0rc3                pypi_0    pypi
psutil                    6.1.0                    pypi_0    pypi
py-cpuinfo                9.0.0                    pypi_0    pypi
pyairports                2.1.1                    pypi_0    pypi
pyarrow                   18.0.0                   pypi_0    pypi
pycountry                 24.6.1                   pypi_0    pypi
pydantic                  2.10.1                   pypi_0    pypi
pydantic-core             2.27.1                   pypi_0    pypi
pydot                     3.0.2                    pypi_0    pypi
pydub                     0.25.1                   pypi_0    pypi
pygments                  2.18.0                   pypi_0    pypi
pyparsing                 3.2.0                    pypi_0    pypi
python                    3.11.10         hc5c86c4_3_cpython    conda-forge
python-dateutil           2.9.0.post0              pypi_0    pypi
python-dotenv             1.0.1                    pypi_0    pypi
python-multipart          0.0.17                   pypi_0    pypi
pytz                      2024.2                   pypi_0    pypi
pyyaml                    6.0.2                    pypi_0    pypi
pyzmq                     26.2.0                   pypi_0    pypi
ray                       2.39.0                   pypi_0    pypi
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1                   pypi_0    pypi
regex                     2024.11.6                pypi_0    pypi
requests                  2.32.3                   pypi_0    pypi
rich                      13.9.4                   pypi_0    pypi
rpds-py                   0.21.0                   pypi_0    pypi
ruff                      0.8.0                    pypi_0    pypi
safetensors               0.4.6.dev0               pypi_0    pypi
semantic-version          2.10.0                   pypi_0    pypi
sentencepiece             0.2.0                    pypi_0    pypi
setuptools                69.5.1                   pypi_0    pypi
shellingham               1.5.4                    pypi_0    pypi
six                       1.16.0                   pypi_0    pypi
sniffio                   1.3.1                    pypi_0    pypi
starlette                 0.38.6                   pypi_0    pypi
sympy                     1.13.3                   pypi_0    pypi
tabulate                  0.9.0                    pypi_0    pypi
tcmlib                    1.2.0                    pypi_0    pypi
textx                     4.1.0                    pypi_0    pypi
tiktoken                  0.8.0                    pypi_0    pypi
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tokenizers                0.20.3                   pypi_0    pypi
tomlkit                   0.12.0                   pypi_0    pypi
torch                     2.1.0a0+cxx11.abi          pypi_0    pypi
torchvision               0.16.0a0+cxx11.abi          pypi_0    pypi
tqdm                      4.67.0                   pypi_0    pypi
transformers              4.46.3                   pypi_0    pypi
triton-xpu                3.0.0b2                  pypi_0    pypi
typer                     0.13.1                   pypi_0    pypi
typing-extensions         4.12.2                   pypi_0    pypi
tzdata                    2024.2                   pypi_0    pypi
umf                       0.9.1                    pypi_0    pypi
urllib3                   2.2.3                    pypi_0    pypi
uvicorn                   0.32.1                   pypi_0    pypi
uvloop                    0.21.0                   pypi_0    pypi
vllm                      0.5.4+xpu                pypi_0    pypi
watchfiles                0.24.0                   pypi_0    pypi
websockets                12.0                     pypi_0    pypi
wheel                     0.45.1             pyhd8ed1ab_0    conda-forge
xxhash                    3.5.0                    pypi_0    pypi
xz                        5.2.6                h166bdaf_0    conda-forge
yarl                      1.18.0                   pypi_0    pypi

oneAPI env:

source /opt/intel/oneapi/setvars.sh
export SYCL_CACHE_PERSISTENT=1

:: initializing oneAPI environment ...
   -bash: BASH_VERSION = 5.1.16(1)-release
   args: Using "$@" for setvars.sh arguments: 
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: oneAPI environment initialized ::

qiuxin2012 commented 4 days ago

Could you give me the output of clinfo | grep "Device Name"? See https://github.com/intel-analytics/ipex-llm/blob/0e23bd779f043145710f46b400555a3beff07a04/docs/mddocs/Quickstart/install_linux_gpu.md#5-configure-permmision-and-verify-gpu-driver-setup for details.

luhuaei commented 1 day ago

clinfo | grep "Device Name"

  Device Name                                     Intel(R) Iris(R) Xe Graphics
    Device Name                                   Intel(R) Iris(R) Xe Graphics
    Device Name                                   Intel(R) Iris(R) Xe Graphics
    Device Name                                   Intel(R) Iris(R) Xe Graphics
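
The four lines above all name the same device. As a rough illustration of how such output could be collapsed into a list of distinct devices, here is a small sketch. The function name `device_names` is hypothetical, and it assumes only that each relevant line starts with the label "Device Name" followed by whitespace and the name, as in the output pasted here.

```python
def device_names(clinfo_text: str) -> list[str]:
    """Return distinct device names from `clinfo` output, in first-seen order."""
    label = "Device Name"
    names: list[str] = []
    for line in clinfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(label):
            name = stripped[len(label):].strip()
            if name and name not in names:
                names.append(name)
    return names


# Output copied from the comment above.
SAMPLE = """\
  Device Name                                     Intel(R) Iris(R) Xe Graphics
    Device Name                                   Intel(R) Iris(R) Xe Graphics
    Device Name                                   Intel(R) Iris(R) Xe Graphics
    Device Name                                   Intel(R) Iris(R) Xe Graphics
"""

print(device_names(SAMPLE))
```

An empty result would suggest that the OpenCL runtime does not see the GPU at all, which is the case the linked verification step is meant to catch.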

device info

Number of devices                                 1
  Device Name                                     Intel(R) Iris(R) Xe Graphics
  Device Vendor                                   Intel(R) Corporation
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 3.0 NEO 
  Device UUID                                     8680499a-0300-0000-0002-000000000000
  Driver UUID                                     32342e33-392e-3331-3239-340000000000
  Valid Device LUID                               No
  Device LUID                                     d02a-45d8fe7f0000
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  24.39.31294
  Device OpenCL C Version                         OpenCL C 1.2 
  Device OpenCL C all versions                    OpenCL C                                                         0x400000 (1.0.0)