intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

Error running llama.cpp with IPEX-LLM on MTL iGPU following quickstart guide (Native API returns: -30 (PI_ERROR_INVALID_VALUE)) #11278

Open OvaltineSamuel opened 2 months ago

OvaltineSamuel commented 2 months ago

Error Description

I am encountering the error, Native API returns: -30 (PI_ERROR_INVALID_VALUE), when trying to run llama.cpp with the latest IPEX-LLM, following the official quickstart guide on the IPEX-LLM website: https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/llama_cpp_quickstart.html

Error Output Log

Log start
main: build = 1 (1e71e4c)
main: built with IntelLLVM 2024.0.2 for
main: seed  = 1717836796
llama_model_loader: loaded meta data with 24 key-value pairs and 291 tensors from mistral-7b-instruct-v0.2.Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.2
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv   8:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  10:                       llama.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  11:                          general.file_type u32              = 15
llama_model_loader: - kv  12:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  13:                      tokenizer.ggml.tokens arr[str,32000]   = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv  14:                      tokenizer.ggml.scores arr[f32,32000]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  15:                  tokenizer.ggml.token_type arr[i32,32000]   = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv  16:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  17:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  18:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  19:            tokenizer.ggml.padding_token_id u32              = 0
llama_model_loader: - kv  20:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  21:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  22:                    tokenizer.chat_template str              = {{ bos_token }}{% for message in mess...
llama_model_loader: - kv  23:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   65 tensors
llama_model_loader: - type q4_K:  193 tensors
llama_model_loader: - type q6_K:   33 tensors
llm_load_vocab: special tokens definition check successful ( 259/32000 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = SPM
llm_load_print_meta: n_vocab          = 32000
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: n_ctx_train      = 32768
llm_load_print_meta: n_embd           = 4096
llm_load_print_meta: n_head           = 32
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_layer          = 32
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_embd_head_k    = 128
llm_load_print_meta: n_embd_head_v    = 128
llm_load_print_meta: n_gqa            = 4
llm_load_print_meta: n_embd_k_gqa     = 1024
llm_load_print_meta: n_embd_v_gqa     = 1024
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 14336
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 1
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 1000000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx  = 32768
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: model type       = 7B
llm_load_print_meta: model ftype      = Q4_K - Medium
llm_load_print_meta: model params     = 7.24 B
llm_load_print_meta: model size       = 4.07 GiB (4.83 BPW)
llm_load_print_meta: general.name     = mistralai_mistral-7b-instruct-v0.2
llm_load_print_meta: BOS token        = 1 '<s>'
llm_load_print_meta: EOS token        = 2 '</s>'
llm_load_print_meta: UNK token        = 0 '<unk>'
llm_load_print_meta: PAD token        = 0 '<unk>'
llm_load_print_meta: LF token         = 13 '<0x0A>'
[SYCL] call ggml_init_sycl
ggml_init_sycl: GGML_SYCL_DEBUG: 0
ggml_init_sycl: GGML_SYCL_F16: no
Native API failed. Native API returns: -30 (PI_ERROR_INVALID_VALUE) -30 (PI_ERROR_INVALID_VALUE)
Exception caught at file:C:/Users/Administrator/actions-runner/cpp-release/_work/llm.cpp/llm.cpp/llama-cpp-bigdl/ggml-sycl.cpp, line:13106, func:operator()
ggml_backend_sycl_set_mul_device_mode: true
Native API failed. Native API returns: -30 (PI_ERROR_INVALID_VALUE) -30 (PI_ERROR_INVALID_VALUE)
Exception caught at file:C:/Users/Administrator/actions-runner/cpp-release/_work/llm.cpp/llm.cpp/llama-cpp-bigdl/ggml-sycl.cpp, line:3289

Steps to Reproduce the Error

  1. Follow the quickstart guide: create and activate the environment, then install the ipex-llm[cpp] package.

    conda create -n llm-cpp python=3.11
    conda activate llm-cpp
    pip install --pre --upgrade ipex-llm[cpp]
  2. Set up a folder and run init-llama-cpp.bat with administrator privileges in Miniforge Prompt.

    mkdir llama-cpp
    cd llama-cpp
    init-llama-cpp.bat
  3. Runtime Configuration for Windows.

    set SYCL_CACHE_PERSISTENT=1
  4. Run the command below for LLM inference with llama.cpp.

    main -m mistral-7b-instruct-v0.1.Q4_K_M.gguf -n 32 --prompt "Once upon a time, there existed a little girl who liked to have adventures. She wanted to go to places and meet new people, and have fun" -t 8 -e -ngl 33 --color
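The steps above can be scripted so that the runtime variable and the model path are checked before launching. This is a minimal sketch, not part of ipex-llm: the helper names `build_llama_command`, `launch_env` and `model_exists` are hypothetical, and it assumes `main` is the executable that init-llama-cpp.bat places in the working folder.

```python
import os
from pathlib import Path


def build_llama_command(model_path, prompt, n_predict=32, threads=8, gpu_layers=33):
    """Assemble the argument list for the `main` command from step 4."""
    return [
        "main",
        "-m", str(model_path),
        "-n", str(n_predict),
        "--prompt", prompt,
        "-t", str(threads),
        "-e",
        "-ngl", str(gpu_layers),
        "--color",
    ]


def launch_env(base_env=None):
    """Copy the environment and add the runtime setting from step 3."""
    env = dict(os.environ if base_env is None else base_env)
    env["SYCL_CACHE_PERSISTENT"] = "1"
    return env


def model_exists(model_path):
    """Return True if the GGUF file is actually present on disk."""
    return Path(model_path).is_file()


if __name__ == "__main__":
    model = "mistral-7b-instruct-v0.1.Q4_K_M.gguf"
    cmd = build_llama_command(model, "Once upon a time")
    print("command:", cmd)
    print("model file found:", model_exists(model))
    # To actually launch: subprocess.run(cmd, env=launch_env(), check=False)
```

Checking the model path first helps rule out a simple filename mismatch before looking at driver or device problems.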

Environment Information

Collecting environment information...
PyTorch version: 2.2.0+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Pro
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:27:10) [MSC v.1938 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2280
DeviceID=CPU0
Family=1
L2CacheSize=18432
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=3800
Name=Intel(R) Core(TM) Ultra 7 155H
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.2.0
[conda] mkl                       2024.0.0                 pypi_0    pypi
[conda] mkl-dpcpp                 2024.0.0                 pypi_0    pypi
[conda] numpy                     1.26.4                   pypi_0    pypi
[conda] onemkl-sycl-blas          2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-datafitting   2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-dft           2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-lapack        2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-rng           2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-sparse        2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-stats         2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-vm            2024.0.0                 pypi_0    pypi
[conda] torch                     2.2.0                    pypi_0    pypi

rnwang04 commented 2 months ago

Hi @OvaltineSamuel, a normal output log on MTL looks like:

llm_load_print_meta: BOS token        = 151643 '<|endoftext|>'
llm_load_print_meta: EOS token        = 151645 '<|im_end|>'
llm_load_print_meta: PAD token        = 151643 '<|endoftext|>'
llm_load_print_meta: LF token         = 148848 'ÄĬ'
llm_load_print_meta: EOT token        = 151645 '<|im_end|>'
[SYCL] call ggml_init_sycl
ggml_init_sycl: GGML_SYCL_DEBUG: 0
ggml_init_sycl: GGML_SYCL_F16: no
found 4 SYCL devices:
|  |                   |                                       |       |Max    |        |Max  |Global |                     |
|  |                   |                                       |       |compute|Max work|sub  |mem    |                     |
|ID|        Device Type|                                   Name|Version|units  |group   |group|size   |       Driver version|
|--|-------------------|---------------------------------------|-------|-------|--------|-----|-------|---------------------|
| 0| [level_zero:gpu:0]|                     Intel Arc Graphics|    1.3|    112|    1024|   32| 15482M|            1.3.29283|
| 1|     [opencl:gpu:0]|                     Intel Arc Graphics|    3.0|    112|    1024|   32| 15482M|        31.0.101.5534|
| 2|     [opencl:cpu:0]|                Intel Core Ultra 5 125H|    3.0|     18|    8192|   64| 33945M|2023.16.12.0.12_195853.xmain-hotfix|
| 3|     [opencl:acc:0]|            Intel FPGA Emulation Device|    1.2|     18|67108864|   64| 33945M|2023.16.12.0.12_195853.xmain-hotfix|
ggml_backend_sycl_set_mul_device_mode: true
detect 1 SYCL GPUs: [0] with top Max compute units:112
llm_load_tensors: ggml ctx size =    0.37 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU

It seems your program can't find any SYCL device, which then raises this error. Could you please let us know your iGPU driver version?
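For anyone triaging a similar log, the difference between the two outputs can be checked mechanically: the healthy log prints a "found N SYCL devices" line, while the failing one goes straight to "Native API failed". The following is a rough sketch based only on the log lines quoted in this thread; `summarize_sycl_log` and `looks_like_missing_device` are hypothetical helper names, not part of llama.cpp or ipex-llm.

```python
import re

# Patterns taken from the log lines quoted in this thread.
DEVICE_COUNT = re.compile(r"found (\d+) SYCL devices")
NATIVE_ERROR = re.compile(r"Native API returns: (-?\d+) \((\w+)\)")


def summarize_sycl_log(log_text):
    """Return the SYCL device count (or None) and any native API error names."""
    count_match = DEVICE_COUNT.search(log_text)
    errors = sorted({m.group(2) for m in NATIVE_ERROR.finditer(log_text)})
    return {
        "device_count": int(count_match.group(1)) if count_match else None,
        "errors": errors,
    }


def looks_like_missing_device(summary):
    """True when no device table was printed and a native API error was logged."""
    return summary["device_count"] is None and bool(summary["errors"])


if __name__ == "__main__":
    failing = (
        "ggml_init_sycl: GGML_SYCL_F16: no\n"
        "Native API failed. Native API returns: -30 (PI_ERROR_INVALID_VALUE) "
        "-30 (PI_ERROR_INVALID_VALUE)\n"
    )
    summary = summarize_sycl_log(failing)
    print(summary, looks_like_missing_device(summary))
```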

OvaltineSamuel commented 2 months ago

Yes, I'm using the latest Driver for MTL iGPU

image

jason-dai commented 2 months ago

Please run the ENV-Check script in https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/scripts

rnwang04 commented 2 months ago

Hi @OvaltineSamuel I have verified that on our local MTL, this driver version works: image

Could you please provide more environment details using the scripts at https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/scripts, and also show us the output of ls-sycl-device?

OvaltineSamuel commented 2 months ago

So do I need to install OneAPI basetoolkit for using ipex-llm in this case?

rnwang04 commented 2 months ago

> So do I need to install OneAPI basetoolkit for using ipex-llm in this case?

You don't need to install the oneAPI Base Toolkit yourself. When you run pip install --pre --upgrade ipex-llm[cpp], oneAPI 2024.0 is installed into your llm-cpp env, so please make sure you always run your program in the llm-cpp env.
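One way to confirm that a shell is really inside the intended conda env, and that the runtime packages arrived with the pip install, is a small check like the one below. It is only a sketch: the package names are copied from the pip list posted later in this thread, `CONDA_DEFAULT_ENV` is the variable conda sets on activation, and the function names are hypothetical.

```python
import os
from importlib import metadata

# Package names copied from the `pip list` output posted in this thread.
EXPECTED_PACKAGES = ["ipex-llm", "bigdl-core-cpp", "dpcpp-cpp-rt", "intel-opencl-rt"]


def active_conda_env(environ=None):
    """Return the active conda environment name, or None outside conda."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("conda env:", active_conda_env())
    for name, version in installed_versions(EXPECTED_PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```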

JJJohnathan commented 2 months ago

I encounter a similar problem when I follow the guide at https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/install_linux_gpu.html and run sycl-ls:

image

Running ls-sycl-device gives "ls-sycl-device: command not found". The output of the ENV-Check script (bash env-check.sh) is as follows:

PYTHON_VERSION=3.11.9

transformers=4.36.2

torch=2.1.0a0+cxx11.abi

ipex-llm Version: 2.1.0b20240611

IPEX is not installed.

CPU Information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 22
On-line CPU(s) list: 0-21
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) Ultra 9 185H
CPU family: 6
Model: 170
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 4
CPU max MHz: 5000.0000
CPU min MHz: 400.0000
BogoMIPS: 6144.00

Total CPU Memory: 30.8876 GB
Memory Type: LPDDR5

Operating System: Ubuntu 22.04.4 LTS \n \l


Linux odt-huyuan-penvino-ci-77 6.7.1-060701-generic #202401201133 SMP PREEMPT_DYNAMIC Sat Jan 20 11:43:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

CLI:
    Version: 1.2.35.20240425
    Build ID: 00000000

Service:
    Version: 1.2.35.20240425
    Build ID: 00000000
    Level Zero Version: 1

Driver Version 2023.16.12.0.12_195853.xmain-hotfix
Driver Version 2023.16.12.0.12_195853.xmain-hotfix
Driver UUID 32342e30-392e-3238-3731-372e31320000
Driver Version 24.09.28717.12

Driver related package version:
ii intel-fw-gpu 2024.17.5-329~22.04 all Firmware package for Intel integrated and discrete GPUs
ii intel-i915-dkms 1.24.2.17.240301.20+i29-1 all Out of tree i915 driver.
ii intel-level-zero-gpu 1.3.28717.12 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.

SYCL Exception encountered: Native API failed. Native API returns: -30 (PI_ERROR_INVALID_VALUE) -30 (PI_ERROR_INVALID_VALUE)

igpu not detected

xpu-smi is properly installed.

+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Arc(TM) Graphics                                               |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0200-0000-00087d558086                                       |
|           | PCI BDF Address: 0000:00:02.0                                                        |
|           | DRM Device: /dev/dri/card0                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
GPU0 Memory size=16M

00:02.0 VGA compatible controller: Intel Corporation Device 7d55 (rev 08) (prog-if 00 [VGA controller])
        DeviceName: VGA
        Subsystem: ASUSTeK Computer Inc. Device 1a63
        Flags: bus master, fast devsel, latency 0, IRQ 180
        Memory at 5010000000 (64-bit, prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities:
        Kernel driver in use: i915
        Kernel modules: i915

rnwang04 commented 2 months ago

Hi @JJJohnathan, we have not verified kernel 6.7.1. Below is a sample output from our Linux MTL machine:

```bash
(base) arda@xiaoxin04-ubuntu:~/ruonan/ipex-llm/python/llm/scripts$ bash env-check.sh
-----------------------------------------------------------------
PYTHON_VERSION=3.10.14
-----------------------------------------------------------------
transformers=4.36.2
-----------------------------------------------------------------
torch=2.1.0a0+cxx11.abi
-----------------------------------------------------------------
ipex-llm Version: 2.1.0b20240611
-----------------------------------------------------------------
ipex=2.1.10+xpu
-----------------------------------------------------------------
CPU Information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 18
On-line CPU(s) list: 0-17
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) Ultra 5 125H
CPU family: 6
Model: 170
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 1
Stepping: 4
CPU max MHz: 4500.0000
CPU min MHz: 400.0000
BogoMIPS: 5990.40
-----------------------------------------------------------------
Total CPU Memory: 30.9502 GB
-----------------------------------------------------------------
Operating System: Ubuntu 22.04.3 LTS \n \l
-----------------------------------------------------------------
Linux xiaoxin04-ubuntu 6.5.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May 7 09:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
-----------------------------------------------------------------
CLI:
    Version: 1.2.22.20231126
    Build ID: 00000000

Service:
    Version: 1.2.22.20231126
    Build ID: 00000000
    Level Zero Version: 1.14.0
-----------------------------------------------------------------
Driver Version 2023.16.12.0.12_195853.xmain-hotfix
Driver Version 2023.16.12.0.12_195853.xmain-hotfix
Driver Version 2024.17.3.0.08_160000
Driver UUID 32342e30-392e-3238-3731-372e31320000
Driver Version 24.09.28717.12
Driver Version 2024.17.3.0.08_160000
-----------------------------------------------------------------
Driver related package version:
ii intel-fw-gpu 2023.39.2-255~22.04 all Firmware package for Intel integrated and discrete GPUs
ii intel-level-zero-gpu 1.3.28717.12 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii level-zero-dev 1.14.0-744~22.04 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
-----------------------------------------------------------------
igpu detected
[opencl:gpu:3] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO [24.09.28717.12]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.28717]
-----------------------------------------------------------------
xpu-smi is properly installed.
-----------------------------------------------------------------
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Arc(TM) Graphics                                               |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0200-0000-00087d558086                                       |
|           | PCI BDF Address: 0000:00:02.0                                                        |
|           | DRM Device: /dev/dri/card0                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
GPU0 Memory size=256M
-----------------------------------------------------------------
00:02.0 VGA compatible controller: Intel Corporation Device 7d55 (rev 08) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 3cc9
        Flags: bus master, fast devsel, latency 0, IRQ 184, IOMMU group 0
        Memory at 408c000000 (64-bit, prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities:
        Kernel driver in use: i915
        Kernel modules: i915
-----------------------------------------------------------------
(base) arda@xiaoxin04-ubuntu:~/ruonan/ipex-llm/python/llm/scripts$ sycl-ls
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2023.16.12.0.12_195853.xmain-hotfix]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2023.16.12.0.12_195853.xmain-hotfix]
[opencl:cpu:2] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:3] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO [24.09.28717.12]
[opencl:acc:4] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2024.17.3.0.08_160000]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.28717]
```

Maybe you can try with kernel 6.5 following https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/install_linux_gpu.html#for-linux-kernel-6-5
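To compare a machine against the kernel series mentioned here, the release string from `uname -r` can be parsed into a (major, minor) pair. The sketch below is illustrative only: `kernel_series` and `is_verified_kernel` are hypothetical names, and the default "verified" list contains just 6.5 because that is the only series named in this reply. The install guide remains the authoritative source.

```python
import platform
import re


def kernel_series(release):
    """Extract (major, minor) from a release string such as '6.5.0-35-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_verified_kernel(release, verified=((6, 5),)):
    """Check the release against an assumed list of verified kernel series."""
    return kernel_series(release) in verified


if __name__ == "__main__":
    if platform.system() == "Linux":
        release = platform.release()
        print(release, "verified series:", is_verified_kernel(release))
    else:
        print("This check only applies to Linux hosts.")
```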

JJJohnathan commented 2 months ago

@rnwang04 I see. Anyway thx!

OvaltineSamuel commented 2 months ago

> Please run the ENV-Check script in https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/scripts

Below is the output of env-check script

Python 3.11.9
-----------------------------------------------------------------
transformers=4.41.2
-----------------------------------------------------------------
torch=2.2.0+cpu
-----------------------------------------------------------------
Name: ipex-llm
Version: 2.1.0b20240607
Summary: Large Language Model Develop Toolkit
Home-page: https://github.com/intel-analytics/ipex-llm
Author: BigDL Authors
Author-email: bigdl-user-group@googlegroups.com
License: Apache License, Version 2.0
Location: C:\Users\Ovalt\miniforge3\envs\ipex-llm_ollama_v2\Lib\site-packages
Requires:
Required-by:
-----------------------------------------------------------------
IPEX is not installed properly.
-----------------------------------------------------------------
Total Memory: 15.376 GB

Chip 0 Memory: 16 GB | Speed: 5600 MHz
-----------------------------------------------------------------
CPU Manufacturer: GenuineIntel
CPU MaxClockSpeed: 3800
CPU Name: Intel(R) Core(TM) Ultra 7 155H
CPU NumberOfCores: 16
CPU NumberOfLogicalProcessors: 22
-----------------------------------------------------------------
GPU 0: Intel(R) Graphics         Driver Version:  31.0.101.5522
-----------------------------------------------------------------
-----------------------------------------------------------------
System Information

Host Name:                 EXPERTBOOK-B5
OS Name:                   Microsoft Windows 11 Pro
OS Version:                10.0.22631 N/A Build 22631
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Standalone Workstation
OS Build Type:             Multiprocessor Free
Registered Organization:   N/A
Product ID:                00355-61488-69042-AAOEM
Original Install Date:     4/30/2024, 6:54:59 AM
System Boot Time:          6/8/2024, 4:19:15 PM
System Manufacturer:       ASUSTeK COMPUTER INC.
System Model:              ASUS EXPERTBOOK B5404CMA_B5404CMA
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 170 Stepping 4 GenuineIntel ~2280 Mhz
BIOS Version:              ASUSTeK COMPUTER INC. (Licensed by AMI, LLC.) B5404CMA.208, 2/7/2024
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume1
System Locale:             en-us;English (United States)
Input Locale:              en-us;English (United States)
Time Zone:                 (UTC+08:00) Taipei
Total Physical Memory:     15,745 MB
Available Physical Memory: 6,476 MB
Virtual Memory: Max Size:  42,369 MB
Virtual Memory: Available: 29,067 MB
Virtual Memory: In Use:    13,302 MB
Page File Location(s):     C:\pagefile.sys
Domain:                    WORKGROUP
Logon Server:              \\EXPERTBOOK-B5
Hotfix(s):                 5 Hotfix(s) Installed.
                           [01]: KB5037591
                           [02]: KB5027397
                           [03]: KB5036212
                           [04]: KB5037853
                           [05]: KB5037959
Network Card(s):           3 NIC(s) Installed.
                           [01]: Intel(R) Wi-Fi 6E AX211 160MHz
                                 Connection Name: Wi-Fi
                                 DHCP Enabled:    Yes
                                 DHCP Server:     1.1.1.1
                                 IP address(es)
                                 [01]: 10.174.192.173
                                 [02]: fe80::bbb6:b406:183a:7a63
                           [02]: Intel(R) Ethernet Connection (18) I219-V
                                 Connection Name: Ethernet
                                 Status:          Media disconnected
                           [03]: Bluetooth Device (Personal Area Network)
                                 Connection Name: Bluetooth Network Connection
                                 Status:          Media disconnected
Hyper-V Requirements:      A hypervisor has been detected. Features required for Hyper-V will not be displayed.
-----------------------------------------------------------------
'xpu-smi' is not recognized as an internal or external command,
operable program or batch file.
xpu-smi is not installed properly.

It says IPEX is not installed properly. Also, I can't run ls-sycl-device without having OneAPI basetoolkit installed.

OvaltineSamuel commented 2 months ago

I tried restarting the laptop and reinstalling the whole conda environment. Running the same env-check.bat gives the same output: "IPEX is not installed properly".

rnwang04 commented 2 months ago

Here is a sample output from our Windows MTL machine:

```bash
(ruonan-cpp) D:\ruonan\ipex-llm\python\llm\scripts>env-check.bat
Python 3.11.9
-----------------------------------------------------------------
transformers=4.41.1
-----------------------------------------------------------------
torch=2.2.0+cpu
-----------------------------------------------------------------
Name: ipex-llm
Version: 2.1.0b20240610
Summary: Large Language Model Develop Toolkit
Home-page: https://github.com/intel-analytics/ipex-llm
Author: BigDL Authors
Author-email: bigdl-user-group@googlegroups.com
License: Apache License, Version 2.0
Location: C:\Users\arda\miniforge3\envs\ruonan-cpp\Lib\site-packages
Requires:
Required-by:
-----------------------------------------------------------------
IPEX is not installed properly.
-----------------------------------------------------------------
Total Memory: 31.615 GB

Chip 0 Memory: 4 GB | Speed: 7467 MHz
Chip 1 Memory: 4 GB | Speed: 7467 MHz
Chip 2 Memory: 4 GB | Speed: 7467 MHz
Chip 3 Memory: 4 GB | Speed: 7467 MHz
Chip 4 Memory: 4 GB | Speed: 7467 MHz
Chip 5 Memory: 4 GB | Speed: 7467 MHz
Chip 6 Memory: 4 GB | Speed: 7467 MHz
Chip 7 Memory: 4 GB | Speed: 7467 MHz
-----------------------------------------------------------------
CPU Manufacturer: GenuineIntel
CPU MaxClockSpeed: 3600
CPU Name: Intel(R) Core(TM) Ultra 5 125H
CPU NumberOfCores: 14
CPU NumberOfLogicalProcessors: 18
-----------------------------------------------------------------
GPU 0: Intel(R) Arc(TM) Graphics         Driver Version:  31.0.101.5534
-----------------------------------------------------------------
-----------------------------------------------------------------
System Information

Host Name:                 XIAOXIN02
OS Name:                   Microsoft Windows 11 Home (Chinese edition)
OS Version:                10.0.22631 N/A Build 22631
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Standalone Workstation
OS Build Type:             Multiprocessor Free
Registered Owner:          arda
Registered Organization:   N/A
Product ID:                00342-31548-11544-AAOEM
Original Install Date:     2023/12/25, 13:28:06
System Boot Time:          2024/6/12, 14:22:58
System Manufacturer:       LENOVO
System Model:              83D4
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 170 Stepping 4 GenuineIntel ~1200 Mhz
BIOS Version:              LENOVO MECN40WW, 2023/10/25
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume1
System Locale:             zh-cn;Chinese (China)
Input Locale:              en-us;English (United States)
Time Zone:                 (UTC+08:00) Beijing, Chongqing, Hong Kong SAR, Urumqi
Total Physical Memory:     32,373 MB
Available Physical Memory: 24,125 MB
Virtual Memory: Max Size:  43,939 MB
Virtual Memory: Available: 34,775 MB
Virtual Memory: In Use:    9,164 MB
Page File Location(s):     C:\pagefile.sys
Domain:                    WORKGROUP
Logon Server:              \\XIAOXIN02
Hotfix(s):                 6 Hotfix(s) Installed.
                           [01]: KB5037591
                           [02]: KB5027397
                           [03]: KB5031274
                           [04]: KB5033055
                           [05]: KB5037771
                           [06]: KB5037663
Network Card(s):           3 NIC(s) Installed.
                           [01]: Intel(R) Wi-Fi 6E AX211 160MHz
                                 Connection Name: WLAN
                                 Status:          Media disconnected
                           [02]: Bluetooth Device (Personal Area Network)
                                 Connection Name: Bluetooth Network Connection
                                 Status:          Media disconnected
                           [03]: ASIX USB to Gigabit Ethernet Family Adapter
                                 Connection Name: Ethernet 3
                                 DHCP Enabled:    Yes
                                 DHCP Server:     10.239.27.228
                                 IP address(es)
                                 [01]: 10.239.158.142
                                 [02]: fe80::4481:7641:add8:cc3
Hyper-V Requirements:      VM Monitor Mode Extensions: Yes
                           Virtualization Enabled In Firmware: Yes
                           Second Level Address Translation: Yes
                           Data Execution Prevention Available: Yes
-----------------------------------------------------------------
'xpu-smi' is not recognized as an internal or external command,
operable program or batch file.
xpu-smi is not installed properly.
```

You can ignore "IPEX is not installed properly", as IPEX is not needed for running ipex-llm[cpp].

Also, ls-sycl-device doesn't need the oneAPI toolkit; it is provided by init-llama-cpp.bat. You can just go to the directory where you ran init-llama-cpp.bat and run ls-sycl-device.exe.
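If the tool prints nothing at all, capturing its exit code and stderr separately can show whether it exited silently or failed before producing output. The following is a minimal sketch under stated assumptions: `run_and_capture` is a hypothetical helper, and the default tool name assumes the script is started from the folder that contains ls-sycl-device.exe.

```python
import subprocess
import sys


def run_and_capture(command, cwd=None, timeout=60):
    """Run a command and return its exit code, stdout and stderr as text."""
    result = subprocess.run(
        command,
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return {
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }


if __name__ == "__main__":
    # Assumption: the tool name below is the one mentioned in this thread;
    # pass a different path as the first argument if needed.
    tool = sys.argv[1] if len(sys.argv) > 1 else "ls-sycl-device.exe"
    try:
        report = run_and_capture([tool])
    except (OSError, subprocess.TimeoutExpired) as exc:
        print(f"could not run {tool}: {exc}")
    else:
        print("exit code:", report["returncode"])
        print("stdout:", report["stdout"] or "<empty>")
        print("stderr:", report["stderr"] or "<empty>")
```

A non-zero exit code with empty stdout would suggest the executable is failing at startup rather than simply finding no devices.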

OvaltineSamuel commented 2 months ago

@rnwang04 Thanks for your clarification on that. However, I'm not getting anything when running ls-sycl-device.exe on my end.

image

rnwang04 commented 2 months ago

Hi @OvaltineSamuel, on our Windows MTL machine, the output of ls-sycl-device.exe looks like:

(ruonan-cpp) D:\ruonan\bmk-llama-cpp>ls-sycl-device
found 4 SYCL devices:
|  |                   |                                       |       |Max    |        |Max  |Global |                     |
|  |                   |                                       |       |compute|Max work|sub  |mem    |                     |
|ID|        Device Type|                                   Name|Version|units  |group   |group|size   |       Driver version|
|--|-------------------|---------------------------------------|-------|-------|--------|-----|-------|---------------------|
| 0| [level_zero:gpu:0]|                     Intel Arc Graphics|    1.3|    112|    1024|   32| 15482M|            1.3.29283|
| 1|     [opencl:gpu:0]|                     Intel Arc Graphics|    3.0|    112|    1024|   32| 15482M|        31.0.101.5534|
| 2|     [opencl:cpu:0]|                Intel Core Ultra 5 125H|    3.0|     18|    8192|   64| 33945M|2023.16.12.0.12_195853.xmain-hotfix|
| 3|     [opencl:acc:0]|            Intel FPGA Emulation Device|    1.2|     18|67108864|   64| 33945M|2023.16.12.0.12_195853.xmain-hotfix|
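When comparing outputs across machines, it can help to turn this table into structured records. The parser below is a sketch written against the table layout shown in this thread (nine pipe-separated cells per data row); `parse_device_table` and `level_zero_gpus` are hypothetical names, and the filter is only an illustration, not necessarily how llama.cpp selects its device.

```python
SAMPLE_TABLE = """\
|ID|        Device Type|                   Name|Version|units  |group   |group|size   |       Driver version|
|--|-------------------|-----------------------|-------|-------|--------|-----|-------|---------------------|
| 0| [level_zero:gpu:0]|     Intel Arc Graphics|    1.3|    112|    1024|   32| 15482M|            1.3.29283|
| 2|     [opencl:cpu:0]|Intel Core Ultra 5 125H|    3.0|     18|    8192|   64| 33945M|2023.16.12.0.12_195853.xmain-hotfix|
"""


def parse_device_table(text):
    """Parse data rows of the ls-sycl-device table into dictionaries."""
    devices = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Data rows have 9 cells and start with a numeric device ID;
        # header and separator rows are skipped.
        if len(cells) != 9 or not cells[0].isdigit():
            continue
        devices.append({
            "id": int(cells[0]),
            "device_type": cells[1].strip("[]"),
            "name": cells[2],
            "version": cells[3],
            "max_compute_units": int(cells[4]),
            "global_mem": cells[7],
            "driver_version": cells[8],
        })
    return devices


def level_zero_gpus(devices):
    """Return only the devices exposed through the Level Zero GPU backend."""
    return [d for d in devices if d["device_type"].startswith("level_zero:gpu")]


if __name__ == "__main__":
    for device in level_zero_gpus(parse_device_table(SAMPLE_TABLE)):
        print(device)
```

An empty result from `parse_device_table` corresponds to the situation reported in this issue, where no device table is printed at all.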

And my pip list is:

```bash
(ruonan-cpp) D:\ruonan\bmk-llama-cpp>pip list
Package                 Version
----------------------- --------------
accelerate              0.21.0
bigdl-core-cpp          2.5.0b20240610
certifi                 2024.2.2
charset-normalizer      3.3.2
colorama                0.4.6
dpcpp-cpp-rt            2024.0.2
filelock                3.14.0
fsspec                  2024.5.0
gguf                    0.6.0
huggingface-hub         0.23.1
idna                    3.7
intel-cmplr-lib-rt      2024.0.2
intel-cmplr-lic-rt      2024.0.2
intel-opencl-rt         2024.0.2
intel-openmp            2024.0.2
ipex-llm                2.1.0b20240610
Jinja2                  3.1.4
MarkupSafe              2.1.5
mkl                     2024.0.0
mkl-dpcpp               2024.0.0
mpmath                  1.3.0
networkx                3.3
numpy                   1.26.4
onednn                  2024.0.0
onemkl-sycl-blas        2024.0.0
onemkl-sycl-datafitting 2024.0.0
onemkl-sycl-dft         2024.0.0
onemkl-sycl-lapack      2024.0.0
onemkl-sycl-rng         2024.0.0
onemkl-sycl-sparse      2024.0.0
onemkl-sycl-stats       2024.0.0
onemkl-sycl-vm          2024.0.0
packaging               24.0
pip                     24.0
protobuf                4.25.3
psutil                  5.9.8
PyYAML                  6.0.1
regex                   2024.5.15
requests                2.32.2
safetensors             0.4.3
sentencepiece           0.1.99
setuptools              70.0.0
sympy                   1.12.1rc1
tbb                     2021.12.0
tokenizers              0.19.1
torch                   2.2.0
tqdm                    4.66.4
transformers            4.41.1
typing_extensions       4.12.0
urllib3                 2.2.1
wheel                   0.43.0
```

Actually, your issue is unrelated to llama.cpp; your machine simply can't find a SYCL device. Sadly, we have not met this issue before and can't reproduce it. Just a suggestion: maybe you can try installing the latest driver, 5534 (https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)?

OvaltineSamuel commented 2 months ago

Got it, will try it out later and let you know if it works. Thanks a lot.

OvaltineSamuel commented 2 months ago

I have now updated the driver to the latest version, 5590. However, I still get no output when running ls-sycl-device.exe in the conda env, from the llama-cpp folder, after running init-llama-cpp.bat. I have tried creating a new conda env and following the quickstart multiple times. Not sure what the problem is here.