Open ch1y0q opened 1 week ago
@ch1y0q what is your ldd ./build/bin/llama-cli output?
linux-vdso.so.1 (0x00007ffe891b3000)
libllama.so => /home/arda/qiyue/llama.cpp/build/src/libllama.so (0x0000754dafb74000)
libggml.so => /home/arda/qiyue/llama.cpp/build/ggml/src/libggml.so (0x0000754daf600000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x0000754daf200000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000754dafa79000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000754dafa59000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000754daee00000)
libsvml.so => /opt/intel/oneapi/compiler/2024.0/lib/libsvml.so (0x0000754dad600000)
libirng.so => /opt/intel/oneapi/compiler/2024.0/lib/libirng.so (0x0000754daf506000)
libimf.so => /opt/intel/oneapi/compiler/2024.0/lib/libimf.so (0x0000754dad000000)
libintlc.so.5 => /opt/intel/oneapi/compiler/2024.0/lib/libintlc.so.5 (0x0000754daf4a5000)
/lib64/ld-linux-x86-64.so.2 (0x0000754dafcf4000)
libOpenCL.so.1 => /opt/intel/oneapi/compiler/2024.0/opt/oclfpga/host/linux64/lib/libOpenCL.so.1 (0x0000754dacc00000)
libmkl_core.so.2 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so.2 (0x0000754da8a00000)
libmvec.so.1 => /lib/x86_64-linux-gnu/libmvec.so.1 (0x0000754daf103000)
libmkl_sycl_blas.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4 (0x0000754da3400000)
libmkl_intel_ilp64.so.2 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_ilp64.so.2 (0x0000754da2200000)
libmkl_tbb_thread.so.2 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_tbb_thread.so.2 (0x0000754da0400000)
libiomp5.so => /opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so (0x0000754d9fe00000)
libsycl.so.7 => /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7 (0x0000754d9fa00000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000754dafa50000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000754dafa4b000)
libtbb.so.12 => /opt/intel/oneapi/tbb/2021.11/env/../lib/intel64/gcc4.8/libtbb.so.12 (0x0000754d9f600000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x0000754dafa44000)
Remove the folder build and rebuild again.
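For reference, a minimal sketch of the clean rebuild, assuming oneAPI 2024.0 is installed under /opt/intel/oneapi (the prefix shown in the ldd output above):

# drop the stale build directory so CMake re-detects the SYCL toolchain
rm -rf build
# load the oneAPI compiler/MKL environment (assumed install location)
source /opt/intel/oneapi/setvars.sh
# configure with the SYCL backend and the Intel compilers, then build
cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release -j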
I rebuilt with
cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release -j -v
Same bug.
try with the recommended release: https://github.com/luoyu-intel/llama.cpp/blob/master/docs/backend/SYCL.md#recommended-release
git checkout fb76ec31a9914b7761c1727303ab30380fd4f05c
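A sketch of testing that pinned commit from a clean tree (same build flags as above; the commit hash is the one quoted here):

# check out the commit recommended in SYCL.md, then do a clean rebuild
git checkout fb76ec31a9914b7761c1727303ab30380fd4f05c
rm -rf build
cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release -j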
What happened?
I am using llama.cpp + SYCL to perform inference with Qwen2 MoE. The prediction output seems normal, but the following lines in the debug log indicate that the model is not offloaded to the GPU at all.
command:
ZES_ENABLE_SYSMAN=0 ./build/bin/llama-cli -m ./Qwen1.5-MoE-A2.7B-Chat.Q4_0.gguf -p "Can you tell me what is a CPU?" -n 400 -e -ngl 33 -s 0 -sm none -mg 0
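One hedged way to check whether layers actually land on the GPU, assuming the SYCL device-listing helper is built (named ls-sycl-device in older trees, llama-ls-sycl-device after the binary rename) and that the loader prints its usual offload lines:

# list the SYCL devices the build can actually see
./build/bin/llama-ls-sycl-device
# re-run the same command and keep only the offload-related log lines;
# a working SYCL build should report layers offloaded to the GPU, not 0
ZES_ENABLE_SYSMAN=0 ./build/bin/llama-cli -m ./Qwen1.5-MoE-A2.7B-Chat.Q4_0.gguf \
  -p "Can you tell me what is a CPU?" -n 400 -e -ngl 33 -s 0 -sm none -mg 0 \
  2>&1 | grep -i offload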
Name and Version
version: 3337 (a8db2a9c) built with Intel(R) oneAPI DPC++/C++ Compiler 2024.0.1 (2024.0.1.20231122) for x86_64-unknown-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output