abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Intel oneAPI (oneMKL) CMAKE_ARGS enabled? #1133

Open emulated24 opened 5 months ago

emulated24 commented 5 months ago

Expected Behavior

Passing the oneMKL flags via CMAKE_ARGS and installing llama-cpp-python with pip should finish successfully, as the flags are supported by llama.cpp: https://github.com/ggerganov/llama.cpp#intel-onemkl

Current Behavior

Passing the same flags via CMAKE_ARGS to the pip installation produces an error:

CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON" FORCE_CMAKE=1 \
  pip install llama-cpp-python

[20/22] : && /opt/intel/oneapi/compiler/2024.0/bin/icpx -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libllava.so -o vendor/llama.cpp/examples/llava/libllava.so vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o -Wl,-rpath,/tmp/tmp0dmd36dn/build/vendor/llama.cpp: vendor/llama.cpp/libllama.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so /opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so -lm -ldl && :
FAILED: vendor/llama.cpp/examples/llava/libllava.so
: && /opt/intel/oneapi/compiler/2024.0/bin/icpx -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libllava.so -o vendor/llama.cpp/examples/llava/libllava.so vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o -Wl,-rpath,/tmp/tmp0dmd36dn/build/vendor/llama.cpp: vendor/llama.cpp/libllama.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so /opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so -lm -ldl && :
vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o: file not recognized: file format not recognized
icpx: error: linker command failed with exit code 1 (use -v to see invocation)
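
For completeness, a sketch of how the full build log for this failure can be captured (--no-cache-dir is only there so the wheel is really rebuilt from source; the log file name is arbitrary):

CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON" FORCE_CMAKE=1 \
  pip install --no-cache-dir --verbose llama-cpp-python 2>&1 | tee llama-cpp-python-build.log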

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

$ lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               GenuineIntel
  Model name:            13th Gen Intel(R) Core(TM) i5-1340P
    CPU family:          6
    Model:               186
    Thread(s) per core:  1
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    BogoMIPS:            4377.60
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_
                         known_freq pni pclmulqdq vmx ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibr
                         s ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetb
                         v1 xsaves avx_vnni arat vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   384 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    48 MiB (12 instances)
  L3:                    16 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-11
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

$ uname -a

Linux ladex 6.7.2-1.el9.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 25 23:07:22 EST 2024 x86_64 x86_64 x86_64 GNU/Linux

$ python3 --version
Python 3.11.7

$ make --version
GNU Make 4.3

$ g++ --version
g++ (GCC) 12.2.1 20221121 (Red Hat 12.2.1-7)

$ cmake --version
cmake version 3.20.2

$ icx --version
Intel(R) oneAPI DPC++/C++ Compiler 2024.0.2 (2024.0.2.20231213)

Building llama.cpp directly with oneAPI works fine and performs 2x better than with BLIS and ~2.8x better than a clean (non-customized) build via "pip install llama-cpp-python".

Commands used to build llama.cpp directly with oneAPI:

source /opt/intel/oneapi/setvars.sh
cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON
cmake --build . --config Release
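
For reference, a rough sketch of the kind of comparison behind those numbers, using the llama-bench tool that the llama.cpp build above produces (the model path and thread count below are placeholders, not the exact values from my runs):

./bin/llama-bench -m /path/to/model.gguf -t 12
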
abetlen commented 5 months ago

Can you try building llama.cpp as a shared library with:

source /opt/intel/oneapi/setvars.sh
cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON -DBUILD_SHARED_LIBS=ON
cmake --build . --config Release
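
As a possible workaround while the pip build is sorted out, llama-cpp-python should also be able to load a prebuilt library via the LLAMA_CPP_LIB environment variable (a sketch; treat the variable and the path below as something to verify, the path is just wherever your build puts libllama.so):

LLAMA_CPP_LIB=/path/to/llama.cpp/build/libllama.so python3 -c "from llama_cpp import Llama"
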
emulated24 commented 5 months ago

I did as described; it works fine and compilation completed without any errors.

Log of command: cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON -DBUILD_SHARED_LIBS=ON

-- The C compiler identification is IntelLLVM 2024.0.2
-- The CXX compiler identification is IntelLLVM 2024.0.2
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/intel/oneapi/compiler/2024.0/bin/icx - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/intel/oneapi/compiler/2024.0/bin/icpx - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.39.3")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so;/opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so;-lm;-ldl
-- BLAS found, Libraries: /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so;/opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so;-lm;-ldl
-- Found PkgConfig: /usr/bin/pkg-config (found version "1.7.3")
-- Checking for module 'mkl-sdl'
--   Found mkl-sdl, version 2024
-- BLAS found, Includes: /opt/intel/oneapi/mkl/2024.0/lib/pkgconfig/../../include
-- Warning: ccache not found - consider installing it or use LLAMA_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: <path>/lcpp_oneapi/llama.cpp/build

However, installing llama-cpp-python with the same flags fails again with this error:

vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o: file not recognized: file format not recognized
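
One more combination that might be worth trying is mirroring the direct build more closely: sourcing the oneAPI environment in the same shell as pip and passing -DBUILD_SHARED_LIBS=ON through CMAKE_ARGS as well (a sketch; whether this changes the outcome is an open question):

source /opt/intel/oneapi/setvars.sh
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON -DBUILD_SHARED_LIBS=ON" FORCE_CMAKE=1 \
  pip install --no-cache-dir --force-reinstall llama-cpp-python
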
Longhao-Chen commented 5 months ago

I also encountered the same problem.

sir3mat commented 4 months ago

I also encountered the same problem.

pascal-ain commented 4 months ago

Same here