abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Intel oneAPI (oneMKL) CMAKE_ARGS enabled? #1133

Open emulated24 opened 5 months ago

emulated24 commented 5 months ago

Expected Behavior

Passing the oneMKL flags via CMAKE_ARGS and installing llama-cpp-python with pip should finish successfully, as the flags are supported by llama.cpp: https://github.com/ggerganov/llama.cpp#intel-onemkl

Current Behavior

Passing the same flags via CMAKE_ARGS to the pip installation produces an error:

CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON" FORCE_CMAKE=1 \
  pip install llama-cpp-python

[20/22] : && /opt/intel/oneapi/compiler/2024.0/bin/icpx -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libllava.so -o vendor/llama.cpp/examples/llava/libllava.so vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o -Wl,-rpath,/tmp/tmp0dmd36dn/build/vendor/llama.cpp: vendor/llama.cpp/libllama.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so /opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so -lm -ldl && :
FAILED: vendor/llama.cpp/examples/llava/libllava.so
: && /opt/intel/oneapi/compiler/2024.0/bin/icpx -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libllava.so -o vendor/llama.cpp/examples/llava/libllava.so vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o -Wl,-rpath,/tmp/tmp0dmd36dn/build/vendor/llama.cpp: vendor/llama.cpp/libllama.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so /opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so -lm -ldl && :
vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o: file not recognized: file format not recognized
icpx: error: linker command failed with exit code 1 (use -v to see invocation)
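
For completeness, a sketch of how the full build log for this failure can be captured (--no-cache-dir is only there so the wheel is really rebuilt from source; the log file name is arbitrary):

CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON" FORCE_CMAKE=1 \
  pip install --no-cache-dir --verbose llama-cpp-python 2>&1 | tee llama-cpp-python-build.log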

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

$ lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               GenuineIntel
  Model name:            13th Gen Intel(R) Core(TM) i5-1340P
    CPU family:          6
    Model:               186
    Thread(s) per core:  1
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    BogoMIPS:            4377.60
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_
                         known_freq pni pclmulqdq vmx ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibr
                         s ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetb
                         v1 xsaves avx_vnni arat vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   384 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    48 MiB (12 instances)
  L3:                    16 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-11
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

$ uname -a

Linux ladex 6.7.2-1.el9.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 25 23:07:22 EST 2024 x86_64 x86_64 x86_64 GNU/Linux

$ python3 --version
Python 3.11.7

$ make --version
GNU Make 4.3

$ g++ --version
g++ (GCC) 12.2.1 20221121 (Red Hat 12.2.1-7)

$ cmake --version
cmake version 3.20.2

$ icx --version
Intel(R) oneAPI DPC++/C++ Compiler 2024.0.2 (2024.0.2.20231213)

Building llama.cpp directly with oneAPI works fine and performs 2x better than with BLIS and ~2.8x better than a clean (non-customized) build via "pip install llama-cpp-python".

Commands used to build llama.cpp directly with oneAPI:

source /opt/intel/oneapi/setvars.sh
cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON
cmake --build . --config Release
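
For reference, a rough sketch of the kind of comparison behind those numbers, using the llama-bench tool that the llama.cpp build above produces (the model path and thread count below are placeholders, not the exact values from my runs):

./bin/llama-bench -m /path/to/model.gguf -t 12
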
abetlen commented 5 months ago

Can you try building llama.cpp as a shared library with:

source /opt/intel/oneapi/setvars.sh
cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON -DBUILD_SHARED_LIBS=ON
cmake --build . --config Release
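
As a possible workaround while the pip build is sorted out, llama-cpp-python should also be able to load a prebuilt library via the LLAMA_CPP_LIB environment variable (a sketch; treat the variable and the path below as something to verify, the path is just wherever your build puts libllama.so):

LLAMA_CPP_LIB=/path/to/llama.cpp/build/libllama.so python3 -c "from llama_cpp import Llama"
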
emulated24 commented 5 months ago

I did as described; it works fine and compilation completed without any errors.

Log of command: cmake .. -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON -DBUILD_SHARED_LIBS=ON

-- The C compiler identification is IntelLLVM 2024.0.2
-- The CXX compiler identification is IntelLLVM 2024.0.2
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/intel/oneapi/compiler/2024.0/bin/icx - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/intel/oneapi/compiler/2024.0/bin/icpx - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.39.3")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so;/opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so;-lm;-ldl
-- BLAS found, Libraries: /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_lp64.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_thread.so;/opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so;/opt/intel/oneapi/compiler/2024.0/lib/libiomp5.so;-lm;-ldl
-- Found PkgConfig: /usr/bin/pkg-config (found version "1.7.3")
-- Checking for module 'mkl-sdl'
--   Found mkl-sdl, version 2024
-- BLAS found, Includes: /opt/intel/oneapi/mkl/2024.0/lib/pkgconfig/../../include
-- Warning: ccache not found - consider installing it or use LLAMA_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: <path>/lcpp_oneapi/llama.cpp/build

However, installing llama-cpp-python with the same flags fails again with this error:

vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o: file not recognized: file format not recognized
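
One more combination that might be worth trying is mirroring the direct build more closely: sourcing the oneAPI environment in the same shell as pip and passing -DBUILD_SHARED_LIBS=ON through CMAKE_ARGS as well (a sketch; whether this changes the outcome is an open question):

source /opt/intel/oneapi/setvars.sh
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_NATIVE=ON -DBUILD_SHARED_LIBS=ON" FORCE_CMAKE=1 \
  pip install --no-cache-dir --force-reinstall llama-cpp-python
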
Longhao-Chen commented 5 months ago

I also encountered the same problem.

sir3mat commented 4 months ago

I also encountered the same problem.

pascal-ain commented 4 months ago

Same here