ggerganov / llama.cpp

LLM inference in C/C++
MIT License

cmake: Raspberry Pi 3 (raspian) compile fails #1210

Closed: jtang613 closed this issue 1 year ago

jtang613 commented 1 year ago

```
uname -a
Linux raspberrypi 5.15.84-v7+ #1613 SMP Thu Jan 5 11:59:48 GMT 2023 armv7l GNU/Linux

lscpu
Architecture:                    armv7l
Byte Order:                      Little Endian
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
Vendor ID:                       ARM
Model:                           4
Model name:                      Cortex-A53
Stepping:                        r0p4
CPU max MHz:                     1400.0000
CPU min MHz:                     600.0000
BogoMIPS:                        38.40
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
```

```
make
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  armv7l
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread
I LDFLAGS:
I CC:       cc (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110
I CXX:      g++ (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110

cc -I. -O3 -DNDEBUG -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations -c ggml.c -o ggml.o
ggml.c: In function ‘quantize_row_q8_1’:
ggml.c:1535:36: warning: implicit declaration of function ‘vcvtnq_s32_f32’; did you mean ‘vcvtq_s32_f32’? [-Wimplicit-function-declaration]
ggml.c:1535:36: error: incompatible types when initializing type ‘int32x4_t’ using type ‘int’
ggml.c:1548:36: error: incompatible types when initializing type ‘int32x4_t’ using type ‘int’
ggml.c: In function ‘ggml_vec_dot_q4_0_q8_0’:
ggml.c:2756:34: warning: implicit declaration of function ‘vuzp1q_s8’; did you mean ‘vuzpq_s8’? [-Wimplicit-function-declaration]
ggml.c:2756:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2757:34: warning: implicit declaration of function ‘vuzp2q_s8’; did you mean ‘vuzpq_s8’? [-Wimplicit-function-declaration]
ggml.c:2757:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2758:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2759:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c: In function ‘ggml_vec_dot_q4_1_q8_1’:
ggml.c:2917:34: warning: implicit declaration of function ‘vzip1q_s8’; did you mean ‘vzip1_s8’? [-Wimplicit-function-declaration]
ggml.c:2917:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2918:34: warning: implicit declaration of function ‘vzip2q_s8’; did you mean ‘vzip2_s8’? [-Wimplicit-function-declaration]
ggml.c:2918:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2919:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2920:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c: In function ‘ggml_vec_dot_q4_2_q8_0’:
ggml.c:3057:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:3058:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:3059:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:3060:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c: In function ‘ggml_vec_dot_q4_3_q8_1’:
ggml.c:3205:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:3206:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c: In function ‘ggml_vec_dot_q5_0_q8_0’:
ggml.c:3343:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:3344:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c: In function ‘ggml_vec_dot_q5_1_q8_1’:
ggml.c:3474:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:3475:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
make: *** [Makefile:161: ggml.o] Error 1
```

Azeirah commented 1 year ago

Looks like GCC on 32-bit ARM is missing definitions for several NEON intrinsics (they are only provided for AArch64).

See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95399

I found that they can be polyfilled the way XNNPACK does it: https://github.com/google/XNNPACK/issues/1924#issuecomment-930139286
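For reference, the missing `vzip1q_s8`/`vzip2q_s8` intrinsics interleave the low and high halves of two 16-byte vectors. The XNNPACK-style polyfill wraps the 32-bit ARM `vzipq_s8` intrinsic (which returns both halves at once as an `int8x16x2_t`) and returns `.val[0]` or `.val[1]`. This scalar sketch models the same semantics in plain C so the behavior can be checked on any machine; the function names here are illustrative, not from llama.cpp:

```c
#include <stdint.h>

/* Scalar model of the AArch64-only vzip1q_s8: interleave the LOW halves
   of a and b. The NEON polyfill would be vzipq_s8(a, b).val[0]. */
static void zip1_s8(const int8_t a[16], const int8_t b[16], int8_t out[16]) {
    for (int i = 0; i < 8; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Scalar model of vzip2q_s8: interleave the HIGH halves of a and b.
   The NEON polyfill would be vzipq_s8(a, b).val[1]. */
static void zip2_s8(const int8_t a[16], const int8_t b[16], int8_t out[16]) {
    for (int i = 0; i < 8; i++) {
        out[2 * i]     = a[8 + i];
        out[2 * i + 1] = b[8 + i];
    }
}
```

The `vuzp1q_s8`/`vuzp2q_s8` (de-interleave) pair can be emulated the same way from the ARMv7 `vuzpq_s8`.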

As for you @jtang613: you can try compiling with clang instead; it should have these intrinsics defined.

prusnak commented 1 year ago

Can you try again after a `git pull` and post the output? There are new commits, https://github.com/ggerganov/llama.cpp/commit/e8c051611abfc9a7f37fd4bba48217180893bd68 and https://github.com/ggerganov/llama.cpp/commit/c3ca7a5f0546c561eb278be3f2fe335795679e01, which fix many of the issues you mentioned.

jtang613 commented 1 year ago

Thanks @prusnak, the latest merge builds using `make`.

fwiw, cmake still complains (see below). But as long as one method works, I'm happy.

```
~/llama.cpp/build# cmake ..
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: armv7l
-- ARM detected
-- Configuring done
-- Generating done
-- Build files have been written to: /root/llama.cpp/build

root@raspberrypi:~/llama.cpp/build# cmake --build . --config Release
Scanning dependencies of target ggml
[  3%] Building C object CMakeFiles/ggml.dir/ggml.c.o
/root/llama.cpp/ggml.c:191:10: fatal error: immintrin.h: No such file or directory
  191 | #include <immintrin.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
gmake[2]: *** [CMakeFiles/ggml.dir/build.make:82: CMakeFiles/ggml.dir/ggml.c.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:797: CMakeFiles/ggml.dir/all] Error 2
gmake: *** [Makefile:114: all] Error 2
```
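The fatal error happens because `immintrin.h` is an x86-only header; ggml reaches it when the ARM code paths are not selected. The usual fix is to guard SIMD includes behind architecture macros, sketched below (the `SIMD_HEADER` macro and `simd_header` helper are illustrative, not ggml's actual code):

```c
#include <stddef.h>

/* Pick the SIMD header that actually exists on the target architecture.
   Including immintrin.h unconditionally is what breaks on armv7l. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>          /* x86 SSE/AVX intrinsics */
#define SIMD_HEADER "immintrin.h"
#elif defined(__ARM_NEON)
#include <arm_neon.h>           /* 32- and 64-bit ARM NEON intrinsics */
#define SIMD_HEADER "arm_neon.h"
#else
#define SIMD_HEADER "(scalar fallback)"
#endif

static const char *simd_header(void) { return SIMD_HEADER; }
```

On the reporter's Pi 3, `__ARM_NEON` is only defined once the right `-mfpu` flags are passed, which is why the Makefile build (with its armv7l flags) succeeds while the flagless CMake build does not.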

SlyEcho commented 1 year ago

The CMake file is not complete for 32-bit ARM:

```cmake
if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "arm" OR ${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
    message(STATUS "ARM detected")
    if (MSVC)
        # TODO: arm msvc?
    else()
        if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
            add_compile_options(-mcpu=native)
        endif()
        # TODO: armv6,7,8 version specific flags
    endif()
```

But it's possible to add the same flags manually at configure time:

```shell
cmake . -DCMAKE_C_FLAGS="-mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations"
```
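An untested sketch of what the missing armv7 branch could look like in CMakeLists.txt, reusing the flags the Makefile already passes for armv7l (the actual fix landed in a PR, so treat this only as an illustration):

```cmake
if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "arm" OR ${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
    message(STATUS "ARM detected")
    if (NOT MSVC)
        if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
            add_compile_options(-mcpu=native)
        elseif (${CMAKE_SYSTEM_PROCESSOR} MATCHES "armv7")
            # 32-bit ARMv7 (e.g. Raspberry Pi 2/3/4 on 32-bit Raspbian):
            # same flags the hand-written Makefile uses for UNAME_M=armv7l
            add_compile_options(-mfpu=neon-fp-armv8 -mfp16-format=ieee
                                -mno-unaligned-access -funsafe-math-optimizations)
        endif()
    endif()
endif()
```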
prusnak commented 1 year ago

Added fix in https://github.com/ggerganov/llama.cpp/pull/1251

@jtang613 could you please test that PR if it fixes the issue for you using cmake?

0x07CB commented 10 months ago

Can anyone confirm whether the build works fine on a Raspberry Pi 3 under 32-bit Raspbian?

Note: I want to know whether I can run an AI on low-cost hardware (using many Raspberry Pis in a cluster setup)...