ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

make main error ggml-quants.o on Oracle Ampere VPS #1975

Open Prephi opened 6 months ago

Prephi commented 6 months ago

Excuse the question: I've been trying for several days now, and I'm really inexperienced with makefiles. I know that the compilation should work on this platform (the user FlippFuzz got it running, and his fix for the Oracle Ampere has already been merged).

I'm trying to build the main example. The output of make clean and make is as follows:

root@instance-20240307-2028:~/whisper.cpp# make clean
I whisper.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  aarch64
I UNAME_M:  aarch64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native
I LDFLAGS:  
I CC:       cc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
I CXX:      g++ (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0

rm -f *.o main stream command talk talk-llama bench quantize server lsp libwhisper.a libwhisper.so
root@instance-20240307-2028:~/whisper.cpp# make
I whisper.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  aarch64
I UNAME_M:  aarch64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native
I LDFLAGS:  
I CC:       cc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
I CXX:      g++ (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native   -c ggml.c -o ggml.o
cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native   -c ggml-alloc.c -o ggml-alloc.o
cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native   -c ggml-backend.c -o ggml-backend.o
cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -mcpu=native   -c ggml-quants.c -o ggml-quants.o
ggml-quants.c: In function ‘ggml_vec_dot_q2_K_q8_K’:
ggml-quants.c:504:27: warning: implicit declaration of function ‘vld1q_s16_x2’; did you mean ‘vld1q_s16’? [-Wimplicit-function-declaration]
 #define ggml_vld1q_s16_x2 vld1q_s16_x2
                           ^
ggml-quants.c:5008:41: note: in expansion of macro ‘ggml_vld1q_s16_x2’
         const ggml_int16x8x2_t q8sums = ggml_vld1q_s16_x2(y[i].bsums);
                                         ^~~~~~~~~~~~~~~~~
ggml-quants.c:504:27: error: invalid initializer
 #define ggml_vld1q_s16_x2 vld1q_s16_x2
                           ^
ggml-quants.c:5008:41: note: in expansion of macro ‘ggml_vld1q_s16_x2’
         const ggml_int16x8x2_t q8sums = ggml_vld1q_s16_x2(y[i].bsums);
                                         ^~~~~~~~~~~~~~~~~
ggml-quants.c:505:27: warning: implicit declaration of function ‘vld1q_u8_x2’; did you mean ‘vld1q_u32’? [-Wimplicit-function-declaration]
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:5032:46: note: in expansion of macro ‘ggml_vld1q_u8_x2’
             const ggml_uint8x16x2_t q2bits = ggml_vld1q_u8_x2(q2); q2 += 32;
                                              ^~~~~~~~~~~~~~~~
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:5032:46: note: in expansion of macro ‘ggml_vld1q_u8_x2’
             const ggml_uint8x16x2_t q2bits = ggml_vld1q_u8_x2(q2); q2 += 32;
                                              ^~~~~~~~~~~~~~~~
ggml-quants.c:507:27: warning: implicit declaration of function ‘vld1q_s8_x2’; did you mean ‘vld1q_s32’? [-Wimplicit-function-declaration]
 #define ggml_vld1q_s8_x2  vld1q_s8_x2
                           ^
ggml-quants.c:5034:40: note: in expansion of macro ‘ggml_vld1q_s8_x2’
             ggml_int8x16x2_t q8bytes = ggml_vld1q_s8_x2(q8); q8 += 32;
                                        ^~~~~~~~~~~~~~~~
ggml-quants.c:507:27: error: invalid initializer
 #define ggml_vld1q_s8_x2  vld1q_s8_x2
                           ^
ggml-quants.c:5034:40: note: in expansion of macro ‘ggml_vld1q_s8_x2’
             ggml_int8x16x2_t q8bytes = ggml_vld1q_s8_x2(q8); q8 += 32;
                                        ^~~~~~~~~~~~~~~~
ggml-quants.c:5026:17: error: incompatible types when assigning to type ‘int8x16x2_t {aka struct int8x16x2_t}’ from type ‘int’
         q8bytes = ggml_vld1q_s8_x2(q8); q8 += 32;\
                 ^
ggml-quants.c:5040:13: note: in expansion of macro ‘SHIFT_MULTIPLY_ACCUM_WITH_SCALE’
             SHIFT_MULTIPLY_ACCUM_WITH_SCALE(2, 2);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-quants.c:5026:17: error: incompatible types when assigning to type ‘int8x16x2_t {aka struct int8x16x2_t}’ from type ‘int’
         q8bytes = ggml_vld1q_s8_x2(q8); q8 += 32;\
                 ^
ggml-quants.c:5041:13: note: in expansion of macro ‘SHIFT_MULTIPLY_ACCUM_WITH_SCALE’
             SHIFT_MULTIPLY_ACCUM_WITH_SCALE(4, 4);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-quants.c:5026:17: error: incompatible types when assigning to type ‘int8x16x2_t {aka struct int8x16x2_t}’ from type ‘int’
         q8bytes = ggml_vld1q_s8_x2(q8); q8 += 32;\
                 ^
ggml-quants.c:5042:13: note: in expansion of macro ‘SHIFT_MULTIPLY_ACCUM_WITH_SCALE’
             SHIFT_MULTIPLY_ACCUM_WITH_SCALE(6, 6);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-quants.c: In function ‘ggml_vec_dot_q3_K_q8_K’:
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:5658:36: note: in expansion of macro ‘ggml_vld1q_u8_x2’
         ggml_uint8x16x2_t qhbits = ggml_vld1q_u8_x2(qh);
                                    ^~~~~~~~~~~~~~~~
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:5676:46: note: in expansion of macro ‘ggml_vld1q_u8_x2’
             const ggml_uint8x16x2_t q3bits = ggml_vld1q_u8_x2(q3); q3 += 32;
                                              ^~~~~~~~~~~~~~~~
ggml-quants.c:508:27: warning: implicit declaration of function ‘vld1q_s8_x4’; did you mean ‘vld1q_s64’? [-Wimplicit-function-declaration]
 #define ggml_vld1q_s8_x4  vld1q_s8_x4
                           ^
ggml-quants.c:5677:48: note: in expansion of macro ‘ggml_vld1q_s8_x4’
             const ggml_int8x16x4_t q8bytes_1 = ggml_vld1q_s8_x4(q8); q8 += 64;
                                                ^~~~~~~~~~~~~~~~
ggml-quants.c:508:27: error: invalid initializer
 #define ggml_vld1q_s8_x4  vld1q_s8_x4
                           ^
ggml-quants.c:5677:48: note: in expansion of macro ‘ggml_vld1q_s8_x4’
             const ggml_int8x16x4_t q8bytes_1 = ggml_vld1q_s8_x4(q8); q8 += 64;
                                                ^~~~~~~~~~~~~~~~
ggml-quants.c:508:27: error: invalid initializer
 #define ggml_vld1q_s8_x4  vld1q_s8_x4
                           ^
ggml-quants.c:5678:48: note: in expansion of macro ‘ggml_vld1q_s8_x4’
             const ggml_int8x16x4_t q8bytes_2 = ggml_vld1q_s8_x4(q8); q8 += 64;
                                                ^~~~~~~~~~~~~~~~
ggml-quants.c: In function ‘ggml_vec_dot_q4_K_q8_K’:
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:6547:46: note: in expansion of macro ‘ggml_vld1q_u8_x2’
             const ggml_uint8x16x2_t q4bits = ggml_vld1q_u8_x2(q4); q4 += 32;
                                              ^~~~~~~~~~~~~~~~
ggml-quants.c:6549:21: error: incompatible types when assigning to type ‘int8x16x2_t {aka struct int8x16x2_t}’ from type ‘int’
             q8bytes = ggml_vld1q_s8_x2(q8); q8 += 32;
                     ^
ggml-quants.c:6556:21: error: incompatible types when assigning to type ‘int8x16x2_t {aka struct int8x16x2_t}’ from type ‘int’
             q8bytes = ggml_vld1q_s8_x2(q8); q8 += 32;
                     ^
ggml-quants.c: In function ‘ggml_vec_dot_q5_K_q8_K’:
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:7153:36: note: in expansion of macro ‘ggml_vld1q_u8_x2’
         ggml_uint8x16x2_t qhbits = ggml_vld1q_u8_x2(qh);
                                    ^~~~~~~~~~~~~~~~
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:7161:46: note: in expansion of macro ‘ggml_vld1q_u8_x2’
             const ggml_uint8x16x2_t q5bits = ggml_vld1q_u8_x2(q5); q5 += 32;
                                              ^~~~~~~~~~~~~~~~
ggml-quants.c:508:27: error: invalid initializer
 #define ggml_vld1q_s8_x4  vld1q_s8_x4
                           ^
ggml-quants.c:7162:46: note: in expansion of macro ‘ggml_vld1q_s8_x4’
             const ggml_int8x16x4_t q8bytes = ggml_vld1q_s8_x4(q8); q8 += 64;
                                              ^~~~~~~~~~~~~~~~
ggml-quants.c: In function ‘ggml_vec_dot_q6_K_q8_K’:
ggml-quants.c:504:27: error: invalid initializer
 #define ggml_vld1q_s16_x2 vld1q_s16_x2
                           ^
ggml-quants.c:7829:41: note: in expansion of macro ‘ggml_vld1q_s16_x2’
         const ggml_int16x8x2_t q8sums = ggml_vld1q_s16_x2(y[i].bsums);
                                         ^~~~~~~~~~~~~~~~~
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:7843:40: note: in expansion of macro ‘ggml_vld1q_u8_x2’
             ggml_uint8x16x2_t qhbits = ggml_vld1q_u8_x2(qh); qh += 32;
                                        ^~~~~~~~~~~~~~~~
ggml-quants.c:506:27: warning: implicit declaration of function ‘vld1q_u8_x4’; did you mean ‘vld1q_u64’? [-Wimplicit-function-declaration]
 #define ggml_vld1q_u8_x4  vld1q_u8_x4
                           ^
ggml-quants.c:7844:40: note: in expansion of macro ‘ggml_vld1q_u8_x4’
             ggml_uint8x16x4_t q6bits = ggml_vld1q_u8_x4(q6); q6 += 64;
                                        ^~~~~~~~~~~~~~~~
ggml-quants.c:506:27: error: invalid initializer
 #define ggml_vld1q_u8_x4  vld1q_u8_x4
                           ^
ggml-quants.c:7844:40: note: in expansion of macro ‘ggml_vld1q_u8_x4’
             ggml_uint8x16x4_t q6bits = ggml_vld1q_u8_x4(q6); q6 += 64;
                                        ^~~~~~~~~~~~~~~~
ggml-quants.c:508:27: error: invalid initializer
 #define ggml_vld1q_s8_x4  vld1q_s8_x4
                           ^
ggml-quants.c:7845:40: note: in expansion of macro ‘ggml_vld1q_s8_x4’
             ggml_int8x16x4_t q8bytes = ggml_vld1q_s8_x4(q8); q8 += 64;
                                        ^~~~~~~~~~~~~~~~
ggml-quants.c:7870:21: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8bytes = ggml_vld1q_s8_x4(q8); q8 += 64;
                     ^
ggml-quants.c: In function ‘ggml_vec_dot_iq2_xxs_q8_K’:
ggml-quants.c:8599:17: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8b = ggml_vld1q_s8_x4(q8); q8 += 64;
                 ^
ggml-quants.c: In function ‘ggml_vec_dot_iq2_xs_q8_K’:
ggml-quants.c:8737:17: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8b = ggml_vld1q_s8_x4(q8); q8 += 64;
                 ^
ggml-quants.c: In function ‘ggml_vec_dot_iq2_s_q8_K’:
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:9000:37: note: in expansion of macro ‘ggml_vld1q_u8_x2’
     const ggml_uint8x16x2_t mask1 = ggml_vld1q_u8_x2(k_mask1);
                                     ^~~~~~~~~~~~~~~~
ggml-quants.c:9021:17: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8b = ggml_vld1q_s8_x4(q8); q8 += 64;
                 ^
ggml-quants.c: In function ‘ggml_vec_dot_iq3_xxs_q8_K’:
ggml-quants.c:9213:17: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8b = ggml_vld1q_s8_x4(q8); q8 += 64;
                 ^
ggml-quants.c: In function ‘ggml_vec_dot_iq3_s_q8_K’:
ggml-quants.c:505:27: error: invalid initializer
 #define ggml_vld1q_u8_x2  vld1q_u8_x2
                           ^
ggml-quants.c:9345:37: note: in expansion of macro ‘ggml_vld1q_u8_x2’
     const ggml_uint8x16x2_t mask1 = ggml_vld1q_u8_x2(k_mask1);
                                     ^~~~~~~~~~~~~~~~
ggml-quants.c:9378:17: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8b = ggml_vld1q_s8_x4(q8); q8 += 64;
                 ^
ggml-quants.c: In function ‘ggml_vec_dot_iq1_s_q8_K’:
ggml-quants.c:9604:17: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8b = ggml_vld1q_s8_x4(q8); q8 += 64;
                 ^
ggml-quants.c: In function ‘ggml_vec_dot_iq4_xs_q8_K’:
ggml-quants.c:9828:20: error: incompatible types when assigning to type ‘uint8x16x2_t {aka struct uint8x16x2_t}’ from type ‘int’
             q4bits = ggml_vld1q_u8_x2(q4); q4 += 32;
                    ^
ggml-quants.c:9829:20: error: incompatible types when assigning to type ‘int8x16x4_t {aka struct int8x16x4_t}’ from type ‘int’
             q8b    = ggml_vld1q_s8_x4(q8); q8 += 64;
                    ^
Makefile:326: recipe for target 'ggml-quants.o' failed
make: *** [ggml-quants.o] Error 1

The system is the standard Oracle free tier. Shape: VM.Standard.A1.Flex, OCPU count: 4, Image: Canonical-Ubuntu-18.04-aarch64-2023.05.10-0

To me, the error looks as if the system is not recognized and the wrong declarations are used. I tried to change the declarations manually, but I am a bit overwhelmed by the definitions in the makefile. The build has never succeeded.

Do you have any ideas that I could try?
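
The warnings in the log narrow this down: the compiler treats vld1q_s16_x2, vld1q_u8_x2, vld1q_s8_x2, vld1q_u8_x4 and vld1q_s8_x4 as implicitly declared functions, which suggests that the arm_neon.h shipped with this GCC 7.5 does not declare them. An implicitly declared function is assumed to return int, which would explain the "invalid initializer" and "incompatible types" errors that follow. Below is a minimal sketch for checking whether a given compiler declares these intrinsics. The function name probe_neon_intrinsic is made up for illustration, and the compiler is taken from $CC, falling back to cc.

```shell
#!/bin/sh
# Hypothetical helper (not part of whisper.cpp).
# $1: compiler command, $2: NEON intrinsic name.
# Returns 0 if a tiny C file calling the intrinsic passes a syntax check.
probe_neon_intrinsic() {
    printf '#include <arm_neon.h>\nvoid probe(const void *p) { (void)%s(p); }\n' "$2" |
        "$1" -x c -std=c11 -fsyntax-only \
             -Werror=implicit-function-declaration - >/dev/null 2>&1
}

# The five intrinsics that appear as "implicit declaration" in the log above.
for name in vld1q_s16_x2 vld1q_u8_x2 vld1q_s8_x2 vld1q_u8_x4 vld1q_s8_x4; do
    if probe_neon_intrinsic "${CC:-cc}" "$name"; then
        echo "$name: declared"
    else
        echo "$name: missing"
    fi
done
```

If any line reports "missing" on the aarch64 machine, the compiler is too old for this code path and a newer gcc is the likely fix. Running the same check with CC=gcc-9 (or whichever newer compiler is installed) shows whether that compiler would get past these errors.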

jacobhq commented 6 months ago

Experiencing the same issue on an NVIDIA Jetson Nano.

skmkt commented 5 months ago

The NEON dot product ISA extension is not recognized. Try upgrading gcc: either upgrade the OS to Ubuntu 22.04, or use a Docker image.

primenko-v commented 5 months ago

I had the same issue on NVIDIA Jetson Nano, and it helped to update gcc to version 8.5 and to use CMake. It might also work with gcc-9, which is much easier to install than gcc-8.5.

Option 1. Installing gcc-8.5.0 from source

This takes several hours, so it might be a good idea to try installing gcc-9 first (see Option 2).

  1. Create a temporary directory and navigate to it

    mkdir ~/aux
    cd ~/aux

  2. Download gcc-8.5.0

    wget https://ftp.gnu.org/gnu/gcc/gcc-8.5.0/gcc-8.5.0.tar.gz

  3. Compile gcc-8.5.0 according to https://gcc.gnu.org/wiki/InstallingGCC

    tar -xzf gcc-8.5.0.tar.gz
    cd gcc-8.5.0
    ./contrib/download_prerequisites
    # go back up so that objdir is created next to the source tree, not inside it
    cd ..
    mkdir objdir
    cd objdir
    $PWD/../gcc-8.5.0/configure --enable-languages=c,c++,fortran
    make
    sudo make install

Option 2. Installing gcc-9 via apt

    sudo add-apt-repository ppa:ubuntu-toolchain-r/test
    sudo apt update
    sudo apt install gcc-9 g++-9
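
With a newer compiler installed next to the system one, the plain Makefile build can in principle be pointed at it as well, because variables given on the make command line override the assignments inside a makefile. This is only a sketch under assumptions: the thread confirms gcc 8.5 together with CMake, not this Makefile route, and the helper names pick_gcc and suggest_build_command are made up for illustration. The script only prints the command so that it can be reviewed before running it in the whisper.cpp directory.

```shell
#!/bin/sh
# Hypothetical helper: print the newest versioned gcc found on PATH (8 or later).
pick_gcc() {
    for v in 14 13 12 11 10 9 8; do
        if command -v "gcc-$v" >/dev/null 2>&1; then
            echo "gcc-$v"
            return 0
        fi
    done
    return 1
}

# Print (do not run) a make invocation that overrides the compiler pair.
# Assumes the matching g++-<version> package was installed alongside gcc-<version>.
suggest_build_command() {
    if cc_new=$(pick_gcc); then
        echo "make clean && make CC=$cc_new CXX=g++-${cc_new#gcc-}"
    else
        echo "no versioned gcc >= 8 found on PATH"
    fi
}

suggest_build_command
```

After running the printed command, the "I CC:" and "I CXX:" lines of the build info should show the new compiler version instead of 7.5.0.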

Building Whisper

  1. Navigate to the whisper.cpp repo
  2. Build with CMake:

    mkdir build
    cd build
    cmake -DWHISPER_CUBLAS=1 -DCMAKE_C_COMPILER=<path-to-gcc> -DCMAKE_CXX_COMPILER=<path-to-g++> -DCMAKE_LINKER=<path-to-g++> ..
    make -j4

    You can find the gcc and g++ paths by running which gcc (or which gcc-<version>). If everything is correct, you should be able to see the new gcc and g++ version when building Whisper:

    -- The C compiler identification is GNU 8.5.0
    -- The CXX compiler identification is GNU 8.5.0

Also, I am not really sure if this part in the cmake command is needed:

-DCMAKE_LINKER=<path-to-g++>
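
On the CMAKE_LINKER question: with GCC toolchains CMake normally links through the compiler driver it was given, so setting only the C and C++ compilers is usually enough. The sketch below wraps the configure step without CMAKE_LINKER. It is an illustration, not a tested recipe for this repository: the function name configure_whisper is made up, and extra options such as -DWHISPER_CUBLAS=1 (taken from the command above, and presumably only relevant on a CUDA device like the Jetson) are passed through unchanged.

```shell
#!/bin/sh
# Hypothetical wrapper: configure an out-of-tree build with an explicit compiler pair.
# $1: C compiler command (e.g. gcc-9), $2: C++ compiler command (e.g. g++-9),
# remaining arguments: extra options handed to cmake as-is.
configure_whisper() {
    c_path=$(command -v "$1") || { echo "compiler not found: $1" >&2; return 1; }
    cxx_path=$(command -v "$2") || { echo "compiler not found: $2" >&2; return 1; }
    shift 2
    cmake "$@" \
          -DCMAKE_C_COMPILER="$c_path" \
          -DCMAKE_CXX_COMPILER="$cxx_path" \
          ..
}

# Usage, from inside the build directory:
#   configure_whisper gcc-9 g++-9 -DWHISPER_CUBLAS=1 && make -j4
```

If the "compiler identification" lines printed by CMake show the new version, the compiler selection worked and the linker setting was not needed for that.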