ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

armv7l crashes, out of box Illegal instruction, with tweaking Killed #470

Open clach04 opened 1 year ago

clach04 commented 1 year ago

I have a 32-bit ARMv7 CPU (Odroid-XU4) running Armbian 21.08.6 Focal with Linux 5.4.160-odroidxu4.

The binary from an out-of-the-box make crashes:

./main -m models/ggml-medium.bin  -otxt samples/jfk.wav
....
Illegal instruction

Looking at the Makefile, it looks like it selects the wrong FPU (-mfpu) for floating-point math on this CPU.

https://github.com/ggerganov/whisper.cpp/blob/ab1916fc598cc364b521a6d24752c4b092553e40/Makefile#L149

Introduced in https://github.com/ggerganov/whisper.cpp/commit/167324584b0927fa78d696743d29f0ff29bebfe9 via https://github.com/ggerganov/whisper.cpp/pull/23

I tweaked the -mfpu flag, but then it dies with Killed instead:

diff --git a/Makefile b/Makefile
index 20915e3..f20bb5d 100644
--- a/Makefile
+++ b/Makefile
@@ -147,8 +147,11 @@ ifneq ($(filter armv6%,$(UNAME_M)),)
        CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access
 endif
 ifneq ($(filter armv7%,$(UNAME_M)),)
+       # this label looks wrong - matches armv7l which is ONLY 32-bit
        # Raspberry Pi 4
-       CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
+       #CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
+       #  cc: note: valid arguments to ‘-mfpu=’ are: auto crypto-neon-fp-armv8 fp-armv8 fpv4-sp-d16 fpv5-d16 fpv5-sp-d16 neon neon-fp-armv8 neon-fp16 neon-vfpv3 neon-vfpv4 vfp vfp3 vfpv2 vfpv3 vfpv3-d16 vfpv3-d16-fp16 vfpv3-fp16 vfpv3xd vfpv3xd-fp16 vfpv4 vfpv4-d16; did you mean ‘neon-fp-armv8’?
+       CFLAGS += -mfpu=neon -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
 endif
 ifneq ($(filter armv8%,$(UNAME_M)),)
        # Raspberry Pi 4

$ ./main -m models/ggml-medium.bin  -otxt samples/jfk.wav
whisper_init_from_file: loading model from 'models/ggml-medium.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 4
whisper_model_load: mem required  = 1720.00 MB (+   43.00 MB per decoder)
whisper_model_load: kv self size  =   42.00 MB
whisper_model_load: kv cross size =  140.62 MB
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 1462.35 MB
Killed

platform info:

$ uname -a
Linux odroidxu4 5.4.160-odroidxu4 #21.08.6 SMP PREEMPT Mon Nov 22 12:18:25 UTC 2021 armv7l armv7l armv7l GNU/Linux

$ cat /proc/cpuinfo  
processor       : 0
model name      : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 36.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc07
CPU revision    : 3
...
...
processor       : 7
model name      : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 36.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0xc0f
CPU revision    : 3

Hardware        : Hardkernel ODROID-XU4
Revision        : 0100
Serial          : 0000000000000000
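
The Features lines above list both neon and vfpv4, whereas neon-fp-armv8 targets ARMv8 FPUs. Below is a minimal shell sketch that maps a Features line to a candidate -mfpu value. The helper name suggest_mfpu and the mapping are assumptions for illustration, not something taken from the whisper.cpp Makefile. neon-vfpv4 appears in the compiler's list of valid -mfpu arguments, but only plain neon was actually tried on this board.

```shell
# Hypothetical helper: map a /proc/cpuinfo "Features" line to a GCC -mfpu value.
# The mapping is an illustrative assumption, not the project's logic.
suggest_mfpu() {
    # Pad with spaces so that whole words are matched, not substrings.
    case " $1 " in
        *" neon "*)
            case " $1 " in
                *" vfpv4 "*) echo "neon-vfpv4" ;;  # NEON with VFPv4
                *)           echo "neon" ;;        # NEON only
            esac
            ;;
        *)
            echo "unknown"
            return 1
            ;;
    esac
}
```

On a live system it could be called as `suggest_mfpu "$(grep -m1 '^Features' /proc/cpuinfo | cut -d: -f2)"`, which looks only at the first core's Features line.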

I've also tried building without any -mfpu flag, but compilation fails:

ggml.c:153:10: fatal error: immintrin.h: No such file or directory
  153 | #include <immintrin.h>

I'm in the process of working through the list of -mfpu values; posting in case anyone else hits this (and/or has ideas).

clach04 commented 1 year ago

The Killed was a red herring - my SBC ran out of memory using the medium model! :rofl:

The small and tiny models worked fine once the FPU was switched to plain neon.

make small
./main -m models/ggml-small.bin -f samples/jfk.wav

(albeit slowly).
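
The load log for the medium model reports mem required = 1720.00 MB, so comparing that figure with available RAM before loading would have flagged the problem. Below is a minimal sketch. model_fits is a hypothetical helper, and using MemAvailable from /proc/meminfo as the measure of free memory is an assumption.

```shell
# Hypothetical helper: succeed if a model needing required_mb megabytes
# fits into available_kb kilobytes of memory.
model_fits() {
    required_mb=$1
    available_kb=$2
    [ $((required_mb * 1024)) -le "$available_kb" ]
}
```

For example, `model_fits 1720 "$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)" || echo "medium model will not fit"` checks the figure from the log against the memory the kernel currently reports as available.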

clach04 commented 1 year ago

@ggerganov I posted a comment on change https://github.com/ggerganov/whisper.cpp/commit/167324584b0927fa78d696743d29f0ff29bebfe9#r99400156 - have you had any success with 32-bit ARM using the out-of-the-box Makefile?

I can make a PR but want to make sure I'm not destabilizing things for other people (or missing something).

ggerganov commented 1 year ago

There might be issues on 32-bit platforms - I need to check whether there are any 64-bit assumptions in ggml.c. Regarding the Makefile - it is very likely that the current flags are not appropriate for 32-bit ARM. If you find something that works, let me know and I will update it as appropriate.

clach04 commented 1 year ago

Posted https://github.com/ggerganov/whisper.cpp/pull/486, which works for me :-)