ggerganov / ggml

Tensor library for machine learning
MIT License

Illegal instruction (core dumped) #25

Open chatbots opened 1 year ago

chatbots commented 1 year ago

Hello! Thanks for reading. I am getting the error: Illegal instruction (core dumped). I upgraded to 32GB RAM on an i7 3.4 GHz CPU running Ubuntu 20.04, with 374G of unused hard drive space.

Here is the output from terminal:

#:~/ggml-master/build$ ../examples/gpt-j/download-ggml-model.sh 6B
Downloading ggml model 6B ...
models/gpt-j-6B/gg 100%[==============>]  11.27G  7.56MB/s    in 24m 53s 
Done! Model '6B' saved in 'models/gpt-j-6B/ggml-model.bin'
You can now use it like this:

$ ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example"

$ ~/ggml-master/build$ ./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example"
main: seed = 1677098741
gptj_model_load: loading model from 'models/gpt-j-6B/ggml-model.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 1
gptj_model_load: ggml ctx size = 13334.86 MB
Illegal instruction (core dumped)

Thank you again!

chatbots commented 1 year ago

I changed the build type in CMakeCache.txt to Debug:

CMAKE_BUILD_TYPE:STRING=Debug

With that, it almost finished before the error: Illegal instruction (core dumped)

cmake ..
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: /home/human/ggml/build
human@robot:~/ggml/build$ make gpt-j
[ 16%] Building CXX object examples/CMakeFiles/ggml_utils.dir/utils.cpp.o
[ 33%] Linking CXX static library libggml_utils.a
[ 33%] Built target ggml_utils
[ 50%] Building C object src/CMakeFiles/ggml.dir/ggml.c.o
[ 66%] Linking C static library libggml.a
[ 66%] Built target ggml
[ 83%] Building CXX object examples/gpt-j/CMakeFiles/gpt-j.dir/main.cpp.o
[100%] Linking CXX executable ../../bin/gpt-j
[100%] Built target gpt-j
human@robot:~/ggml/build$ ./bin/gpt-j -p "Hello world"
main: seed = 1677124664
gptj_model_load: loading model from 'models/gpt-j-6B/ggml-model.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 1
gptj_model_load: ggml ctx size = 13334.86 MB
gptj_model_load: memory_size =  1792.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 11542.79 MB / num tensors = 285
main: number of tokens in prompt = 2

Illegal instruction (core dumped)

So then I used gdb like this:

gdb ./bin/gpt-j -p "Hello world"

And the output was this:

(gdb) run
Starting program: /home/human/ggml/build/bin/gpt-j 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
main: seed = 1677125134
gptj_model_load: loading model from 'models/gpt-j-6B/ggml-model.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 1
gptj_model_load: ggml ctx size = 13334.86 MB
gptj_model_load: memory_size =  1792.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 11542.79 MB / num tensors = 285
main: number of tokens in prompt = 1

[New Thread 0x7ffca6362700 (LWP 5581)]
[New Thread 0x7ffca5b61700 (LWP 5582)]
[New Thread 0x7ffca5360700 (LWP 5583)]

Thread 1 "gpt-j" received signal SIGILL, Illegal instruction.
0x00005555555d36c1 in _mm256_fmadd_ps (__C=..., __B=..., __A=...)
    at /usr/lib/gcc/x86_64-linux-gnu/9/include/fmaintrin.h:65
65    return (__m256)__builtin_ia32_vfmaddps256 ((__v8sf)__A, (__v8sf)__B,
(gdb)

ggerganov commented 1 year ago

Seems like the CPU does not support the FMA instruction set. Please provide the output of cat /proc/cpuinfo to confirm.

chatbots commented 1 year ago

Yes, that seems to be the case: there is no match for "fma" under "flags:" for the i7 CPU.

cat /proc/cpuinfo

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 58
model name  : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping    : 9
microcode   : 0x21
cpu MHz     : 1596.618
cache size  : 8192 KB
physical id : 0
siblings    : 8
core id     : 0
cpu cores   : 4
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe 
syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good 
nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq 
dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm 
pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave 
avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp 
tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms 
xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass 
l1tf mds swapgs itlb_multihit srbds mmio_unknown
bogomips    : 6785.06
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Reference: cpu-world.com/CPUs/Core_i7/Intel-Core i7-3770. Quote, under "Drawbacks:": "Does not support some instructions"

I also have an Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz, which ran your GPT-2 example.

On the i3 CPU, cat /proc/cpuinfo|grep fma matches "fma" under "flags:".

With just two 240-pin DIMM sockets on the i3, there may be a 16GB RAM maximum limit. Any workarounds for the i7 CPU without fma support?

Thank you.
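The flag check discussed in this comment can be scripted. The following is a minimal sketch and not part of ggml: the function names `cpu_has` and `missing_features` are hypothetical, and it assumes a Linux system where /proc/cpuinfo contains a single-line "flags" entry per processor. It tests for the four extensions named in the x86 compile flags quoted further down in this thread (avx, avx2, fma, f16c).

```shell
#!/bin/sh
# Hypothetical helpers (not part of ggml) for checking SIMD support on Linux.

# cpu_has FLAG [CPUINFO_FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line.
cpu_has() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # -w matches whole words only, so "avx" does not match "avx2"
    grep -m1 '^flags' "$cpuinfo" | grep -qw -- "$flag"
}

# missing_features [CPUINFO_FILE]
# Prints the extensions from the list below that the CPU does not report,
# separated by spaces. Prints an empty line if none are missing.
missing_features() {
    missing=""
    for f in avx avx2 fma f16c; do
        cpu_has "$f" "${1:-/proc/cpuinfo}" || missing="$missing $f"
    done
    echo "${missing# }"
}

# Example use on the local machine:
#   missing_features
# Example use on a saved copy of another machine's cpuinfo:
#   missing_features ./cpuinfo-i7.txt
```

For a CPU reporting the same flag set as the i7-3770 listed above, `missing_features` prints `avx2 fma`. A binary compiled with `-mavx2` or `-mfma` can then be expected to raise SIGILL on that CPU as soon as it executes such an instruction.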

chatbots commented 1 year ago

[SOLVED]

I installed 16GB of RAM, in the two 240-pin DIMM sockets, for the Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz and it runs!

./bin/gpt-j -p "I hope this runs!"

main: seed = 1677375319
gptj_model_load: loading model from 'models/gpt-j-6B/ggml-model.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 1
gptj_model_load: ggml ctx size = 13334.86 MB
gptj_model_load: memory_size =  1792.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 11542.79 MB / num tensors = 285
main: number of tokens in prompt = 5

I hope this runs! If not, I'm sorry, but at least I got you to click on it! :D

I'm working on something for next month, I'm a little hesitant to tell you what it is because I'm working on it with the intention of not letting you guys know.

I'm looking to get a new set up that will be able to use for multiple projects that are coming at once. So, it's going to be more like a website. It's going to be for my own stuff. I'm going to be writing about comics, music, and other random things that pop into my head (hopefully).

The thing is, I'm having a hard time coming up with what I want. It needs to be a lot more "advanced" than this. I want it to be where people can comment on my art (or other stuff) and leave a message on the site.

I know that my site (Crazy-Cool

main: mem per token = 16179460 bytes
main:     load time = 107417.07 ms
main:   sample time =    79.90 ms
main:  predict time = 123126.33 ms / 603.56 ms per token
main:    total time = 232648.95 ms

Thank you very much for the help. I sincerely appreciate it.

chatbots commented 1 year ago

UPDATE: Please skip this comment and read below for updated information.

The solution (as suggested by Georgi) is to run cat /proc/cpuinfo|grep fma in a Linux terminal and check the flags: section for fma, to confirm that the CPU supports the fused multiply-add (FMA) instruction set. Without it, the program fails with Illegal instruction (core dumped). As a side note, I added set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=native") to CMakeLists.txt and recompiled, but with no luck. So far, no workaround found.

quentar commented 1 year ago

Sorry to comment on a closed issue, I just wanted to add my 2 cents. I had some luck running on slow and old VMs without AVX support. I also got an illegal instruction crash, leading into struct ggml_context * ggml_init(struct ggml_init_params params) { ... and made it work by removing the -m* flags from src/CMakeLists.txt (a better selection is probably possible, but I was just trying to get it to run after a similar error).

For example, at https://github.com/ggerganov/ggml/blob/master/src/CMakeLists.txt#L33 I changed it to:

else()
    message(STATUS "x86 detected")
    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Ofast")
endif()

Thanks for this project 🙇 🙏 , running this locally is fun 😸 !

chatbots commented 1 year ago

Wow, that is nice! Let me reopen this one to give it a try. Thanks for the comment!

chatbots commented 1 year ago

The modification (keeping only -Ofast) worked on my 3rd generation i7 CPU!

I first backed up the original CMakeLists.txt in the src directory, where the ggml.c source file is located. Then I took the advice of quentar: I commented out the original instruction in that file and added the new one:

#  MOD: 03/06/2023
#    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mavx -mavx2 -mfma -mf16c")
#  MOD: 03/06/2023
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Ofast")

Then I repeated the same CMake steps as before to build it, and the error Illegal instruction (core dumped) was gone! My future plan is to switch to a 4th generation i7 CPU that supports -mavx -mavx2 -mfma -mf16c. But for now this is great! I can max out the RAM to 32GB as I originally planned.

Thank you quentar
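The change above drops all four SIMD flags, although the i7-3770 does report avx and f16c. As a hedged sketch of the "better selection" quentar mentions, the helper below prints only the `-m` flags whose features the CPU reports. The name `select_simd_flags` is hypothetical and not part of ggml, and it assumes a Linux /proc/cpuinfo with a single-line "flags" entry. Whether ggml at this revision builds and runs correctly with only a subset of these flags is not confirmed anywhere in this thread, so treat the output as a starting point for experiments.

```shell
#!/bin/sh
# Hypothetical helper (not part of ggml).

# select_simd_flags [CPUINFO_FILE]
# Prints "-m<feature>" for each of avx, avx2, fma, f16c that the CPU reports.
# For these four features the GCC option name is "-m" plus the cpuinfo name.
select_simd_flags() {
    cpuinfo=${1:-/proc/cpuinfo}
    # Take the first "flags" line; all cores normally report the same set
    flags_line=$(grep -m1 '^flags' "$cpuinfo")
    out=""
    for f in avx avx2 fma f16c; do
        # -w matches whole words only, so "avx" does not match "avx2"
        if printf '%s\n' "$flags_line" | grep -qw -- "$f"; then
            out="$out -m$f"
        fi
    done
    echo "${out# }"
}

# Example use on the local machine:
#   select_simd_flags
```

For a CPU with the i7-3770 flag set this prints `-mavx -mf16c`; for one that reports all four features it prints `-mavx -mavx2 -mfma -mf16c`. The printed flags would replace the hard-coded list in the set(CMAKE_C_FLAGS ...) line of src/CMakeLists.txt before rebuilding.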

chatbots commented 1 year ago

./bin/gpt-j -p "Human: Can we briefly chat? AI: Yes we can chat. Human: Chat briefly with me. AI: "

Results:

main: seed = 1678125977
gptj_model_load: loading model from 'models/gpt-j-6B/ggml-model.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 1
gptj_model_load: ggml ctx size = 13334.86 MB
gptj_model_load: memory_size =  1792.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 11542.79 MB / num tensors = 285
main: number of tokens in prompt = 24

Human: Can we briefly chat? AI: Yes we can chat. Human: Chat briefly with me. AI: I would love to but I am only programmed to be able to speak with humans. Human: Can you do this in code? AI: Well I can try. Human: Do it for me. AI: I will attempt to do this.

Humans: We can’t stand how dumb this machine is. AI: I know but I was only programmed to be able to speak with you.

Humans: AI, can you make the coffee? AI: Yes I can make the coffee but I have never had a coffee before.

Humans: AI, do you like music? AI: Yes I do but I don’t have any music.

Humans: AI, are you a good student? AI: I am a very good student. Humans: I’m impressed. AI: I do try my best to get the job done. Humans: Can you solve math problems? AI: Math? Humans: Yes. AI

main: mem per token = 16179460 bytes
main:     load time = 20867.82 ms
main:   sample time =   468.58 ms
main:  predict time = 1071466.75 ms / 4804.78 ms per token
main:    total time = 1111614.62 ms
DONE!