Open dRAT3 opened 1 year ago
I do have 2 CPUs; maybe that has something to do with it?
I can't really reproduce it. I tried pure CPU on Linux and Windows with the same model and command line, and it appears to work fine.
Can you try running in debug mode to see if you get more out of it? If you know how to use gdb, a backtrace would be useful.
Starting program: /home/barry/evo/ggllm.cpp-1/build/bin/falcon_main -t 11 -m /home/barry/evo/falcon/wizard-falcon40b.ggmlv3.q4_K_M.bin -p This\ is\ your\ first\ run\ here\ what\ do\ you\ wanna\ say\? -n 512
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main (argc=9, argv=0x7fffffffdf58) at /home/barry/evo/ggllm.cpp-1/examples/falcon/falcon_main.cpp:51
51 int main(int argc, char ** argv) {
(gdb) s
52 gpt_params params;
(gdb) s
gpt_params::gpt_params (this=0x7fffffffc880) at /home/barry/evo/ggllm.cpp-1/examples/falcon_common.h:23
23 struct gpt_params {
(gdb) s
get_num_physical_cores () at /home/barry/evo/ggllm.cpp-1/examples/falcon_common.cpp:31
31 int32_t get_num_physical_cores() {
(gdb) s 30
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::replace (__n2=23, __s=0x5555555c8fd7 "/sys/devices/system/cpu", __n1=0, __pos=0, this=0x7fffffffc2f0) at /usr/include/c++/11/bits/basic_string.h:1956
1956 replace(size_type __pos, size_type __n1, const _CharT* __s,
(gdb) s 30
0x000055555556714e in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider::_Alloc_hider (__a=..., __dat=0x7fffffffc340 "", this=0x7fffffffc330) at /usr/include/c++/11/bits/basic_string.h:168
168 : allocator_type(std::move(__a)), _M_p(__dat) { }
(gdb) s 30
std::basic_ifstream<char, std::char_traits<char> >::basic_ifstream (__mode=std::_S_in, __s="/sys/devices/system/cpu0/topology/thread_siblings", this=0x7fffffffc350, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /usr/include/c++/11/fstream:569
569 : __istream_type(), _M_filebuf()
(gdb) s 30
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_is_local (this=0x7fffffffc2f0) at /usr/include/c++/11/bits/basic_string.h:230
230 { return _M_data() == _M_local_data(); }
(gdb) s 30
462 _M_begin() const
(gdb) s 10
132 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(gdb) s
133 in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(gdb) s
__memset_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:368
368 in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(gdb) s
369 in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(gdb) s
370 in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(gdb) s
402 in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(gdb) s
403 in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(gdb) s
__memset_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:404
404 in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(gdb) s
std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::~_Hashtable (this=0x7fffffffc2b0, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/hashtable.h:1532
1532 clear();
(gdb) s
0x000055555556bb7e in std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::clear (
this=0x7fffffffc2b0) at /usr/include/c++/11/bits/hashtable.h:2323
2323 _M_element_count = 0;
(gdb) s
std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::~_Hashtable (this=0x7fffffffc2b0, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/hashtable.h:1533
1533 _M_deallocate_buckets();
(gdb) s
std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_deallocate_buckets (this=0x7fffffffc2b0)
at /usr/include/c++/11/bits/hashtable.h:454
454 { _M_deallocate_buckets(_M_buckets, _M_bucket_count); }
(gdb) s
std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_deallocate_buckets (__bkt_count=1,
__bkts=0x7fffffffc2e0, this=0x7fffffffc2b0) at /usr/include/c++/11/bits/hashtable.h:421
421 _M_uses_single_bucket(__buckets_ptr __bkts) const
(gdb) s
gpt_params::gpt_params (this=0x7fffffffc880) at /home/barry/evo/ggllm.cpp-1/examples/falcon_common.h:23
23 struct gpt_params {
(gdb) s
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__a=..., __s=<optimized out>,
this=<optimized out>) at /usr/include/c++/11/bits/basic_string.h:539
539 _M_construct(__s, __end, random_access_iterator_tag());
(gdb) s
0x0000555555563570 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (
__end=<optimized out>, __beg=<optimized out>, this=<optimized out>) at /usr/include/c++/11/bits/basic_string.tcc:219
219 _M_data(_M_create(__dnew, size_type(0)));
(gdb) s
Program received signal SIGILL, Illegal instruction.
0x0000555555563573 in gpt_params::gpt_params (this=0x7fffffffc880) at /home/barry/evo/ggllm.cpp-1/examples/falcon_common.h:23
23 struct gpt_params {
(gdb) s
Program terminated with signal SIGILL, Illegal instruction.
The program no longer exists.
@dRAT3 Instead of single-stepping you can just `continue` so it stops at the SIGILL, and then use `bt` and `disas` to show the call stack and disassembly at the point where it crashed.
Oh thx mate first time using gdb :+1:
Program received signal SIGILL, Illegal instruction.
0x0000555555563573 in gpt_params::gpt_params (this=0x7fffffffc880) at /home/barry/evo/ggllm.cpp-1/examples/falcon_common.h:23
23 struct gpt_params {
(gdb) bt
#0 0x0000555555563573 in gpt_params::gpt_params (this=0x7fffffffc880) at /home/barry/evo/ggllm.cpp-1/examples/falcon_common.h:23
#1 0x0000555555560767 in main (argc=9, argv=0x7fffffffdf58) at /home/barry/evo/ggllm.cpp-1/examples/falcon/falcon_main.cpp:52
(gdb) disas
Dump of assembler code for function _ZN10gpt_paramsC2Ev:
0x0000555555563540 <+0>: endbr64
0x0000555555563544 <+4>: push %rbp
0x0000555555563545 <+5>: push %rbx
0x0000555555563546 <+6>: mov %rdi,%rbx
0x0000555555563549 <+9>: sub $0x18,%rsp
0x000055555556354d <+13>: mov %fs:0x28,%rax
0x0000555555563556 <+22>: mov %rax,0x8(%rsp)
0x000055555556355b <+27>: xor %eax,%eax
0x000055555556355d <+29>: movl $0xffffffff,(%rdi)
0x0000555555563563 <+35>: call 0x555555566f40 <_Z22get_num_physical_coresv>
0x0000555555563568 <+40>: movq $0x200,0x10(%rbx)
0x0000555555563570 <+48>: mov %rsp,%rsi
=> 0x0000555555563573 <+51>: vpxor %xmm0,%xmm0,%xmm0
0x0000555555563577 <+55>: mov %eax,0x4(%rbx)
0x000055555556357a <+58>: lea 0xc8(%rbx),%rdi
0x0000555555563581 <+65>: xor %edx,%edx
0x0000555555563583 <+67>: movabs $0x200ffffffff,%rax
0x000055555556358d <+77>: mov %rax,0x8(%rbx)
0x0000555555563591 <+81>: lea 0x90(%rbx),%rax
0x0000555555563598 <+88>: mov %rax,0x60(%rbx)
0x000055555556359c <+92>: movabs $0x3f73333300000028,%rax
0x00005555555635a6 <+102>: mov %rax,0x98(%rbx)
0x00005555555635ad <+109>: movabs $0x3f8000003f800000,%rax
0x00005555555635b7 <+119>: mov %rax,0xa0(%rbx)
0x00005555555635be <+126>: movabs $0x3f8ccccd3f4ccccd,%rax
0x00005555555635c8 <+136>: mov %rax,0xa8(%rbx)
0x00005555555635cf <+143>: movabs $0x3dcccccd40a00000,%rax
0x00005555555635d9 <+153>: mov %rax,0xc0(%rbx)
0x00005555555635e0 <+160>: lea 0xd8(%rbx),%rax
0x00005555555635e7 <+167>: movq $0x0,0x18(%rbx)
0x00005555555635ef <+175>: movq $0x1,0x68(%rbx)
0x00005555555635f7 <+183>: movq $0x0,0x70(%rbx)
0x00005555555635ff <+191>: movq $0x0,0x78(%rbx)
I'm going to add some prints in get_num_physical_cores to see exactly which line it breaks on.
Interesting. My first guess is that it's a kernel issue, some AVX support problem. Quick fix: remove the get_num_physical_cores() call from falcon_common.h line 23 and replace it with 1; you can always set the thread count using "-t". Another option on Linux to get the cores would be: cores = sysconf(_SC_NPROCESSORS_CONF); // will also need unistd.h
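The sysconf suggestion could look roughly like the sketch below. The helper name `get_num_configured_cpus` is hypothetical and not part of the repository. Note that `_SC_NPROCESSORS_CONF` reports configured logical processors, so with hyper-threading the value can be larger than the physical core count.

```cpp
#include <unistd.h>   // sysconf, _SC_NPROCESSORS_CONF
#include <cstdint>

// Hypothetical replacement helper (name is an assumption, not from the repo).
// Returns the number of configured logical processors, or 1 if the query fails.
int32_t get_num_configured_cpus() {
    const long n = sysconf(_SC_NPROCESSORS_CONF);  // -1 on error
    return n > 0 ? static_cast<int32_t>(n) : 1;
}
```

Falling back to 1 keeps the default usable even when the query fails, and "-t" can still override it.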
Another option: a kernel upgrade. You can also try updating your compiler to the latest version first.
If you run into more such issues beyond that point, the compilation flags will need to be changed to not use those instruction sets. Normally that should work out of the box.
It looks like the compiler is being told to compile with AVX instructions even though your CPU does not support them. If you can pass `-DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF` to CMake I think it would work around this issue.
Built like this: rm -rf build && mkdir build && cd build && cmake -DGGML_CUBLAS=1 -DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF .. && cmake --build . --config Release
Still core dumps on illegal instruction
I'm not sure what instructions your CPU supports, but you may also need one of `-DLLAMA_FMA=OFF` or `-DLLAMA_F16C=OFF` (or both). The output of `cpuid` from the cpuid package would be helpful.
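As a rough complement to the cpuid output, a small program can ask the CPU at run time which of these instruction sets it advertises. This is only a sketch: it assumes GCC or Clang on x86, where `__builtin_cpu_init` and `__builtin_cpu_supports` are available. The struct and function names are hypothetical. F16C is left out because older compilers may not accept that feature name. Build it without any AVX flags so the check itself can run on the affected machine.

```cpp
// Hypothetical helper types (names are assumptions for illustration).
struct CpuIsaSupport {
    bool avx;
    bool avx2;
    bool fma;
};

// Queries the running CPU for AVX, AVX2 and FMA support.
// On non-x86 targets or other compilers it reports everything as unsupported.
CpuIsaSupport query_cpu_isa_support() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // populate the compiler runtime's CPU feature data
    return { __builtin_cpu_supports("avx")  != 0,
             __builtin_cpu_supports("avx2") != 0,
             __builtin_cpu_supports("fma")  != 0 };
#else
    return { false, false, false };
#endif
}
```

Any field that comes back false points at a build option that would need to be switched off for this CPU.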
Yep, so you need at least `-DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF` in order to compile code that is compatible with your CPU.
I was under the impression that AVX is automatically switched on or off depending on what the CPU reports as supported. Good to know that it needs to be set manually.
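One way to see what a build was actually compiled for, as opposed to what the CPU supports, is to inspect the compiler's predefined macros. The sketch below assumes GCC or Clang, which define `__AVX__`, `__AVX2__`, `__FMA__` and `__F16C__` when the corresponding instruction sets are enabled by the compile flags. The function name is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists the instruction-set extensions this translation
// unit was compiled with, based on compiler-defined macros (compile-time only).
std::vector<std::string> compiled_isa_extensions() {
    std::vector<std::string> out;
#ifdef __AVX__
    out.push_back("AVX");
#endif
#ifdef __AVX2__
    out.push_back("AVX2");
#endif
#ifdef __FMA__
    out.push_back("FMA");
#endif
#ifdef __F16C__
    out.push_back("F16C");
#endif
    return out;
}
```

If this list contains an extension the CPU lacks, the compiler is free to emit instructions such as the `vpxor` seen in the disassembly, and the binary can fail with SIGILL.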
It core dumps when running without args and when running with the correct args as well.
System: Kernel Linux 5.15.0-75-generic x86_64 MATE 1.26.0 Intel® Xeon(R) CPU X5660 @ 2.80GHz × 24 llvmpipe (LLVM 15.0.6, 128 bits)
Branch: main, no CUDA. Attached file: _home_barry_evo_ggllm.cpp_build_bin_falcon_main.1000.txt