abishekmuthian opened 2 weeks ago
Same for me on NixOS:
> ./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10
██╗     ██╗      █████╗ ███╗   ███╗ █████╗ ███████╗██╗██╗     ███████╗
██║     ██║     ██╔══██╗████╗ ████║██╔══██╗██╔════╝██║██║     ██╔════╝
██║     ██║     ███████║██╔████╔██║███████║█████╗  ██║██║     █████╗
██║     ██║     ██╔══██║██║╚██╔╝██║██╔══██║██╔══╝  ██║██║     ██╔══╝
███████╗███████╗██║  ██║██║ ╚═╝ ██║██║  ██║██║     ██║███████╗███████╗
╚══════╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚══════╝╚══════╝
launching server...
error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x8 on homelab pid 3750201 tid 3750215
./llamafile-0.8.16
No error information
Linux Cosmopolitan 3.9.6 MODE=x86_64; #1-NixOS SMP PREEMPT_DYNAMIC Wed Sep 18 17:24:10 UTC 2024 homelab 6.6.52
RAX 0000000000000000 RBX 00007f9c6622e240 RDI 0000000000000000
RCX 0000000000000003 RDX 0000000000000000 RSI 00007f9d4add2c00
RBP 00007f9c6622d8a0 RSP 00007f9c6622d8a0 RIP 0000000000545066
R8 0000000000000002 R9 00007f9d4add10d8 R10 00007f9d4add2c00
R11 0000000000000040 R12 00007f9c6622f540 R13 00007f9c6622f210
R14 00007f9c6622e258 R15 00007f9c6622d8b0
TLS 00007f9b1c3fad00
XMM0 00007f9d4add2f9000007f9d4add2f90 XMM8 00000000a33cd03000000000a33cc6b0
XMM1 00000000000000000000000000000000 XMM9 00000000000000000000000000000000
XMM2 00000000000000000000000000000000 XMM10 00000000000000000000000000000000
XMM3 00007f9cb022ba500000002c04000000 XMM11 00000000000000000000000000000000
XMM4 000000000000000000007f9d41563b20 XMM12 00000000000000000000000000000000
XMM5 7c2031203d20323135585641207c2031 XMM13 00000000000000000000000000000000
XMM6 203d2032585641207c2030203d20494e XMM14 00000000000000000000000000000000
XMM7 4e565f585641207c2031203d20585641 XMM15 00000000000000000000000000000000
cosmoaddr2line /home/enol/sd-docker/llamafile/llamafile-0.8.16 545066 43390a 42d23a 4b7775 8ce154 8de754 9369e7
note: can't find addr2line on path or in ADDR2LINE
7f9c6622a7c0 545066 llama_n_ctx+6
7f9c6622d8a0 43390a llama_server_context::load_model(gpt_params const&)+396
7f9c6622d970 42d23a server_cli(int, char**)+3318
7f9c6622ff50 4b7775 server_thread(void*)+53
7f9c6622ff60 8ce154 PosixThread+132
7f9c6622ffb0 8de754 LinuxThreadEntry+36
7f9c6622ffd0 9369e7 sys_clone_linux+39
000000400000-000000a811e0 r-x-- 6660kb
000000a82000-0000031de000 rw--- 39mb
0006fe000000-0006fe001000 rw-pa 4096b
7f95528df000-7f9a96fffe00 r--s- 21gb
7f9b04000000-7f9b04200000 rw-pa 2048kb
7f9b04400000-7f9b06600000 rw-pa 34mb
7f9b1c000000-7f9b1cc00000 rw-pa 12mb
7f9b5fe00000-7f9b60000000 rw-pa 2048kb
7f9c6621c000-7f9c6621d000 ---pa 4096b
7f9c6621d000-7f9c66230000 rw-pa 76kb
7f9c662d5000-7f9c662d5fc0 rw-pa 4032b
7f9d415c4000-7f9d416ea400 rw-pa 1177kb
7f9d416eb000-7f9d4ace0f66 r--s- 150mb
7f9d4ace1000-7f9d4ade2000 rw-pa 1028kb
7fff8d694000-7fff8de94000 rw--- 8192kb
# 22'891'671'552 bytes in 15 mappings
./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10
Segmentation fault (core dumped)
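For what it's worth, the trace reads like a NULL llama_context dereference: RDI (the first argument) is 0 and the fault address is 0x8, so llama_n_ctx+6 is loading a field a few bytes into a context that llama_server_context::load_model apparently never got back from the loader once GPU offload was requested. Two hedged checks, assuming llamafile's documented flags (adjust if your build differs):

    # Confirm the CPU path still works (no layers offloaded):
    ./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 0

    # Rule the GPU module in or out by forcing the CPU backend:
    ./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf --gpu DISABLE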
Everything llamafile needs to compile and run on AMD is set via environment variables on my system.
Update: I just checked, and the latest version that works for me is llamafile-0.8.13; everything newer segfaults.
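In case it helps anyone else pin that release, a minimal sketch, assuming the usual llamafile-<version> asset name on the Mozilla-Ocho/llamafile releases page (check the page if the name differs):

    # Fetch and run the last known-good version (assumed asset URL):
    curl -LO https://github.com/Mozilla-Ocho/llamafile/releases/download/0.8.13/llamafile-0.8.13
    chmod +x llamafile-0.8.13
    ./llamafile-0.8.13 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10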
Contact Details
abishek.muthian@protonmail.com
What happened?
Thank you, Justine and team, for llamafile.
I have 16 GB of VRAM and 96 GB of RAM in my system (Fedora 41).
When I run gemma-2-27b-it.Q6_K.llamafile with -ngl 1, I get a segmentation fault. The model works fine when I don't use GPU offloading. I use the same model in Ollama all the time, where VRAM and RAM are shared, resulting in better performance. I'm told llama.cpp falls back to system RAM when the model is larger than VRAM; doesn't llamafile do the same?
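To illustrate what I mean: -ngl only bounds how many layers go to the GPU, and (as in llama.cpp) the layers that aren't offloaded stay in system RAM, so a ~22 GB Q6_K model should be splittable across 16 GB of VRAM. A sketch with my model file (the right layer count depends on how much VRAM is actually free):

    ./gemma-2-27b-it.Q6_K.llamafile -ngl 0    # no offload: works for me
    ./gemma-2-27b-it.Q6_K.llamafile -ngl 1    # offload a single layer: segfaults
    ./gemma-2-27b-it.Q6_K.llamafile -ngl 20   # partial offload: should still fit in 16 GB VRAM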
Version
llamafile v0.8.15
What operating system are you seeing the problem on?
Linux
Relevant log output