Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

Bug: Shared memory not working, results in Segfault #611

Open abishekmuthian opened 2 weeks ago

abishekmuthian commented 2 weeks ago

Contact Details

abishek.muthian@protonmail.com

What happened?

Thank you Justine and team for the llamafile.

I have 16GB VRAM and 96GB RAM in my system (Fedora 41).

When I run gemma-2-27b-it.Q6_K.llamafile with -ngl 1, I get a segmentation fault.

The model works fine when I don't use GPU offloading. I use the same model in Ollama all the time, where VRAM and RAM are shared, which gives better performance. I'm told llama.cpp falls back to system RAM when a model is larger than the available VRAM. Doesn't llamafile do the same?

Version

llamafile v0.8.15

What operating system are you seeing the problem on?

Linux

Relevant log output

[abishek@MacubexROGLinux llamafile]$ ./gemma-2-27b-it.Q6_K.llamafile -ngl 1

██╗     ██╗      █████╗ ███╗   ███╗ █████╗ ███████╗██╗██╗     ███████╗
██║     ██║     ██╔══██╗████╗ ████║██╔══██╗██╔════╝██║██║     ██╔════╝
██║     ██║     ███████║██╔████╔██║███████║█████╗  ██║██║     █████╗
██║     ██║     ██╔══██║██║╚██╔╝██║██╔══██║██╔══╝  ██║██║     ██╔══╝
███████╗███████╗██║  ██║██║ ╚═╝ ██║██║  ██║██║     ██║███████╗███████╗
╚══════╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚══════╝╚══════╝
 launching server...
error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x8 on MacubexROGLinux pid 157694 tid 157701
  ./gemma-2-27b-it.Q6_K.llamafile
  No error information
  Linux Cosmopolitan 3.9.4 MODE=x86_64; #1 SMP PREEMPT_DYNAMIC Tue Oct 22 20:11:15 UTC 2024 MacubexROGLinux 6.11.5-300.fc41.x86_64

RAX 0000000000000000 RBX 00007f645c196240 RDI 0000000000000000
RCX 0000000000000003 RDX 0000000000000000 RSI 00007f69995fec00
RBP 00007f645c1958a0 RSP 00007f645c1958a0 RIP 0000000000545096
 R8 0000000000000002  R9 00007f69995fd138 R10 00007f69995fec00
R11 0000000000000070 R12 00007f645c197540 R13 00007f645c197210
R14 00007f645c196258 R15 00007f645c1958b0
TLS 00007f5f10de4b00

XMM0  00007f69995fef9000007f69995fef90 XMM8  00000000000000000000000000000000
XMM1  00000000000000000000000000000000 XMM9  00000000000000000000000000000000
XMM2  00007f644880000000007f5f0f221000 XMM10 00000000000000000000000000000000
XMM3  00007f645400000000007f6450000000 XMM11 00000000000000000000000000000000
XMM4  00007ffd816f400000007f69996e4000 XMM12 00000000000000000000000000000000
XMM5  00007f645c1a800000007f645c183000 XMM13 00000000000000000000000000000000
XMM6  203d2032585641207c2031203d20494e XMM14 00000000000000000000000000000000
XMM7  4e565f585641207c2031203d20585641 XMM15 00000000000000000000000000000000

cosmoaddr2line /home/abishek/llamafile/gemma-2-27b-it.Q6_K.llamafile 545096 43390a 42d23a 4b77a5 8cd994 8ddf94 9359e7

0x0000000000545096: ?? ??:0
0x000000000043390a: ?? ??:0
0x000000000042d23a: ?? ??:0
0x00000000004b77a5: ?? ??:0
0x00000000008cd994: ?? ??:0
0x00000000008ddf94: ?? ??:0
0x00000000009359e7: ?? ??:0

000000400000-000000a801e0 r-x-- 6656kb
000000a81000-0000031dd000 rw--- 39mb
0006fe000000-0006fe001000 rw-pa 4096b
7f5f0de00000-7f5f0e000000 rw-pa 2048kb
7f5f10800000-7f5f14800000 rw-pa 64mb
7f5f149e0000-7f6448651e80 r--s- 21gb
7f645be00000-7f645c000000 rw-pa 2048kb
7f645c184000-7f645c185000 ---pa 4096b
7f645c185000-7f645c198000 rw-pa 76kb
7f645c1f2000-7f699947389b r--s- 21gb
7f6999475000-7f6999475fc0 rw-pa 4032b
7f6999486000-7f69995ac418 rw-pa 1177kb
7f69995ad000-7f69996de000 rw-pa 1220kb
7ffd80f16000-7ffd81716000 rw--- 8192kb
# 44'974'731'264 bytes in 14 mappings

./gemma-2-27b-it.Q6_K.llamafile -m gemma-2-27b-it.Q6_K.gguf -c 8192 -ngl 1 
Segmentation fault (core dumped)

OEvgeny commented 5 days ago

Same for me on NixOS:

> ./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10
██╗     ██╗      █████╗ ███╗   ███╗ █████╗ ███████╗██╗██╗     ███████╗
██║     ██║     ██╔══██╗████╗ ████║██╔══██╗██╔════╝██║██║     ██╔════╝
██║     ██║     ███████║██╔████╔██║███████║█████╗  ██║██║     █████╗
██║     ██║     ██╔══██║██║╚██╔╝██║██╔══██║██╔══╝  ██║██║     ██╔══╝
███████╗███████╗██║  ██║██║ ╚═╝ ██║██║  ██║██║     ██║███████╗███████╗
╚══════╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚══════╝╚══════╝
 launching server...
error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x8 on homelab pid 3750201 tid 3750215
  ./llamafile-0.8.16
  No error information
  Linux Cosmopolitan 3.9.6 MODE=x86_64; #1-NixOS SMP PREEMPT_DYNAMIC Wed Sep 18 17:24:10 UTC 2024 homelab 6.6.52

RAX 0000000000000000 RBX 00007f9c6622e240 RDI 0000000000000000
RCX 0000000000000003 RDX 0000000000000000 RSI 00007f9d4add2c00
RBP 00007f9c6622d8a0 RSP 00007f9c6622d8a0 RIP 0000000000545066
 R8 0000000000000002  R9 00007f9d4add10d8 R10 00007f9d4add2c00
R11 0000000000000040 R12 00007f9c6622f540 R13 00007f9c6622f210
R14 00007f9c6622e258 R15 00007f9c6622d8b0
TLS 00007f9b1c3fad00

XMM0  00007f9d4add2f9000007f9d4add2f90 XMM8  00000000a33cd03000000000a33cc6b0
XMM1  00000000000000000000000000000000 XMM9  00000000000000000000000000000000
XMM2  00000000000000000000000000000000 XMM10 00000000000000000000000000000000
XMM3  00007f9cb022ba500000002c04000000 XMM11 00000000000000000000000000000000
XMM4  000000000000000000007f9d41563b20 XMM12 00000000000000000000000000000000
XMM5  7c2031203d20323135585641207c2031 XMM13 00000000000000000000000000000000
XMM6  203d2032585641207c2030203d20494e XMM14 00000000000000000000000000000000
XMM7  4e565f585641207c2031203d20585641 XMM15 00000000000000000000000000000000

cosmoaddr2line /home/enol/sd-docker/llamafile/llamafile-0.8.16 545066 43390a 42d23a 4b7775 8ce154 8de754 9369e7

note: can't find addr2line on path or in ADDR2LINE
7f9c6622a7c0 545066 llama_n_ctx+6
7f9c6622d8a0 43390a llama_server_context::load_model(gpt_params const&)+396
7f9c6622d970 42d23a server_cli(int, char**)+3318
7f9c6622ff50 4b7775 server_thread(void*)+53
7f9c6622ff60 8ce154 PosixThread+132
7f9c6622ffb0 8de754 LinuxThreadEntry+36
7f9c6622ffd0 9369e7 sys_clone_linux+39

000000400000-000000a811e0 r-x-- 6660kb
000000a82000-0000031de000 rw--- 39mb
0006fe000000-0006fe001000 rw-pa 4096b
7f95528df000-7f9a96fffe00 r--s- 21gb
7f9b04000000-7f9b04200000 rw-pa 2048kb
7f9b04400000-7f9b06600000 rw-pa 34mb
7f9b1c000000-7f9b1cc00000 rw-pa 12mb
7f9b5fe00000-7f9b60000000 rw-pa 2048kb
7f9c6621c000-7f9c6621d000 ---pa 4096b
7f9c6621d000-7f9c66230000 rw-pa 76kb
7f9c662d5000-7f9c662d5fc0 rw-pa 4032b
7f9d415c4000-7f9d416ea400 rw-pa 1177kb
7f9d416eb000-7f9d4ace0f66 r--s- 150mb
7f9d4ace1000-7f9d4ade2000 rw-pa 1028kb
7fff8d694000-7fff8de94000 rw--- 8192kb
# 22'891'671'552 bytes in 15 mappings

./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10 
Segmentation fault (core dumped)

Everything llamafile needs to compile and run on AMD is set up through environment variables.

Update: I just checked, and the latest version that works for me is llamafile-0.8.13; everything newer results in a segfault.
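
One hedged reading of the symbolized backtrace above: a fault address of 0x8 with RDI = 0 at llama_n_ctx+6 is what a read through a null context pointer would look like, since RDI carries the first integer argument under the System V x86-64 calling convention. That would suggest context creation failed inside load_model before llama_n_ctx was called, but this is an inference from the report, not a confirmed diagnosis. The Python sketch below extracts the signal, fault address, and registers from a report in this format and flags that pattern. The function names and the 4096-byte threshold are made up for illustration and are not part of llamafile or Cosmopolitan.

```python
import re

# Header line of a Cosmopolitan-style crash report, e.g.
# "error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x8 on homelab pid ... tid ..."
HEADER_RE = re.compile(
    r"error: Uncaught (?P<signal>\w+) \((?P<code>\w+)\) at (?P<addr>0x[0-9a-fA-F]+)"
)
# General-purpose registers are printed as "NAME 16-hex-digits".
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2})\s+([0-9a-fA-F]{16})\b")

# Excerpt copied from the second report in this thread.
SAMPLE = """\
error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x8 on homelab pid 3750201 tid 3750215
RAX 0000000000000000 RBX 00007f9c6622e240 RDI 0000000000000000
RCX 0000000000000003 RDX 0000000000000000 RSI 00007f9d4add2c00
RBP 00007f9c6622d8a0 RSP 00007f9c6622d8a0 RIP 0000000000545066
 R8 0000000000000002  R9 00007f9d4add10d8 R10 00007f9d4add2c00
"""


def parse_crash_header(report):
    """Return signal name, signal code, and fault address, or None."""
    m = HEADER_RE.search(report)
    if m is None:
        return None
    return {"signal": m["signal"], "code": m["code"], "addr": int(m["addr"], 16)}


def parse_registers(report):
    """Map register names (RAX, RDI, R8, ...) to integer values."""
    return {name: int(value, 16) for name, value in REG_RE.findall(report)}


def looks_like_null_deref(report, threshold=4096):
    """Heuristic (an assumption, not a proof): SIGSEGV at a tiny address
    while the first-argument register RDI is zero."""
    header = parse_crash_header(report)
    if header is None or header["signal"] != "SIGSEGV":
        return False
    regs = parse_registers(report)
    return header["addr"] < threshold and regs.get("RDI") == 0


if __name__ == "__main__":
    print(parse_crash_header(SAMPLE))
    print(looks_like_null_deref(SAMPLE))
```

Both reports in this thread show the same fault address and a zero RDI, so the heuristic treats them alike. It cannot say why the pointer would be null.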