Open AngelaZhang3913 opened 1 month ago
You should be able to fix this by compiling with GGML_NO_LLAMAFILE.
cc: @jart
Could you use objdump -d llama-cli to copy and paste me the line of assembly code at the faulting address?
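For anyone who wants to narrow down the disassembly before pasting it, here is a minimal Python sketch that filters objdump -d output for lines touching zmm registers, which only exist with AVX-512. The function name find_zmm_lines and the SAMPLE text are hypothetical illustrations, not real output from this binary, and the filter does not replace looking up the exact faulting address.

```python
import re

# zmm0..zmm31 are the 512-bit registers introduced with AVX-512.
ZMM_OPERAND = re.compile(r"\bzmm(?:[12]?\d|3[01])\b")


def find_zmm_lines(disassembly):
    """Return the disassembly lines that reference a zmm register."""
    return [
        line.strip()
        for line in disassembly.splitlines()
        if ZMM_OPERAND.search(line)
    ]


# Hypothetical sample in the style of `objdump -d --no-show-raw-insn`;
# the addresses and surrounding instructions are made up for illustration.
SAMPLE = """\
  45e3b0: vmovups (%rdi),%ymm1
  45e3bc: vmovupd %zmm0,0x13(%rax)
  45e3c6: retq
"""

for hit in find_zmm_lines(SAMPLE):
    print(hit)
```

To use it on a real binary, one option is to save the output of objdump -d llama-cli to a file and pass the file contents to find_zmm_lines.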
It seems to have worked with GGML_NO_LLAMAFILE. Thanks a lot!
Same issue here with Ryzen 7 7735HS
cpuinfo:
CPUID is present
CPU Info for type #0:
------------------
arch : x86
purpose : general
vendor_str : `AuthenticAMD'
vendor id : 1
brand_str : `AMD Ryzen 7 7735HS with Radeon Graphics'
family : 15 (0Fh)
model : 4 (04h)
stepping : 1 (01h)
ext_family : 25 (19h)
ext_model : 68 (44h)
num_cores : 8
num_logical: 16
tot_logical: 16
affi_mask : 0x0000FFFF
L1 D cache : 32 KB
L1 I cache : 32 KB
L2 cache : 512 KB
L3 cache : 16384 KB
L4 cache : -1 KB
L1D assoc. : 8-way
L1I assoc. : 8-way
L2 assoc. : 8-way
L3 assoc. : 16-way
L4 assoc. : -1-way
L1D line sz: 64 bytes
L1I line sz: 64 bytes
L2 line sz : 64 bytes
L3 line sz : 64 bytes
L4 line sz : -1 bytes
L1D inst. : 8
L1I inst. : 8
L2 inst. : 8
L3 inst. : 1
L4 inst. : 0
SSE units : 256 bits (authoritative)
code name : `Ryzen 7 (Rembrandt)'
features : fpu vme de pse tsc msr pae mce cx8 apic mtrr sep pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht pni pclmul monitor ssse3 cx16 sse4_1 sse4_2 syscall movbe popcnt aes xsave osxsave avx mmxext nx fxsr_opt rdtscp lm lahf_lm cmp_legacy svm abm misalignsse sse4a 3dnowprefetch osvw ibs skinit wdt ts ttp tm_amd hwpstate constant_tsc fma3 f16c rdrand cpb aperfmperf avx2 bmi1 bmi2 sha_ni rdseed adx
The program tries to execute vmovupd %zmm0,0x13(%rax), which is not supported by my CPU model (it's an AVX512F instruction).
I think it would be better not to crash with an Illegal Instruction error when the CPU doesn't support an instruction that llamafile is trying to use. In @AngelaZhang3913's case, the CPU model mentioned is very old and doesn't seem to support either AVX2 or AVX512 according to Intel.
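As a rough illustration of the kind of up-front check suggested here, the Python sketch below compares the flags line of /proc/cpuinfo-style text (lscpu's Flags line has the same shape) against a list of required features. It is not how llama.cpp or llamafile detects CPU features; the function names, the REQUIRED list, and the shortened SAMPLE line are assumptions made for the example.

```python
def parse_cpu_flags(cpuinfo_text):
    """Collect feature names from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, required):
    """Return the required features that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


# Shortened, hypothetical flags line; a real one is much longer.
SAMPLE = "Flags: fpu sse sse2 avx f16c fma avx2 bmi2"

# Assumed feature set for a binary built with AVX-512 code paths.
REQUIRED = ["avx2", "avx512f"]

missing = missing_features(SAMPLE, REQUIRED)
if missing:
    print("CPU lacks required features:", ", ".join(missing))
```

On Linux, the same functions can be fed the contents of /proc/cpuinfo, so a wrapper script could print a readable message instead of letting the process die on an illegal instruction.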
Was this binary compiled on the host system? Chances are the binary was built somewhere else on a machine that has AVX512F and then copied over to the old computer.
I used the binary packaged by the arch OS: https://archlinux.org/packages/extra/x86_64/ollama/
I'm running into this when building the flake.nix on a remote builder. It is an impurity that shouldn't exist in the build.
I fixed this when using Nix by flipping GGML_NATIVE_DEFAULT in CMake. I believe everything is native by default now, but the logic is multiple levels deep between the Nix expression and the Makefiles and their _DEFAULT values, so it's hard to read. It's possible something was flipped by accident due to the rename of LLAMA_NATIVE -> GGML_NATIVE in the past months or so.
The Nix expression in .devops/nix/package (why put this stuff in a . directory?) says (cmakeBool "GGML_NATIVE" false), yet despite it setting GGML_NATIVE to false, I still had to flip GGML_NATIVE_DEFAULT, implying a logic error somewhere in the build scripts.
diff --git a/ggml/CMakeLists.txt b/ggml/CMakeLists.txt
index 7fe1661b..363413a9 100644
--- a/ggml/CMakeLists.txt
+++ b/ggml/CMakeLists.txt
@@ -53,7 +53,7 @@ endif()
 if (CMAKE_CROSSCOMPILING)
     set(GGML_NATIVE_DEFAULT OFF)
 else()
-    set(GGML_NATIVE_DEFAULT ON)
+    set(GGML_NATIVE_DEFAULT OFF)
 endif()
 
 # general
I'm also getting this error out-of-the-box.
Using NixOS on an i5-3230M (supports AVX but not AVX2), trying to run the flake with nix run.
What happened?
I'm trying to run llama-server using
./llama-server -m models/codellama-7b.Q4_K_M.gguf -c 2048
after building it. I'm getting an Illegal Instruction error message. The illegal instruction is
0x000000000045e3bc in void (anonymous namespace)::tinyBLAS<16, float __vector(16), float __vector(16), unsigned short, float, float>::gemm<5, 2>(long, long, long, long) ()
It seems to be failing on an AVX512F instruction. The instruction appears to be in the gemm function, which performs matrix multiplication in the tinyBLAS library.
Additional context: I've already tried doing
export CFLAGS="-march=native -mtune=native -mno-avx512f"
and
export CXXFLAGS="$CFLAGS"
but it didn't work. Another person commented that they were having the same issue as me, and I was recommended to make a bug report for this.
Here is my lscpu output:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Stepping: 1
CPU MHz: 2016.174
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4199.74
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear spec_ctrl intel_stibp flush_l1d
Name and Version
$ ./llama-server --version
version: 3384 (4e24cffd)
built with gcc (GCC) 8.3.0 for x86_64-pc-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output