antimatter15 / alpaca.cpp

Locally run an Instruction-Tuned Chat-Style LLM
MIT License

Problem with Macmini 2011 and 10.13 - "Illegal instruction:4" #190

Open fbrzvnrnd2 opened 1 year ago

fbrzvnrnd2 commented 1 year ago

Hi, I tried the chat on a 2014 MacBook with Mojave and it works, slowly, but it works. When I tried it on a 2011 Mac mini with 10.13 (to test the speed with more memory), the build succeeded, but when I started the chat I got an "Illegal instruction: 4" error. Is there something I can do?

calphool commented 1 year ago

I get the exact same error. I tracked down the exact line that's causing the problem. It's in ggml.c:

const float f = table_f32_f16[i] = GGML_COMPUTE_FP16_TO_FP32(ii);

Apparently GGML_COMPUTE_FP16_TO_FP32 expands to the _cvtsh_ss intrinsic.

For whatever reason, _cvtsh_ss causes the "Illegal instruction: 4". I see code that turns this define on when F16C is defined, which gets passed to the compiler because we're building on Darwin (macOS) (it's in the Makefile). When I remove that option, the code runs further, but I get a different error somewhere else.

Anybody have any ideas what's going on here? I'm running 10.15.7 (Catalina) with Xcode 12.4. My processor is an Intel i7, so I should have F16C support. Any help would be much appreciated.

themanyone commented 1 year ago

On an Intel i7 (Sandy Bridge), GGML_COMPUTE_FP16_TO_FP32 expands to _mm256_cvtph_ps, which gives a target-specific option mismatch at compile time in f16cintrin.h:52:1. f16cintrin.h is part of the GCC compiler headers.

I can get ggml to compile by turning off -std=c11 and -mavx (or by not using -mtune=sandybridge, which includes it), but then chat is very slow.

(As an aside, I did try the pre-compiled version found in alpaca-linux.zip, which gives a similar error at runtime, as does the gpt4all-lora-quantized-linux-x86 binary from the alpaca.cpp project on GitHub.)