kherud / java-llama.cpp

Java Bindings for llama.cpp - A Port of Facebook's LLaMA model in C/C++
MIT License

Prebuilt x86 linux libs use EVEX encoding for AVX instructions, causing SIGILL #25

Closed: AutonomicPerfectionist closed this issue 6 months ago

AutonomicPerfectionist commented 1 year ago

Attempting to use the pre-packaged libjllama.so on x86_64 Linux with a CPU that doesn't support AVX512 causes a SIGILL crash. I dumped the object code and found that an AVX2 instruction was encoded using the EVEX encoding scheme, which was introduced with the AVX512 extension. I don't have any CPUs that support AVX512, so I can't fully verify that this is the cause of the SIGILL crash, but it seems very likely. I think I've pinpointed the cause of this change to https://github.com/ggerganov/llama.cpp/pull/3273, so we probably need to add LLAMA_NATIVE=OFF for Linux x86 now.
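
For anyone who wants to reproduce the check, here is a minimal sketch (assuming a GNU toolchain; the library path is whatever your build produced). Plain VEX-encoded AVX/AVX2 instructions can only address registers 0–15, so any disassembly line that touches xmm16–xmm31, ymm16–ymm31, or a zmm register must be EVEX-encoded and will SIGILL on non-AVX512 CPUs:

```bash
# Minimal sketch: flag EVEX-only operands in the prebuilt library.
# VEX encoding (plain AVX/AVX2) can only name registers 0-15, so any
# hit on xmm16-31 / ymm16-31 / zmm* implies an EVEX-encoded instruction.
objdump -d libjllama.so \
  | grep -E 'zmm|[xy]mm(1[6-9]|2[0-9]|3[01])' \
  | head -n 20
```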

kherud commented 1 year ago

Thank you for the issue! I'm a bit hesitant to set LLAMA_NATIVE to OFF, since I think this disables both AVX and AVX2, which would be a significant hit to performance, even though both are widely supported. The CMake arguments are processed in build-args.cmake (copied from llama.cpp). LLAMA_AVX512 should be off by default and should also not be affected by LLAMA_NATIVE. It smells to me a bit like AVX512 is not the root cause of the problem, but I'll try disabling it and see if the problem still occurs.
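
For context, a rough sketch of how these options typically interact in llama.cpp builds of that era (option names from this thread; the exact defaults and structure of build-args.cmake may differ):

```cmake
# Rough sketch, not the actual build-args.cmake: LLAMA_NATIVE gates
# -march=native, which lets the compiler target the *build* machine's
# full ISA regardless of the granular LLAMA_AVX* switches.
option(LLAMA_NATIVE "llama: enable -march=native" ON)
option(LLAMA_AVX    "llama: enable AVX"           ON)
option(LLAMA_AVX2   "llama: enable AVX2"          ON)
option(LLAMA_AVX512 "llama: enable AVX512"        OFF)

if (LLAMA_NATIVE)
    # Overrides the switches below: on an AVX512 build host the compiler
    # may emit EVEX encodings even for "plain" AVX/AVX2 instructions.
    add_compile_options(-march=native)
else()
    if (LLAMA_AVX)
        add_compile_options(-mavx)
    endif()
    if (LLAMA_AVX2)
        add_compile_options(-mavx2)
    endif()
    if (LLAMA_AVX512)
        add_compile_options(-mavx512f)
    endif()
endif()
```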

AutonomicPerfectionist commented 1 year ago

There are options to enable AVX and AVX2 separately. While LLAMA_AVX512 is indeed disabled, enabling LLAMA_NATIVE passes the flag -march=native. When the build runs on a CPU with AVX512, the compiler may then generate code that uses the newer EVEX encoding scheme even for legacy AVX instructions, and that encoding is not supported on non-AVX512 CPUs. At least that's how I understand it; it's very confusing.
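
If that understanding is right, a portable x86_64 build would look roughly like this (a sketch using the flag names from this thread; newer llama.cpp versions have since renamed these options):

```bash
# Sketch: build portable x86_64 binaries by disabling -march=native
# while keeping the widely supported AVX/AVX2 code paths enabled.
cmake -B build \
    -DLLAMA_NATIVE=OFF \
    -DLLAMA_AVX=ON \
    -DLLAMA_AVX2=ON \
    -DLLAMA_AVX512=OFF
cmake --build build --config Release
```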

kherud commented 6 months ago

I just released version 3.0, which upgrades to the newest llama.cpp version. There was a huge amount of change, so I'm not sure whether this issue still applies. To reduce the number of old issues, I'll close this one for now, but feel free to re-open it if the problem still occurs.