intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Apache License 2.0

AVX-512 instructions on non-AVX-512 machine #675

Closed bearn01d closed 10 months ago

bearn01d commented 10 months ago

Quantization of Mistral fails for me with an illegal instruction error. Disassembling in gdb (disass) shows AVX-512 instructions such as vcvtusi2sd in use, even though this machine does not support AVX-512. I built from source as described here: https://github.com/intel/intel-extension-for-transformers/tree/ef2c4793c75d4950bfa36f1c1b7c2458bb3bb95a/intel_extension_for_transformers/llm/runtime/graph#how-to-use-python-script.

Edit: I have confirmed that the problem does not occur on an AMD Ryzen 4750U (Ubuntu 20.04, gcc-10), although the model output there is garbage (which probably has a different cause). In addition, on the i7-1365U, Fedora 39 with gcc-13 in Docker shows the same problem. So I suspect that some of the Intel-specific optimizations might be the root cause.

The setup is as follows:

DDEle commented 10 months ago

Thanks for your feedback and the detailed report. You are right: we accidentally applied GCC's AVX-512 target options to the whole compilation unit. We have located the problem, and it will be fixed in a few days.
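
For readers unfamiliar with this class of bug: when AVX-512 target options cover a whole compilation unit, the compiler is free to emit AVX-512 instructions in any function of that file, including code that runs before any CPU feature check. A minimal sketch of one common way to avoid this with GCC or Clang is shown below. It is not the project's actual fix; the function names (sum_scalar, sum_avx512, sum_dispatch) and the HAVE_X86_DISPATCH macro are hypothetical and chosen only for illustration. It restricts AVX-512 to a single function with a target attribute and selects that function at run time.

```cpp
#include <cstddef>

// Assumption: GCC or Clang on x86. On other compilers or architectures the
// sketch falls back to the scalar path only.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#include <immintrin.h>
#endif

// Baseline path: compiled with the file's default flags, so the compiler
// has no permission to use AVX-512 instructions here.
static double sum_scalar(const double* x, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += x[i];
    return s;
}

#ifdef HAVE_X86_DISPATCH
// AVX-512F is enabled for this one function only, not for the whole file.
__attribute__((target("avx512f")))
static double sum_avx512(const double* x, std::size_t n) {
    __m512d acc = _mm512_setzero_pd();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)            // 8 doubles per 512-bit vector
        acc = _mm512_add_pd(acc, _mm512_loadu_pd(x + i));
    double lanes[8];
    _mm512_storeu_pd(lanes, acc);
    double s = 0.0;
    for (double v : lanes) s += v;        // horizontal sum of the 8 lanes
    for (; i < n; ++i) s += x[i];         // leftover tail elements
    return s;
}
#endif

// Runtime dispatch: the AVX-512 function is called only if the CPU
// reports support for it, so older CPUs never execute those instructions.
double sum_dispatch(const double* x, std::size_t n) {
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx512f")) return sum_avx512(x, n);
#endif
    return sum_scalar(x, n);
}
```

With this layout, calling sum_dispatch on a machine without AVX-512 (such as the i7-1365U mentioned above) takes the scalar path. The same separation can also be achieved by placing AVX-512 kernels in their own source files and passing the AVX-512 flags only to those files.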

bearn01d commented 10 months ago

Thank you for your feedback!