intel / isa-l

Intelligent Storage Acceleration Library

AVX512 detection failed when cpu supports AVX512 #291

Open swhzzh opened 1 month ago

swhzzh commented 1 month ago

OS: Debian 9
GCC: 6.3
NASM: 2.12.01
CPU: Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz
ISA-L: 2.31

I have confirmed that my CPU supports AVX512 according to https://ark.intel.com/content/www/us/en/ark/products/215269/intel-xeon-silver-4314-processor-24m-cache-2-40-ghz.html. However, the AVX512 detection fails when building isa-l. I ran the detection code myself and got the following output:

    $ echo vinserti32x8 zmm0, ymm1, 1\; > tst.asm && nasm -f elf64 tst.asm && echo pass
    tst.asm:1: error: invalid combination of opcode and operands.

I would like to know why the detection failed, and whether I need to change anything in the OS kernel to enable AVX512. Thanks for answering!
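Note that the probe above only exercises the assembler (it asks NASM to encode an AVX512 instruction); it does not touch the CPU or the kernel. To check separately what a Linux kernel reports for the CPU, a minimal sketch like the following can help. It is not part of ISA-L: the helper names `parse_cpu_flags` and `avx512_report` are made up for illustration, and the sketch assumes a Linux system where `/proc/cpuinfo` has a `flags` line. It says nothing about which code path ISA-L selects.

```python
from pathlib import Path

# Flag names as they appear in the "flags" line of /proc/cpuinfo on Linux.
AVX512_FLAGS = ("avx512f", "avx512cd", "avx512bw", "avx512dq", "avx512vl")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so lines such as "vmx flags" are skipped.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def avx512_report(cpuinfo_text):
    """Map each AVX512 flag name to True/False depending on its presence."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in AVX512_FLAGS}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for name, present in avx512_report(path.read_text()).items():
            print(f"{name}: {'yes' if present else 'no'}")
    else:
        print("/proc/cpuinfo not found (this sketch assumes Linux)")
```

If `avx512f` is listed, the kernel already exposes AVX512 and the build failure has to come from somewhere else in the toolchain.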

pablodelara commented 1 month ago

Hi @swhzzh. Your NASM version is way too old. You should install NASM 2.13.03 at least.
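A quick way to check an installed NASM against the minimum suggested above is to parse the output of `nasm -v`. The following is only a sketch: the helper names are hypothetical, the 2.13.03 threshold is taken from this comment, and the regular expression assumes the usual `NASM version X.Y.Z ...` banner.

```python
import re
import shutil
import subprocess

# Minimum version suggested in this thread (2.13.03).
MIN_NASM = (2, 13, 3)


def parse_nasm_version(text):
    """Extract (major, minor, patch) from a 'NASM version X.Y[.Z]' banner."""
    match = re.search(r"version\s+(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    # A missing patch component (e.g. "2.16") is treated as 0.
    return tuple(int(g) if g is not None else 0 for g in match.groups())


def nasm_is_new_enough(text, minimum=MIN_NASM):
    """True if the banner text reports a version >= minimum."""
    version = parse_nasm_version(text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    if shutil.which("nasm") is None:
        print("nasm not found on PATH")
    else:
        banner = subprocess.run(
            ["nasm", "-v"], capture_output=True, text=True
        ).stdout
        verdict = "ok" if nasm_is_new_enough(banner) else "too old"
        print(banner.strip(), "->", verdict)
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "2.9" would sort after "2.13".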

swhzzh commented 1 month ago

> Hi @swhzzh. Your NASM version is way too old. You should install NASM 2.13.03 at least.

I installed NASM 2.16 and the check now passes, thanks! Also, I would like to know which xor_gen implementation is called at runtime. Is there any way to find out?

swhzzh commented 1 month ago

@pablodelara Hi, I benchmarked xor_gen 10+1 performance at different sizes. I can see that:

When the CPU L2 cache (1280K) can hold all data units (test length below 128K per unit), xor_gen is much faster with AVX512 enabled than without. For larger sizes, however, xor_gen is slower with AVX512 enabled than without.

Can you tell me why? Thanks!
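The cache threshold mentioned above can be checked with simple arithmetic. The sketch below assumes that "10+1" means ten source buffers plus one parity buffer of equal length (an assumption, not stated in the thread) and uses the 1280K L2 size quoted in the comment. The function names are hypothetical.

```python
KIB = 1024
L2_BYTES = 1280 * KIB  # L2 size quoted in the comment above


def working_set_bytes(unit_len, data_units=10, parity_units=1):
    """Total bytes touched by one XOR pass over all buffers."""
    return unit_len * (data_units + parity_units)


def fits_in_cache(unit_len, cache_bytes, data_units=10, parity_units=1):
    """True if every buffer fits in the cache at the same time."""
    return working_set_bytes(unit_len, data_units, parity_units) <= cache_bytes


if __name__ == "__main__":
    for unit_kib in (32, 64, 128, 256):
        total_kib = working_set_bytes(unit_kib * KIB) // KIB
        fits = fits_in_cache(unit_kib * KIB, L2_BYTES)
        print(f"{unit_kib}K per unit -> {total_kib}K total, fits in L2: {fits}")
```

Counting only the ten data units, 10 x 128K is exactly 1280K, which matches the threshold reported. Counting the parity buffer as well, 128K per unit gives 1408K, so under this assumption the working set already exceeds L2 slightly below 128K per unit.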