intel / isa-l_crypto

How to compile `isa-l_crypto` to use AVX instruction only? #81

Closed leiless closed 2 years ago

leiless commented 2 years ago

Hi all. My CPU is an i5-9400F (Intel® SSE4.1, Intel® SSE4.2, Intel® AVX2), which supports both AVX and AVX2.

However, I want to build isa-l_crypto so that it uses AVX only, not AVX2. How can I do that?

$ lscpu | grep -E "^(Flags|Model)"
Model:               158
Model name:          Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves arat umip md_clear arch_capabilities

https://www.intel.com/content/www/us/en/products/sku/190883/intel-core-i59400f-processor-9m-cache-up-to-4-10-ghz/specifications.html

gbtucker commented 2 years ago

You could modify the multi-binary dispatcher scripts so that they don't test for AVX2, or so that they run other versions instead. For example, remove the blocks of ~6 instructions labeled:

include/multibinary.asm: ;; Test for AVX2

But I don't suggest it. Note that isa-l_crypto sticks to "light" AVX2 instructions.
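As a rough illustration of what such a dispatcher does, here is a minimal C sketch, not the actual isa-l_crypto code. The bit positions are the architectural CPUID ones (AVX is leaf 1, ECX bit 28; AVX2 is leaf 7, EBX bit 5). The names `impl_t`, `pick_impl`, and `allow_avx2` are hypothetical and exist only for this example. A real dispatcher also checks OSXSAVE and the XCR0 state before trusting these bits, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CPUID feature bits. */
#define CPUID1_ECX_AVX  (1u << 28) /* leaf 1, ECX bit 28 */
#define CPUID7_EBX_AVX2 (1u << 5)  /* leaf 7 (subleaf 0), EBX bit 5 */

/* Hypothetical set of implementations the dispatcher can choose from. */
typedef enum { IMPL_BASE, IMPL_AVX, IMPL_AVX2 } impl_t;

/*
 * Hypothetical selection logic: start from the baseline version and
 * upgrade each time a feature test passes. Passing allow_avx2 = 0 models
 * "removing the AVX2 test block", so the choice stops at the AVX version.
 */
static impl_t pick_impl(uint32_t leaf1_ecx, uint32_t leaf7_ebx, int allow_avx2)
{
    impl_t best = IMPL_BASE;

    if (!(leaf1_ecx & CPUID1_ECX_AVX))
        return best;            /* no AVX: stay on the baseline version */
    best = IMPL_AVX;

    if (allow_avx2 && (leaf7_ebx & CPUID7_EBX_AVX2))
        best = IMPL_AVX2;       /* AVX2 reported and the test is enabled */

    return best;
}

/* Usage example with register values as an AVX2-capable CPU would report them. */
static void dispatch_example(void)
{
    uint32_t ecx = CPUID1_ECX_AVX;
    uint32_t ebx = CPUID7_EBX_AVX2;

    assert(pick_impl(ecx, ebx, 1) == IMPL_AVX2); /* normal dispatch */
    assert(pick_impl(ecx, ebx, 0) == IMPL_AVX);  /* AVX2 test removed */
}
```

The function takes the register values as plain arguments so that the selection logic can be exercised without executing `cpuid`.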

leiless commented 2 years ago

> Note that isa-l_crypto sticks to "light" AVX2 instructions.

Thanks! And what does this mean?

leiless commented 2 years ago

> You could modify the multi-binary dispatcher scripts to not test for it or run other versions instead. For example, remove the blocks of ~6 instructions labeled.
>
> include/multibinary.asm: ;; Test for AVX2
>
> But I don't suggest. Note that isa-l_crypto sticks to "light" AVX2 instructions.

diff --git a/include/reg_sizes.asm b/include/reg_sizes.asm
index 717dd05..f1fbcae 100644
--- a/include/reg_sizes.asm
+++ b/include/reg_sizes.asm
@@ -44,9 +44,9 @@
 %define FLAG_CPUID1_ECX_AESNI   (1<<25)
 %define FLAG_CPUID1_ECX_OSXSAVE (1<<27)
 %define FLAG_CPUID1_ECX_AVX     (1<<28)
-%define FLAG_CPUID1_EBX_AVX2    (1<<5)
+%define FLAG_CPUID1_EBX_AVX2    (0)

-%define FLAG_CPUID7_EBX_AVX2           (1<<5)
+%define FLAG_CPUID7_EBX_AVX2           (0)
 %define FLAG_CPUID7_EBX_AVX512F        (1<<16)
 %define FLAG_CPUID7_EBX_AVX512DQ       (1<<17)
 %define FLAG_CPUID7_EBX_AVX512IFMA     (1<<21)

FYI, this also seems to work for me.
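Whether zeroing the mask really disables AVX2 depends on how the dispatcher compares the CPUID result against it. The C sketch below models the two common check styles. Which one `include/multibinary.asm` uses is not shown in this thread, so treat that as an assumption to verify. The helper names `any_bit_set` and `all_bits_set` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models a `test reg, mask` check followed by a branch on "non-zero". */
static int any_bit_set(uint32_t reg, uint32_t mask)
{
    return (reg & mask) != 0;
}

/* Models an `and reg, mask` followed by `cmp reg, mask` (all bits required). */
static int all_bits_set(uint32_t reg, uint32_t mask)
{
    return (reg & mask) == mask;
}

/* Shows how each style reacts when the mask is redefined to 0. */
static void mask_example(void)
{
    uint32_t ebx = 1u << 5; /* leaf 7 EBX with the AVX2 bit set */

    /* Original mask: both styles report the feature. */
    assert(any_bit_set(ebx, 1u << 5) == 1);
    assert(all_bits_set(ebx, 1u << 5) == 1);

    /* Zero mask: the test-style check can never pass ... */
    assert(any_bit_set(ebx, 0u) == 0);
    /* ... but the and/cmp-style check always passes. */
    assert(all_bits_set(ebx, 0u) == 1);
}
```

If the dispatcher used the and/cmp style, a zero mask would make the AVX2 path look available on every CPU, so it is worth confirming which function version actually runs after applying the patch.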