RubyCrypto / x25519

Public key cryptography library for Ruby providing the X25519 Diffie-Hellman function
https://cr.yp.to/ecdh.html

[v1.0.9] Cannot compile gem on intel ( > 4th gen) on gcc < 4.9.0 #27

Closed alx75 closed 1 year ago

alx75 commented 2 years ago

The problem is that my CPU supports the optimized x25519 code path (I use an Intel CPU newer than 4th gen [1]), so x25519_precomputed is compiled with -march=haswell since https://github.com/RubyCrypto/x25519/pull/25.

However, on gcc < 4.9.0 my arch is detected as core-avx2, and the build does not work with "-march=haswell" (error: bad value (haswell) for -march= switch).

OS: CentOS Linux release 7.9.2009

GCC: gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (version shipped with the OS)

[1] CPU: Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz

The obvious workaround is to update to gcc >= 4.9.0 :) or to use x25519 v1.0.8. However, it would be nice if this gem could support compiling on gcc < 4.9.0 for Intel 4th gen and above.

tarcieri commented 2 years ago

I'm not sure how to support this other than introspecting the compiler's available march flags which sounds like a fraught endeavor
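For illustration only, the probing idea could be sketched as below. This is not the gem's actual build script: `pick_march` and `MARCH_CANDIDATES` are hypothetical names, and treating core-avx2 as the pre-4.9.0 spelling of haswell is an assumption based on the error report above. The probe is passed in as a block so the selection logic stays independent of any particular compiler.

```ruby
# Hypothetical helper, not part of the x25519 gem.
# Candidate -march values, newest spelling first. Assumption: gcc < 4.9.0
# knows this architecture as "core-avx2" rather than "haswell".
MARCH_CANDIDATES = %w[haswell core-avx2].freeze

# Returns the first "-march=..." flag the probe accepts, or nil if none is.
# The block receives a flag string and must return true when the compiler
# accepts it.
def pick_march(candidates = MARCH_CANDIDATES, &accepts)
  name = candidates.find { |arch| accepts.call("-march=#{arch}") }
  name && "-march=#{name}"
end

# Assumed wiring inside an extconf.rb, using mkmf's try_cflags as the probe:
#   require "mkmf"
#   flag = pick_march { |f| try_cflags(f) }
#   flag ? ($CFLAGS << " #{flag}") : warn("no suitable -march value accepted")
```

When no candidate is accepted the helper returns nil, which leaves room for the caller to skip the optimized backend instead of failing the build.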

alx75 commented 2 years ago

Thanks for the feedback. I'm not a C developer, so I'm not really familiar with C compiler options.

From what I understand you decided to compile with haswell instruction set to avoid failure with AVX-512. Isn't it possible to pass a flag to the compiler so that it does not use the instructions that were causing the issue?

Another drawback of setting the arch to haswell is that we won't have the ADCX instruction on 6th gen. It's maybe not a big deal, but the code is no longer optimized for 6th gen.

Maybe we could fall back to ref10 with a warning if gcc < 4.9.0?
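A rough sketch of what such a fallback could look like, assuming the build script has the compiler's version banner available as a string. `gcc_version`, `choose_backend`, and the `:ref10` / `:precomputed` symbols are hypothetical names used for illustration; the gem's real extconf.rb may be organized quite differently.

```ruby
require "rubygems" # Gem::Version; normally already loaded

# Hypothetical threshold taken from this issue: -march=haswell needs gcc >= 4.9.0.
MIN_GCC_FOR_HASWELL = Gem::Version.new("4.9.0")

# Extracts the version from a banner such as
# "gcc version 4.8.5 20150623 (Red Hat 4.8.5-44)". Returns nil if not found.
def gcc_version(version_output)
  match = version_output[/\bgcc version (\d+(?:\.\d+)+)/i, 1]
  match && Gem::Version.new(match)
end

# Picks the portable backend on old gcc and the optimized one otherwise.
# Assumption: an unrecognized compiler banner keeps the optimized backend.
def choose_backend(version_output)
  version = gcc_version(version_output)
  if version && version < MIN_GCC_FOR_HASWELL
    warn "gcc #{version} < #{MIN_GCC_FOR_HASWELL}: falling back to ref10"
    :ref10
  else
    :precomputed
  end
end
```

With the banner from the report above, `choose_backend("gcc version 4.8.5 20150623 (Red Hat 4.8.5-44)")` selects the ref10 path and prints the warning to stderr.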

If there is no way to make this work on every arch and gcc version, maybe it's worth mentioning in the README that v1.0.9 expects gcc version >= 4.9.0.

tarcieri commented 2 years ago

> From what I understand you decided to compile with haswell instruction set to avoid failure with AVX-512.

That's not quite what the problem was. See #22.

This gem leverages runtime feature detection. However, prior to #25, the resulting binaries contained code paths outside the ones gated on runtime CPU feature detection. Those code paths depended on CPU features that may be unavailable on the machine running the binary, which made the resulting binaries non-relocatable.

While that specific instance involved memcpy leveraging AVX-512, it could be any CPU feature available on the CPU on which the binary is compiled which isn't available on the CPU it's relocated to.

> If there is no way to make this work on every arch and gcc version, maybe it's worth mentioning in the README that v1.0.9 expects gcc version >= 4.9.0.

I'd be fine with a PR adding that documentation.