homenc / HElib

HElib is an open-source software library that implements homomorphic encryption. It supports the BGV scheme with bootstrapping and the Approximate Number CKKS scheme. HElib also includes optimizations for efficient homomorphic evaluation, focusing on effective use of ciphertext packing techniques and on the Gentry-Halevi-Smart optimizations.
https://homenc.github.io/HElib
Other
3.11k stars 760 forks source link

PGFFT: Require __FMA__ directive for FMA operations #428

Closed dubek closed 3 years ago

dubek commented 3 years ago

When compiling in a VirtualBox VM, AVX2 instructions are supported but FMA instructions are not. In such case, compiling PGFFT.cpp fails with:

/usr/lib/gcc/x86_64-linux-gnu/9/include/fmaintrin.h:239:1: error: inlining failed in call to always_inline '__m256d_mm256_fmaddsub_pd(__m256d, __m256d, __m256d)': target specific option mismatch
  239 | _mm256_fmaddsub_pd (__m256d __A, __m256d __B, __m256d __C)
      | ^~~~~~~~~~~~~~~~~~
/home/user/work/HElib-2.0.0/src/PGFFT.cpp:330:51: note: called from here
  330 | { return _mm256_fmaddsub_pd(a.data, b.data, c.data); }
      |

This fix narrows the detection so that only when both __AVX2__ and __FMA__ are defined then the special FMA implementation will be used.

Note that I didn't replace HAVE_AVX2 to HAVE_AVX2_AND_FMA throughout the file.

Fix #426

Signed-off-by: Dov Murik dov.murik1@il.ibm.com

g2flyer commented 3 years ago

Works for my setup but might not work for a VirtualBox host having AVX but not AVX2? Or put differently, wouldn't the same treatment applied to HAVE_AVX2 also be better applied as well to HAVE_AVX?

dubek commented 3 years ago

Thanks @g2flyer for verifying.

I think that all the FMA operations are called inside a #ifdef HAVE_AVX2 section; that was my idea for a minimal change (but maybe I missed something). You might be right. I'm waiting to hear the maintainers' opinion.

victorshoup commented 3 years ago

It looks like this should be OK. From what I read online, it seems like this it should have been OK as it was... see https://stackoverflow.com/questions/16348909/how-do-i-know-if-i-can-compile-with-fma-instruction-sets

Indeed, according to that link, with the proposed patch, it looks like it will not detect FMA in Windows C++. But I don't think we are even targeting that platform.

This also raises the question as to whether I should also patch my copy of PGFFT. I'll think about it.

g2flyer commented 3 years ago

Thanks @g2flyer for verifying.

I think that all the FMA operations are called inside a #ifdef HAVE_AVX2 section; that was my idea for a minimal change (but maybe I missed something). You might be right. I'm waiting to hear the maintainers' opinion.

@dubek sorry, you are correct. I misread code and wrongly thought the FMA code was conditioned on USE_PD4.

faberga commented 3 years ago

The changes proposed in this PR have passed all regression tests on the Intel Haswell and Ivy Bridge platforms with bare metal Linux (no virtualization environments such as VirtualBox and/or VMWare), and also on Intel Haswell Mac OS. As such, we can only assert that the proposed changes do not cause any regression on the following build&test platforms:

Linux on Intel Haswell and Ivy Bridge:

MacOS on Intel Haswell:

See https://github.com/homenc/HElib/blob/master/INSTALL.md for a full list of the dependencies for each build tested.

jlhcrawford commented 3 years ago

The proposed changes do not cause any regression on Mainframe Z14 Bare Metal Linux - Ubuntu 18.04, Compiler GCC.

faberga commented 3 years ago

Successful tests reported on: