algebraic-solving / msolve

Library for Polynomial System Solving through Algebraic Methods
https://msolve.lip6.fr
GNU General Public License v2.0

Detect AVX2 support at runtime #71

Open jamesjer opened 1 year ago

jamesjer commented 1 year ago

Linux distributions must build for the lowest common denominator CPU. For the Fedora Linux distribution, the original x86_64 is still supported, meaning we cannot build msolve with AVX2 support. Would you consider detecting AVX2 support at runtime instead of at compile time?

One way that could be done is to add this code somewhere in src/neogb:

/* check for AVX2 availability */
#ifdef __amd64__
#include <cpuid.h>
int have_avx2;

static void __attribute__((constructor))
set_avx2_flag(void)
{
  unsigned int eax, ebx, ecx, edx;
  /* leaf 7 is subleaf-indexed: query subleaf 0, where EBX holds the AVX2 bit */
  have_avx2 = __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) && (ebx & bit_AVX2) != 0;
}
#endif

That works for gcc and clang. If you want to support other compilers, the code might get a little more complex. With have_avx2 available, code like this:

#ifdef HAVE_AVX2
  foo;
#else
  bar;
#endif

would be transformed into this:

#ifdef __amd64__
  if (have_avx2) {
    foo;
  } else
#endif
  {
    bar;
  }
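For reference, here is a minimal, compilable sketch of that transformation. It is an illustration under assumptions, not msolve code: `sum_ints`, `sum_ints_foo`, `sum_ints_bar` and the `CAN_PROBE_AVX2` macro are hypothetical names, and the two branches are plain C stand-ins for "foo" and "bar". It only tests the CPUID feature bit, as above, and does not verify that the operating system saves the YMM registers.

```c
#include <stddef.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define CAN_PROBE_AVX2 1   /* hypothetical macro, not part of msolve */
#endif

/* 1 if the CPU reports AVX2, 0 otherwise; filled in before main() runs. */
int have_avx2 = 0;

#ifdef CAN_PROBE_AVX2
static void __attribute__((constructor))
set_avx2_flag(void)
{
  unsigned int eax, ebx, ecx, edx;
  /* Leaf 7 is indexed by subleaf, so request subleaf 0 explicitly. */
  have_avx2 = __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)
              && (ebx & bit_AVX2) != 0;
}

/* Stand-in for the "foo" branch (where the AVX2 code would live). */
static long sum_ints_foo(const int *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++) total += v[i];
  return total;
}
#endif

/* Stand-in for the "bar" branch (the portable fallback). */
static long sum_ints_bar(const int *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++) total += v[i];
  return total;
}

/* Same shape as the transformed snippet: pick a branch at run time. */
long sum_ints(const int *v, size_t n)
{
#ifdef CAN_PROBE_AVX2
  if (have_avx2) {
    return sum_ints_foo(v, n);
  } else
#endif
  {
    return sum_ints_bar(v, n);
  }
}
```

On targets other than x86_64 the flag stays 0 and only the fallback branch is compiled, so callers of `sum_ints` do not need any conditional code of their own.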

On x86_64 platforms, -mavx2 could then always be passed to the compiler, since the AVX2 code is not executed if the CPUID check indicates the CPU doesn't support AVX2. That would let you throw away most or all of several files in the m4 directory.
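One caveat with a global flag is that the compiler may then also emit AVX2 instructions in the fallback path, for example through auto-vectorization. A possible alternative, sketched below under assumptions, is to enable AVX2 only for the functions that need it with the gcc/clang `target` attribute, and to test for support with `__builtin_cpu_supports`. The names `sum_u32`, `sum_u32_avx2`, `sum_u32_scalar` and `HAVE_AVX2_VARIANT` are hypothetical and not taken from msolve.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX2_VARIANT 1   /* hypothetical macro, not part of msolve */

/* Only this function is compiled with AVX2 enabled. */
__attribute__((target("avx2")))
static uint32_t sum_u32_avx2(const uint32_t *v, size_t n)
{
  __m256i acc = _mm256_setzero_si256();
  size_t i = 0;
  /* Add eight 32-bit lanes per iteration (wrapping arithmetic). */
  for (; i + 8 <= n; i += 8) {
    acc = _mm256_add_epi32(acc, _mm256_loadu_si256((const __m256i *)(v + i)));
  }
  /* Reduce the eight lanes, then handle the remaining tail elements. */
  uint32_t lanes[8];
  _mm256_storeu_si256((__m256i *)lanes, acc);
  uint32_t total = 0;
  for (int k = 0; k < 8; k++) total += lanes[k];
  for (; i < n; i++) total += v[i];
  return total;
}
#endif

/* Portable fallback, compiled with the baseline instruction set. */
static uint32_t sum_u32_scalar(const uint32_t *v, size_t n)
{
  uint32_t total = 0;
  for (size_t i = 0; i < n; i++) total += v[i];
  return total;
}

/* Dispatch at run time; both variants return the same wrapped sum. */
uint32_t sum_u32(const uint32_t *v, size_t n)
{
#ifdef HAVE_AVX2_VARIANT
  if (__builtin_cpu_supports("avx2")) {
    return sum_u32_avx2(v, n);
  }
#endif
  return sum_u32_scalar(v, n);
}
```

With this layout the rest of the translation unit is still built for the baseline x86_64, so no -mavx2 flag is needed on the command line.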

I can open a PR if you like the idea.

dimpase commented 2 months ago

@jamesjer - is it hard to have 2 flavours of the package on Fedora, one with AVX2 support, and one without? And when the package gets installed on a box, dnf knows about the AVX2 support and can pull the correct version?

jamesjer commented 2 months ago

We have had some discussions about doing that on the Fedora mailing list. So far the idea has failed to gain traction. It is technically doable, but the distribution does not currently support that approach.

dimpase commented 2 months ago

I thought something like this is done for openblas, but perhaps I'm mixing this up.

jamesjer commented 2 months ago

I took a look at the openblas spec file. I don't see anything of the sort happening. In fact, I took a peek inside the openblas 0.3.26 tarball, and they're doing runtime CPU detection with cpuid, just as I'm proposing here. :-) Is that approach interesting at all?

dimpase commented 2 months ago

I heard that SIMDe is a viable approach to emulate AVX etc. if these are not available, although I've no idea how this can play out on the level of binaries. https://github.com/simd-everywhere/simde