samtools / htscodecs

Custom compression for CRAM and others.
Other
30 stars 18 forks source link

Add cpuid checks behind an HAVE_CPUID check. #116

Closed jkbonfield closed 6 months ago

jkbonfield commented 7 months ago

Fixes #115.

Testing compiler versions is tricky, especially given some vendors wrap up standard compilers as their own, but change version numbers (Apple, Intel) and compilers also often masquerade as another (eg setting GNUC).

Instead we just check if it links. Note there's no guard around this here for x86-64, but that's done in the code itself. (In theory we may have cpuid on another architecture that we wish to use at some point.)

jkbonfield commented 7 months ago

Problems still to consider.

How does this work from htslib, which although having this as a submodule doesn't run the configure script ind instead builds the source direct.

  1. Htslib's ./configure - it can just copy this logic and add HAVE_CPUID into its own config.h
  2. Htslib's Makefile (configureless) could just copy the existing ifdef magic to the config.h stub it creates, so it replicates current behaviour (with the same limitations on portability). Or we just define it to be true and accept if it fails then don't do that - run configure!

Htslib builds from source have htscodecs as a submodule, so both can update in sync. Htslib can also link against an external htscodecs library, but the configure issue is then solved there already, so it's not a problem.

jmarshall commented 6 months ago

I don't have an x86 machine handy to check on, but could this just be done with the built-in generic tests? e.g.

AC_CHECK_DECLS([__get_cpuid_max, __cpuid_count], [], [], [[#include <cpuid.h>]])
jkbonfield commented 6 months ago

Good thought. Possibly. It'd need validating whether it also solves the problem on the Mac, but given it was a compiler warning about missing prototype as well as a link error it seems likely. I'll try it out.

jkbonfield commented 6 months ago

So... pros and cons. The configure.ac is simpler: just 1 line as you suggest. However the code that uses it is now ugly, made more so by the 0/1 nature rather than defined/not-defined as used elsewhere in autoconf.

#if defined(__x86_64__) && \
    defined(HAVE_DECL___CPUID_COUNT)   && HAVE_DECL___CPUID_COUNT && \
    defined(HAVE_DECL___GET_CPUID_MAX) && HAVE_DECL___GET_CPUID_MAX
...

Plus these will need to be added to the htslib Makefile too for when not using configure.

Thoughts?

jkbonfield commented 6 months ago

I added those changes as a separate commit anyway, so we can either squash them together which means we get the new mechanism, or drop that commit and keep my previous method. Both appear to work fine under linux. I'll validate MinGW

jmarshall commented 6 months ago

Thoughts?

If you fill in the actions-if-found and/or actions-if-not-found arguments, you can have your own simpler choice of macros to be defined instead of getting the default actions.

jkbonfield commented 6 months ago

I did initially try as you suggest, but the if-found action is called for each in turn, so it ends up as an OR instead of an AND operation which is what I need. You can do it with two queries, to create two simpler looking macros, which also means it can be turned into a if-defined query rather than "defined and !zero" check, but then you're just doubling up code somewhere else so it doesn't feel like a big win and is non-standard to boot.

Anyway, given I cannot validate any of this as fixing the initial issue, I'll wait for feedback from @ryandesign.

daviesrob commented 6 months ago

I've tested this and can confirm that it fixes the build on MacOS X 10.7 (Lion) with a contemporary xcode, all running in a UTM virtual machine.

configure reports:

checking whether __get_cpuid_max is declared... no
checking whether __cpuid_count is declared... no
checking C compiler flags needed for sse4.1... unsupported
checking C compiler flags needed for avx2... unsupported
checking C compiler flags needed for avx512f... unsupported

which I guess is all we can expect with the missing declarations.

jkbonfield commented 6 months ago

What's the plan for htslib?

We can put in a similar autoconf rule easily enough as it's minimal code to add. There's the Makefile though for when configure isn't used. If we do nothing it should continue to build, just defaulting without optimised implementations. I'm not sure I like that (but frankly I've never liked our ability to build without first running configure either as it's just a needless pitfall). Instead I would advise hard coding the macros to be "on", so the default non-configured is just to assume this stuff works. That's a reasonable assumption which should hold true everywhere given it still has an x86-64 CPU check anyway.

daviesrob commented 6 months ago

I expect htslib make will probably end up setting these on x86_64. It makes quite a few assumptions already, so I think this would be reasonable given that we can let configure override it if necessary.

jkbonfield commented 6 months ago

My point was we don't need to have x86_64 checks in the Makefile as they're already in the source files. We can just define these to 1 on all platforms and it'll have no consequences elsewhere. We don't need extra fragile or complex code for platform detection when not using autoconf.