easybuilders / easybuild-easyconfigs

A collection of easyconfig files that describe which software to build using which build options with EasyBuild.
https://easybuild.io
GNU General Public License v2.0

GMP doesn't respect optarch #5563

Open SethosII opened 6 years ago

SethosII commented 6 years ago

There seems to be a problem with GMP. A reproducible crash occurs when using curl (/usr/bin/curl) to fetch a website over HTTPS after loading GMP/6.1.1-foss-2016b:

> module load GMP/6.1.1-foss-2016b
> ldd /usr/bin/curl
        libcurl-gnutls.so.4 => /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007fad64df4000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fad64bda000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fad649bd000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fad645f3000)
        libidn.so.11 => /usr/lib/x86_64-linux-gnu/libidn.so.11 (0x00007fad643c0000)
        librtmp.so.1 => /usr/lib/x86_64-linux-gnu/librtmp.so.1 (0x00007fad641a4000)
        libnettle.so.6 => /usr/lib/x86_64-linux-gnu/libnettle.so.6 (0x00007fad63f6e000)
        libgnutls.so.30 => /usr/lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007fad63c3e000)
        libgssapi_krb5.so.2 => /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007fad639f4000)
        liblber-2.4.so.2 => /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007fad637e5000)
        libldap_r-2.4.so.2 => /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007fad63594000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fad65290000)
        libhogweed.so.4 => /usr/lib/x86_64-linux-gnu/libhogweed.so.4 (0x00007fad63361000)
        libgmp.so.10 => /easybuild/16.04/software/GMP/6.1.1-foss-2016b/lib/libgmp.so.10 (0x00007fad65421000) # <- the library from the module is used here
        libp11-kit.so.0 => /usr/lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007fad630fd000)
        libtasn1.so.6 => /usr/lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007fad62eea000)
        libkrb5.so.3 => /usr/lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007fad62c18000)
        libk5crypto.so.3 => /usr/lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007fad629e9000)
        libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007fad627e5000)
        libkrb5support.so.0 => /usr/lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007fad625da000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fad623bf000)
        libsasl2.so.2 => /usr/lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007fad621a4000)
        libgssapi.so.3 => /usr/lib/x86_64-linux-gnu/libgssapi.so.3 (0x00007fad61f63000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fad61c5a000)
        libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007fad61a52000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fad6184e000)
        libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007fad6164a000)
        libheimntlm.so.0 => /usr/lib/x86_64-linux-gnu/libheimntlm.so.0 (0x00007fad61441000)
        libkrb5.so.26 => /usr/lib/x86_64-linux-gnu/libkrb5.so.26 (0x00007fad611b7000)
        libasn1.so.8 => /usr/lib/x86_64-linux-gnu/libasn1.so.8 (0x00007fad60f15000)
        libhcrypto.so.4 => /usr/lib/x86_64-linux-gnu/libhcrypto.so.4 (0x00007fad60ce2000)
        libroken.so.18 => /usr/lib/x86_64-linux-gnu/libroken.so.18 (0x00007fad60acc000)
        libwind.so.0 => /usr/lib/x86_64-linux-gnu/libwind.so.0 (0x00007fad608a3000)
        libheimbase.so.1 => /usr/lib/x86_64-linux-gnu/libheimbase.so.1 (0x00007fad60694000)
        libhx509.so.5 => /usr/lib/x86_64-linux-gnu/libhx509.so.5 (0x00007fad60449000)
        libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007fad60174000)
        libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007fad5ff3c000)
> curl https://github.com
Illegal instruction (core dumped)
> strace curl https://github.com
# last lines from strace
recvfrom(4, 0x55deaaaaa38b, 5, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLIN}], 1, 1000)  = 1 ([{fd=4, revents=POLLIN}])
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=4, revents=POLLIN|POLLRDNORM}])
recvfrom(4, "\26\3\3\0p", 5, 0, NULL, NULL) = 5
recvfrom(4, "\2\0\0l\3\3\347o\376\256E\210\27dw\274\323ZNYS\260\262\247L\367\317\262\374un\276"..., 112, 0, NULL, NULL) = 112
recvfrom(4, "\26\3\3\fD", 5, 0, NULL, NULL) = 5
recvfrom(4, "\v\0\f@\0\f=\0\7}0\202\7y0\202\6a\240\3\2\1\2\2\20\v\375\264\t\n\327\265"..., 3140, 0, NULL, NULL) = 3140
brk(0x55deaaaf1000)                     = 0x55deaaaf1000
recvfrom(4, "\26\3\3\1M", 5, 0, NULL, NULL) = 5
recvfrom(4, "\f\0\1I\3\0\27A\4y\2\277+;\272|\324\335\230\\\261\310\266\310\22\322\310\357\303\3552\247"..., 333, 0, NULL, NULL) = 333
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x7f7ed0191b04} ---
+++ killed by SIGILL (core dumped) +++

I guess there may be problems with other software using GMP; this is the first one I have encountered. Has anyone else run into problems with GMP?

OS: Ubuntu 16.04
system libgmp version: 6.1.0 (so I can't see why GMP 6.1.1 from EasyBuild shouldn't work)
curl version: 7.47.0

boegel commented 6 years ago

@SethosII The Illegal instruction error strongly suggests that GMP was built for a different processor architecture than the one you are currently running on.
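
One way to check this suspicion on each node is to compare the CPU feature flags the kernel reports with the instruction set extensions the build may have used. The following is a minimal sketch, not part of EasyBuild: the helper names `cpu_flags` and `missing_flags` are hypothetical, and the list of flags to test is an assumption to adapt to your hardware.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # All cores normally report the same flags, so the first entry is enough.
            return set(value.split())
    return set()


def missing_flags(required, cpuinfo_text):
    """Return, sorted, the required flags that this CPU does not advertise."""
    return sorted(set(required) - cpu_flags(cpuinfo_text))
```

On a Linux node, calling `missing_flags(["avx", "avx2", "bmi2", "fma"], open("/proc/cpuinfo").read())` lists which of those extensions (present on Haswell, absent on Westmere) the host lacks. A non-empty result on a node where the binary crashes is consistent with a build that targeted a newer CPU.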

SethosII commented 6 years ago

@boegel Thank you for the suggestion. I compiled it with --optarch=march=corei7, which corresponds to our oldest architecture, but it doesn't work there. Here is an overview:

| processor  | architecture | works? | compiled here? |
|------------|--------------|--------|----------------|
| E5645      | westmere     | no     | no             |
| E5-2620 v2 | ivybridge    | no     | no             |
| E5-2630 v3 | haswell      | yes    | yes            |

The naming scheme for the march options changed in GCC 5 (from this to that), but the old options are still accepted, and westmere, the replacement for corei7, is the same, only with some additional optimizations:

> module load foss/2016b
> diff -u <(echo | gcc -dM -E - -march=corei7) <(echo | gcc -dM -E - -march=westmere)
--- /dev/fd/63  2017-12-21 08:11:31.073452904 +0100
+++ /dev/fd/62  2017-12-21 08:11:31.077452881 +0100
@@ -103,6 +103,7 @@
 #define __nehalem__ 1
 #define __INT_FAST64_TYPE__ long int
 #define __DBL_MIN__ ((double)2.22507385850720138309e-308L)
+#define __PCLMUL__ 1
 #define __tune_corei7__ 1
 #define __LP64__ 1
 #define __DECIMAL_BID_FORMAT__ 1
@@ -237,6 +238,7 @@
 #define __LDBL_DENORM_MIN__ 3.64519953188247460253e-4951L
 #define __INT16_C(c) c
 #define __STDC__ 1
+#define __AES__ 1
 #define __PTRDIFF_TYPE__ long int
 #define __ATOMIC_SEQ_CST 5
 #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 1

If the option were unknown, gcc would report an error:

> echo | gcc -dM -E - -march=wrong
cc1: error: bad value (wrong) for -march= switch

The option also seems to be passed correctly. Here is the build log: easybuild-GMP-6.1.1-20170110.135654.log (hooray for traceable builds provided by EasyBuild :)).

boegel commented 6 years ago

@SethosII The problem is probably that GMP doesn't honour what EasyBuild tells it, and it goes and builds for the system architecture you are on anyway.

There may be a configure option to tell it not to.

SethosII commented 6 years ago

@boegel It seems so, although the march flag is passed correctly to gcc. I wrote to the developers of GMP. Let's see what they say.

SethosII commented 6 years ago

@boegel I got an answer (it basically was: No bug, RTFM!). It runs config.guess and uses the output from this to set the architecture. So I need to add --build=westmere-pc-linux-gnu to configure or just build it on the oldest architecture ...
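
For a local workaround, the configure option could be set in a copy of the GMP easyconfig. This is a minimal sketch of the relevant fragment only (easyconfigs use Python syntax); it assumes the triplet quoted above is the right one for the oldest node, and any options the easyconfig already passes through `configopts` would need to be kept alongside it.

```python
# Fragment of a GMP easyconfig, not a complete file.
# Assumption: 'westmere' matches the oldest CPU the build must run on.
# Telling configure the build system explicitly stops GMP from using
# the config.guess result for the build host.
configopts = '--build=westmere-pc-linux-gnu'
```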

Should we add a note on this in the EasyConfig?

boegel commented 6 years ago

@SethosII I'm not sure if adding a comment in every GMP easyconfig we have is the best approach here.

We could consider adding a custom easyblock for GMP that recognises a custom optarch value and tries to do the right thing, but I'm not sure that's worth the effort.
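
To make the idea concrete, the core of such an easyblock would be a translation from the optarch value to a configure option. The sketch below is plain Python and does not use the EasyBuild framework API. The function name `gmp_build_option` and the default `pc-linux-gnu` suffix are hypothetical, and it assumes the GCC `-march` name is also a CPU name that GMP's configure accepts, which is not guaranteed for every value.

```python
def gmp_build_option(optarch, vendor_os="pc-linux-gnu"):
    """Translate an optarch value such as 'march=westmere' into a --build option.

    Returns None when no usable -march value is present, so the caller can
    fall back to GMP's own config.guess detection.
    """
    for token in optarch.split():
        token = token.lstrip("-")
        if token.startswith("march="):
            cpu = token[len("march="):]
            # 'native' names no specific CPU, so there is nothing to pass on.
            if cpu and cpu != "native":
                return "--build=%s-%s" % (cpu, vendor_os)
    return None
```

For example, `gmp_build_option("march=westmere")` yields the same `--build=westmere-pc-linux-gnu` option mentioned above. A real easyblock would also need to handle architecture names that differ between GCC and GMP.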

SethosII commented 6 years ago

@boegel Or maybe add a list of known software that ignores the optarch flag here: http://easybuild.readthedocs.io/en/latest/Controlling_compiler_optimization_flags.html#build-environment-vs-hardcoding-in-build-scripts?

guest1604 commented 6 years ago

For a generic build of GMP, I use the --enable-fat option:

Using --enable-fat selects a “fat binary” build on x86, where optimized low level subroutines are chosen at runtime according to the CPU detected. This means more code, but gives good performance on all x86 chips. (This option might become available for more architectures in the future.)

Reference: https://gmplib.org/manual/Build-Options.html
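
In an easyconfig, this could look like the fragment below. It is a sketch rather than a tested configuration: it shows only the one parameter, and any options the easyconfig already sets in `configopts` would have to be kept in the same string.

```python
# Fragment of a GMP easyconfig, not a complete file.
# --enable-fat is the GMP configure option quoted above: the optimized
# low-level routines are then selected at runtime for the detected x86 CPU.
configopts = '--enable-fat'
```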

boegel commented 6 years ago

See also https://github.com/easybuilders/easybuild-easyblocks/pull/1336