NERSC / shifter

Shifter - Linux Containers for HPC

Cannot compile shifter from source in some debian-based systems #246

Open t0rrant opened 5 years ago

t0rrant commented 5 years ago

This happens with the following branches:

With the following systems:

Running make in the shifter root directory fails while building libressl, one of the dependencies, which is compiled through the musl-gcc wrapper:

configure: error: in `/tmp/shifter-master/dep/build/libressl':
configure: error: C compiler cannot create executables
See `config.log' for more details
Makefile:566: recipe for target 'udiRoot_dep.tar' failed
make[2]: *** [udiRoot_dep.tar] Error 77
make[2]: Leaving directory '/tmp/shifter-master/dep'
Makefile:503: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/tmp/shifter-master'
Makefile:412: recipe for target 'all' failed
make: *** [all] Error 2

If I log in to the VM and run make manually in the libressl directory, which does not use musl-gcc, I get no errors.

Shifter seems to compile just fine out of the box in:

What am I missing here? And/or how can I debug this?

Cheers

t0rrant commented 5 years ago

On further inspection of the config.log file for the libressl build, I traced the issue to ld:

configure:3227: checking for gcc
configure:3254: result: /tmp/tmp.7dwAhgrgVO/bin/musl-gcc
configure:3483: checking for C compiler version
configure:3492: /tmp/tmp.7dwAhgrgVO/bin/musl-gcc --version >&5
gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

configure:3503: $? = 0
configure:3492: /tmp/tmp.7dwAhgrgVO/bin/musl-gcc -v >&5
Using built-in specs.
Reading specs from /tmp/tmp.7dwAhgrgVO/lib/musl-gcc.specs
rename spec cpp_options to old_cpp_options
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18+deb9u1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- -$
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)
configure:3503: $? = 0
configure:3492: /tmp/tmp.7dwAhgrgVO/bin/musl-gcc -V >&5
gcc: error: unrecognized command line option '-V'
gcc: fatal error: no input files
compilation terminated.
configure:3503: $? = 1
configure:3492: /tmp/tmp.7dwAhgrgVO/bin/musl-gcc -qversion >&5
gcc: error: unrecognized command line option '-qversion'; did you mean '--version'?
gcc: fatal error: no input files
compilation terminated.
configure:3503: $? = 1
configure:3523: checking whether the C compiler works
configure:3545: /tmp/tmp.7dwAhgrgVO/bin/musl-gcc -Wall -std=gnu99 -g -O2 -D_DEFAULT_SOURCE -D_BSD_SOURCE -D_POSIX_SOURCE -D_GNU_SOURCE   conftest.c  >&5
/usr/bin/ld: /tmp/tmp.7dwAhgrgVO/lib/crt1.o: relocation R_X86_64_32S against symbol `_fini' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/6/crtbegin.o: relocation R_X86_64_32 against hidden symbol `__TMC_END__' can not be used when making a shared object
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
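
Link errors of this kind ("recompile with -fPIC" while linking an executable) commonly appear when gcc builds position-independent executables by default but the startup objects being linked (here musl's crt1.o) are not position independent. Whether that is the cause here is an assumption: the "Configured with:" line above is truncated, so the log does not show it. A minimal sketch for checking a local gcc, where the helper name has_default_pie is hypothetical:

```shell
# Hypothetical helper: succeeds (exit 0) if the text on stdin, expected to be
# the output of `gcc -v`, mentions the --enable-default-pie configure option.
has_default_pie() {
    grep -q -- '--enable-default-pie'
}

# Usage on a real system (gcc -v writes to stderr, hence 2>&1):
#   gcc -v 2>&1 | has_default_pie && echo "gcc builds PIE by default"
```

If the check succeeds, the mismatch between a PIE-by-default gcc and the musl-gcc wrapper is worth investigating first.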

t0rrant commented 5 years ago

Modifying dep/build_ssh.sh to use the latest stable versions:

gets a bit further. The previous error is gone, but configure now fails with:

configure:3581: checking whether the C compiler works
configure:3603: /tmp/tmp.IyWuY6SOoP/bin/musl-gcc    conftest.c  >&5
configure:3607: $? = 0
configure:3655: result: yes
configure:3658: checking for C compiler default output file name
configure:3660: result: a.out
configure:3666: checking for suffix of executables
configure:3673: /tmp/tmp.IyWuY6SOoP/bin/musl-gcc -o conftest    conftest.c  >&5
configure:3677: $? = 0
configure:3699: result: 
configure:3721: checking whether we are cross compiling
configure:3729: /tmp/tmp.IyWuY6SOoP/bin/musl-gcc -o conftest    conftest.c  >&5
configure:3733: $? = 0
configure:3740: ./conftest
./configure: line 3742: ./conftest: No such file or directory
configure:3744: $? = 127
configure:3751: error: in `/tmp/shifter-master/dep/build/libressl':
configure:3753: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
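
When a freshly compiled binary exists but running it reports "No such file or directory", the missing file is often the dynamic loader named inside the binary rather than the binary itself. That this is what happens here is an assumption, not something the log confirms. A minimal sketch for reading the loader path out of `readelf -l` output, where the helper name interp_from_readelf is hypothetical:

```shell
# Hypothetical helper: print the ELF interpreter (dynamic loader) path found
# in `readelf -l` output supplied on stdin. Prints nothing if none is listed.
interp_from_readelf() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)\]/\1/p'
}

# Usage against a real binary (conftest.c is any trivial C program):
#   "${SPRT_PREFIX}/bin/musl-gcc" -o conftest conftest.c
#   interp=$(readelf -l ./conftest | interp_from_readelf)
#   [ -e "$interp" ] || echo "missing loader: $interp"
```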

t0rrant commented 5 years ago

Furthermore, changing the following in dep/build_ssh.sh

    ....
    MUSL_VERSION=1.1.21
    LIBRESSL_VERSION=2.8.3
    ZLIB_VERSION=1.2.11
    OPENSSH_VERSION=7.9p1
    ....
    ./configure "--prefix=${SPRT_PREFIX}" --enable-gcc-wrapper
    ....
    CC="${SPRT_PREFIX}/bin/musl-gcc" ./configure "--prefix=${SPRT_PREFIX}"
    ....
    CC="${SPRT_PREFIX}/bin/musl-gcc" ./configure "--prefix=${SPRT_PREFIX}" --enable-static
    ....
    LDFLAGS="-L${SPRT_PREFIX}/lib -L${SPRT_PREFIX}/lib64 -I${SPRT_PREFIX}/include" CC="${SPRT_PREFIX}/bin/musl-gcc" ./configure --without-pam "--with-ssl-dir=${SPRT_PREFIX}" "--with-zlib=${SPRT_PREFIX} --without-zlib-version-check" "--prefix=${INST_PREFIX}"
    ....

makes further progress, but the build still fails:

root@dokken:/tmp/shifter-master# make
make  all-recursive
make[1]: Entering directory '/tmp/shifter-master'
Making all in dep
make[2]: Entering directory '/tmp/shifter-master/dep'
./build_mount.sh > build_mount.log
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   178  100   178    0     0    580      0 --:--:-- --:--:-- --:--:--   581
100 8176k  100 8176k    0     0  2089k      0  0:00:03  0:00:03 --:--:-- 2544k
configure: WARNING: linux/gsmmux.h: present but cannot be compiled
configure: WARNING: linux/gsmmux.h:     check for missing prerequisite headers?
configure: WARNING: linux/gsmmux.h: see the Autoconf documentation
configure: WARNING: linux/gsmmux.h:     section "Present But Cannot Be Compiled"
configure: WARNING: linux/gsmmux.h: proceeding with the compiler's result
configure: WARNING:     ## ------------------------------ ##
configure: WARNING:     ## Report this to kzak@redhat.com ##
configure: WARNING:     ## ------------------------------ ##
configure: WARNING: ncurses or slang library not found; not building cfdisk
configure: WARNING: libcap-ng library not found; not building setpriv
configure: WARNING: z library not found; not building cramfs
configure: WARNING: PAM header file not found; not building chfn_chsh
configure: WARNING: PAM header file not found; not building login
configure: WARNING: PAM header file not found; not building su
configure: WARNING: PAM header file not found; not building runuser
configure: WARNING: ncurses or ncursesw library not found; not building pg
configure: WARNING: ncurses library not found; not building setterm
./configure: line 26545: --variable=systemdsystemunitdir: command not found
./configure: line 26580: --exists: command not found
configure: WARNING: libpython not found; not building pylibmount
ar: `u' modifier ignored since `D' is the default (see `U')
ar: `u' modifier ignored since `D' is the default (see `U')
ar: `u' modifier ignored since `D' is the default (see `U')
ar: `u' modifier ignored since `D' is the default (see `U')
./build_ssh.sh > build_ssh.log
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  964k  100  964k    0     0   638k      0  0:00:01  0:00:01 --:--:--  638k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3287k  100 3287k    0     0   205k      0  0:00:16  0:00:16 --:--:--  230k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  593k  100  593k    0     0   389k      0  0:00:01  0:00:01 --:--:--  389k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1528k  100 1528k    0     0   843k      0  0:00:01  0:00:01 --:--:--  843k
ar: `u' modifier ignored since `D' is the default (see `U')
ar: `u' modifier ignored since `D' is the default (see `U')
d1_both.c: In function 'dtls1_retransmit_message':
d1_both.c:1106:3: warning: 'save_write_sequence' may be used uninitialized in this function [-Wmaybe-uninitialized]
   memcpy(S3I(s)->write_sequence, save_write_sequence,
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       sizeof(S3I(s)->write_sequence));
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ar: `u' modifier ignored since `D' is the default (see `U')
ar: `u' modifier ignored since `D' is the default (see `U')
libtool: install: warning: relinking `libssl.la'
libtool: install: warning: relinking `libtls.la'
Makefile:571: recipe for target 'udiRoot_dep.tar' failed
make[2]: *** [udiRoot_dep.tar] Error 1
make[2]: Leaving directory '/tmp/shifter-master/dep'
Makefile:509: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/tmp/shifter-master'
Makefile:418: recipe for target 'all' failed
make: *** [all] Error 2
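
The terminal output above only shows what the build scripts wrote to stderr; their stdout is redirected to build_mount.log and build_ssh.log in the dep directory, so the step that failed may be easier to find there. A minimal sketch for pulling likely failure lines out of such a log, where the helper name show_build_errors and the search patterns are assumptions:

```shell
# Hypothetical helper: list the last 20 lines of a build log that look like
# failures, prefixed with their line numbers in the log.
show_build_errors() {
    log="$1"
    grep -n -i -E 'error|fatal|No such file' "$log" | tail -n 20
}

# Usage from the shifter source root:
#   show_build_errors dep/build_ssh.log
```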

t0rrant commented 5 years ago

From musl's FAQ

What applications are compatible with musl?

and following the link to the compatibility page:

I see that Shifter's build_ssh.sh uses openssh rather than dropbear. This could be why compiling Shifter fails.

scanon commented 5 years ago

So the ssh daemon is an optional feature. It is a static build of sshd so that it can be used inside the container environment. It is handy for some frameworks that can use ssh for launching. If you don't need the feature, you can disable it at configure time.

t0rrant commented 5 years ago

Ok, nice to know. Using --disable-staticsshd did the trick.
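
A minimal sketch of that build, assuming the usual ./configure and make flow run from the shifter source directory. Only the --disable-staticsshd flag comes from this thread; the wrapper name build_shifter and its directory argument are hypothetical:

```shell
# Hypothetical wrapper: configure shifter without the static sshd, then build.
# $1 is the path to the shifter source directory.
build_shifter() {
    src_dir="$1"
    ( cd "$src_dir" && ./configure --disable-staticsshd && make )
}

# Usage:
#   build_shifter /tmp/shifter-master
```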

Thanks for the pointer.

However, I think it would be better if the optional daemon either worked out of the box or was disabled by default and enabled on request. Thoughts?