easybuilders / easybuild-easyconfigs

A collection of easyconfig files that describe which software to build using which build options with EasyBuild.
https://easybuild.io
GNU General Public License v2.0

Separate terminfo library in ncurses breaks Python build #4049

Closed vanzod closed 7 years ago

vanzod commented 7 years ago

Since --with-termlib was added to ncurses-6.0.eb (see PR #3545), building Python against this library with Python-2.7.12-foss-2016b.eb fails because the required symbols cannot be found:

./libpython2.7.so: error: undefined reference to 'tgetnum'
./libpython2.7.so: error: undefined reference to 'PC'
./libpython2.7.so: error: undefined reference to 'BC'
./libpython2.7.so: error: undefined reference to 'UP'
./libpython2.7.so: error: undefined reference to 'tgetent'
./libpython2.7.so: error: undefined reference to 'tgetstr'
./libpython2.7.so: error: undefined reference to 'tgetflag'

If ncurses is built without that flag, the Python build completes successfully.

@ocaisa Suggestions?

ocaisa commented 7 years ago

I think this is something that can be fixed in the Python easyblock: when it sets up ncurses linking, it can check whether the termlib static library exists and, if so, append it to the static linking list.
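
A minimal sketch of that check, not the actual easyblock code. The helper name `ncurses_static_libs` and the `lib/libncurses.a` / `lib/libtinfo.a` layout under the ncurses installation prefix are assumptions made for illustration.

```python
import os


def ncurses_static_libs(ncurses_root):
    """Return the static ncurses libraries to link, in link order.

    Hypothetical helper: libtinfo.a is only appended when it exists,
    i.e. when ncurses was configured with --with-termlib.
    """
    libdir = os.path.join(ncurses_root, "lib")
    libs = [os.path.join(libdir, "libncurses.a")]
    tinfo = os.path.join(libdir, "libtinfo.a")
    if os.path.isfile(tinfo):
        # Objects listed earlier may reference the terminfo symbols
        # (tgetent, tgetstr, ...), so libtinfo.a goes last on the link line.
        libs.append(tinfo)
    return libs
```

With an ncurses built without --with-termlib the list contains only libncurses.a, so existing builds would be unaffected.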

vanzod commented 7 years ago

Oddly enough, when I add libtinfo.a to the list of static libraries the linker complains that it is not PIC compliant. Any idea why I am seeing this error?

ocaisa commented 7 years ago

Now you've hit a major problem with builds that use the dummy toolchain: I don't think you can specify toolchainopts for dummy, which means you can't tell it to make PIC libs without a (fairly trivial) hack. Why not use ncurses at the GCCcore level in the toolchain hierarchy? The GCCcore level was created as a much more controlled replacement for the system compiler.




Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt


vanzod commented 7 years ago

@ocaisa Thanks for the clarification. The problem was that on my system, although I have both the dummy and the GCCcore versions of ncurses 6.0, EB is picking the dummy one for static linking, which produces the missing symbols error. So we still need a way to handle the case where ncurses is built with the --with-termlib flag.

boegel commented 7 years ago

@ocaisa you can build with -fPIC when a dummy toolchain is used by specifying CFLAGS via buildopts, cfr. https://github.com/hpcugent/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/z/zlib/zlib-1.2.8.eb#L17

Updating the Python easyblock to handle the case where ncurses is known to require additional libraries sounds like a good idea.
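
As a rough illustration of the buildopts approach mentioned above, an easyconfig fragment might look like the following. This is a sketch only: the flag values and the toolchain version string are assumptions, not a copy of the linked zlib easyconfig.

```python
# Hypothetical easyconfig fragment for a dummy-toolchain build.
# With the dummy toolchain, toolchainopts such as {'pic': True} are not
# applied, so the compiler flags are passed by hand instead.
toolchain = {'name': 'dummy', 'version': ''}

# -fPIC makes the resulting static library usable inside shared objects;
# an explicit optimisation level is included because GCC defaults to -O0
# when no -O option is given.
buildopts = 'CFLAGS="-O2 -fPIC"'
```

For software that picks up its compiler flags at configure time, the same flags would have to be supplied to the configure step instead.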

vanzod commented 7 years ago

@boegel See hpcugent/easybuild-easyblocks#1088

ocaisa commented 7 years ago

Yup, that was the fairly trivial hack I had in mind... but it sits outside the philosophy of EB being aware of the compiler and its capabilities.



boegel commented 7 years ago

@ocaisa Assuming that the system C compiler is GCC is a pretty safe bet, no?

ocaisa commented 7 years ago

Sure, of course, but the fix is a hack compared to the capabilities available when you use an understood compiler and the recommended approach of toolchainopts.

boegel commented 7 years ago

https://github.com/hpcugent/easybuild-framework/pull/1233 is an opportunity to fix that

vanzod commented 7 years ago

@ocaisa Even with the "hack" in my PR we are still stuck at the fact that the terminfo library is not position independent. Here is the full error stack.


/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_tgoto.o): requires dynamic R_X86_64_PC32 reloc against 'tparm' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_tparm.o): requires dynamic R_X86_64_PC32 reloc against '_nc_prescreen' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_tputs.o): requires dynamic R_X86_64_PC32 reloc against 'stdout' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(trim_sgr0.o): requires dynamic R_X86_64_PC32 reloc against 'tparm' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(comp_error.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(comp_hash.o): requires dynamic R_X86_64_PC32 reloc against '_nc_get_hash_info' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(doalloc.o): requires dynamic R_X86_64_PC32 reloc against 'realloc' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_baudrate.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_cur_term.o): requires dynamic R_X86_64_PC32 reloc against 'cur_term' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_napms.o): requires dynamic R_X86_64_PC32 reloc against '__errno_location' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_setup.o): requires dynamic R_X86_64_PC32 reloc against 'TABSIZE' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_ti.o): requires dynamic R_X86_64_PC32 reloc against '_nc_find_type_entry' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_ttyflags.o): requires dynamic R_X86_64_PC32 reloc against 'tcsetattr' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(name_match.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(read_entry.o): requires dynamic R_X86_64_PC32 reloc against 'malloc' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(access.o): requires dynamic R_X86_64_PC32 reloc against '__xstat' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(alloc_ttype.o): requires dynamic R_X86_64_PC32 reloc against 'malloc' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(comp_captab.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(db_iterator.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(free_ttype.o): requires dynamic R_X86_64_PC32 reloc against '_nc_user_definable' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(getenv_num.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(home_terminfo.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_options.o): requires dynamic R_X86_64_PC32 reloc against 'cur_term' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_raw.o): requires dynamic R_X86_64_PC32 reloc against '_nc_set_tty_mode_sp' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(entries.o): requires dynamic R_X86_64_PC32 reloc against '_nc_head' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(init_keytry.o): requires dynamic R_X86_64_32 reloc against '_nc_tinfo_fkeys' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_has_cap.o): requires dynamic R_X86_64_PC32 reloc against 'cur_term' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(key_defined.o): requires dynamic R_X86_64_PC32 reloc against 'SP' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(add_tries.o): requires dynamic R_X86_64_PC32 reloc against 'calloc' which may overflow at runtime; recompile with -fPIC
./Modules/posixmodule.c:7578: warning: the use of `tempnam' is dangerous, better use `mkstemp'
./Modules/posixmodule.c:7631: warning: the use of `tmpnam_r' is dangerous, better use `mkstemp'
collect2: error: ld returned 1 exit status

vanzod commented 7 years ago

Another thing I am trying to understand is why EB is picking the ncurses built with the dummy toolchain instead of the one built with GCCcore, which would make much more sense. Looking through the attached log, the problem seems to arise when loading the dependencies for libreadline. The weird thing is that if I run the same Lmod command on the system, it returns GCCcore/ncurses as the dependency. At this point I am totally lost.

Attached log: easybuild-Python-2.7.12-20170123.103930.sHAOs.log.txt

vanzod commented 7 years ago

I think I found the source of the issue. To start with, this affects only Python-3.5.2 for the foss-2016b and intel-2016b toolchains (maybe others, but I am working with those two right now), since it has XZ-5.2.2 as a dependency. XZ-5.2.2 in turn depends on the gettext-0.19.8 library.

However, EB (taking the foss toolchain as an example) cannot use a hypothetical gettext-0.19.8-GCC-5.4.0-2.26.eb: at this toolchain level (at the dummy level the gettext dependencies are stripped down) it would depend on libxml2-2.9.4-GCC-5.4.0-2.26.eb, which in turn depends on XZ-5.2.2-GCC-5.4.0-2.26.eb, and so we have a dependency loop.

For this reason EB builds XZ-5.2.2-GCC-5.4.0-2.26.eb with gettext-0.19.8.eb, and the latter depends on ncurses-6.0.eb. At that point compiling Python-3.5.2 fails because it links against the ncurses library with the stripped symbols. For the same reason it is also not possible to use a gettext-0.19.8-GCC-5.4.0-2.26.eb easyconfig, since this would depend on the gettext at the dummy level.
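
The loop described above can be made explicit with a small depth-first search. This is only an illustrative sketch: the graph is typed in by hand from the names in this comment, and `find_cycle` is a hypothetical helper, not something EasyBuild provides.

```python
def find_cycle(graph, start):
    """Return one dependency cycle reachable from start as a list, or None."""
    path = []        # current chain of dependencies being explored
    on_path = set()  # same nodes as path, for fast membership tests
    done = set()     # nodes fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # Seeing a node that is already on the chain closes the loop.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for dep in graph.get(node, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        done.add(node)
        return None

    return visit(start)


# Everything at the GCC level: XZ -> gettext -> libxml2 -> XZ.
gcc_level = {
    "XZ-5.2.2-GCC-5.4.0-2.26": ["gettext-0.19.8-GCC-5.4.0-2.26"],
    "gettext-0.19.8-GCC-5.4.0-2.26": ["libxml2-2.9.4-GCC-5.4.0-2.26"],
    "libxml2-2.9.4-GCC-5.4.0-2.26": ["XZ-5.2.2-GCC-5.4.0-2.26"],
}

# XZ pointing at the stripped-down dummy gettext instead: no loop.
with_dummy_gettext = {
    "XZ-5.2.2-GCC-5.4.0-2.26": ["gettext-0.19.8"],
    "gettext-0.19.8": ["ncurses-6.0"],
    "ncurses-6.0": [],
}

print(find_cycle(gcc_level, "XZ-5.2.2-GCC-5.4.0-2.26"))
print(find_cycle(with_dummy_gettext, "XZ-5.2.2-GCC-5.4.0-2.26"))
```

The first call reports the loop through gettext and libxml2; the second finds none, which is why the dummy gettext is used, at the cost of pulling in the dummy ncurses.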

Unfortunately I do not see right now a way to break the dependency loop and avoid linking against the dummy gettext. Suggestions?

ocaisa commented 7 years ago

It's not a bug. The problem is the XZ easyconfig, which explicitly asks for gettext with the dummy toolchain:

('gettext', '0.19.8', '', True),

The way to fix this would be to supply a gettext easyconfig at the GCC or GCCcore level and remove the explicit specification of the dummy toolchain (the True argument in the dependency spec) for XZ.
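
To illustrate the difference between the two dependency specs, here is a small sketch. The four-element form is the one quoted above; the helper `uses_dummy_toolchain` is hypothetical and only exists to show which element carries the toolchain override.

```python
def uses_dummy_toolchain(dep):
    """True if a dependency tuple explicitly requests the dummy toolchain.

    Assumes the (name, version, versionsuffix, toolchain) layout, where a
    fourth element of True selects the dummy toolchain.
    """
    return len(dep) == 4 and dep[3] is True


# Current XZ easyconfig: gettext is pinned to the dummy toolchain.
xz_dependencies_before = [('gettext', '0.19.8', '', True)]

# Proposed change: drop the override so gettext is resolved in the
# toolchain hierarchy (this needs a gettext easyconfig at that level).
xz_dependencies_after = [('gettext', '0.19.8')]

print(uses_dummy_toolchain(xz_dependencies_before[0]))
print(uses_dummy_toolchain(xz_dependencies_after[0]))
```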

vanzod commented 7 years ago

@ocaisa Yes, I figured that out 10 minutes after I sent the message. Check the edited post to see why it is not possible to use a gettext-0.19.8-GCC-5.4.0-2.26.eb unless I remove the libxml2 dependency from it.

vanzod commented 7 years ago

Thanks to @boegel we finally found that the problem arises when trying to build Python when the GCC/5.4.0-2.26 module is already loaded in the environment and ncurses-6.0 is built with separate terminfo symbols at the dummy level. The reason why the dummy ncurses gets loaded instead of the GCCcore one comes from how $MODULEPATH is managed by EB. Here is the breakdown of the issue.

In the initial environment the GCC/5.4.0-2.26 is loaded. $MODULEPATH is then:

<prefix>/GCC/5.4.0-2.26:<prefix>/GCCcore/5.4.0:<prefix>/Core

EB issues a module use Core command (see here) which pushes the Core path on the top of the list:

<prefix>/Core:<prefix>/GCC/5.4.0-2.26:<prefix>/GCCcore/5.4.0

EB loads GCC/5.4.0-2.26 as dependency, which pushes the corresponding path to the top:

<prefix>/GCC/5.4.0-2.26:<prefix>/Core:<prefix>/GCCcore/5.4.0

However, since GCCcore/5.4.0 is a conditional dependency in the GCC/5.4.0-2.26 module and it is already present in $MODULEPATH, the corresponding path remains after the Core path, causing the Core/ncurses/6.0 library path to get added to LDFLAGS instead of the GCCcore/5.4.0/ncurses/6.0.
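
The sequence above can be replayed with a few lines of Python. This is a simplified model built only from the behaviour described in this comment: `module_use` and `first_provider` are hypothetical helpers, and the set of modules available under each path is assumed.

```python
def module_use(modulepath, path):
    """Put path at the front of the list, moving it if already present."""
    return [path] + [p for p in modulepath if p != path]


def first_provider(modulepath, module, available):
    """Return the first path in modulepath that provides the given module."""
    for path in modulepath:
        if module in available.get(path, set()):
            return path
    return None


GCC = "<prefix>/GCC/5.4.0-2.26"
GCCCORE = "<prefix>/GCCcore/5.4.0"
CORE = "<prefix>/Core"

# Assumed module layout: ncurses/6.0 exists at both Core and GCCcore level.
available = {
    CORE: {"GCC/5.4.0-2.26", "ncurses/6.0"},
    GCCCORE: {"ncurses/6.0"},
    GCC: set(),
}

initial = [GCC, GCCCORE, CORE]              # GCC already loaded by the user
after_use = module_use(initial, CORE)       # EB runs 'module use' on Core
after_load = module_use(after_use, GCC)     # EB loads GCC/5.4.0-2.26
# GCCcore is already in $MODULEPATH, so its position does not change.

print(after_load)
print(first_provider(after_load, "ncurses/6.0", available))
```

In this model the Core path ends up ahead of the GCCcore path, so the Core-level ncurses/6.0 is the one that gets picked up.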

boegel commented 7 years ago

Long story short: don't load (EasyBuild-generated) modules before running EasyBuild.
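
Until such a check exists, a quick pre-flight test along these lines could be run by hand. It is a sketch, not the mechanism discussed in the framework issue: it relies only on the fact that EasyBuild-generated modules define `$EBROOT<NAME>` variables, and the function name is made up for illustration.

```python
import os


def loaded_easybuild_modules(environ=None):
    """Return the $EBROOT* variable names found in the environment, sorted.

    A non-empty result suggests that EasyBuild-generated modules are loaded.
    """
    if environ is None:
        environ = os.environ
    return sorted(key for key in environ if key.startswith("EBROOT"))


if __name__ == "__main__":
    found = loaded_easybuild_modules()
    if found:
        print("Loaded EasyBuild modules detected: " + ", ".join(found))
        print("Consider running 'module purge' before starting EasyBuild.")
```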

We'll look into implementing a detection/warning mechanism (or maybe something stricter) for this in https://github.com/hpcugent/easybuild-framework/issues/153

pforai commented 7 years ago

One other thing to note is that only the dummy version of ncurses 6.0 builds with a separate tinfo library, while the others still include it. That may bite in unexpected ways when one moves to include dummy in minimal toolchains and tries to rebuild a bigger tree on top of this.

boegel commented 7 years ago

@pforai See https://github.com/easybuilders/easybuild-easyconfigs/pull/3545/files#r132546663 on that aspect.