hpcugent / Lmod-UGent

spec files of Lmod for UGent-HPC

bump to Lmod 6.5.7 #8

Closed boegel closed 7 years ago

boegel commented 7 years ago

Lmod 6.5.7 fixes a bug that we encountered in EasyBuild: a module for a dependency that sets $LD_PRELOAD (e.g. jemalloc) breaks subsequent loads of other dependencies (e.g. PCRE) when a Tcl module file is being loaded (e.g. MariaDB), because EasyBuild resets $LD_LIBRARY_PATH after having loaded the toolchain first.

This problem can be reproduced with:

$ ml intel/2016a && LD_LIBRARY_PATH='' LD_PRELOAD='' ml MariaDB/10.1.14-intel-2016a
sh: error while loading shared libraries: libiomp5.so: cannot open shared object file: No such file or directory
sh: error while loading shared libraries: libiomp5.so: cannot open shared object file: No such file or directory
Lmod has detected the following error:  Failed to find 'Lmod Capture Exit Code' in output: 
While processing the following module(s):
    Module fullname              Module Filename
    ---------------              ---------------
    PCRE/8.38-intel-2016a        /apps/gent/SL6/sandybridge/modules/all/PCRE/8.38-intel-2016a
    MariaDB/10.1.14-intel-2016a  /apps/gent/SL6/sandybridge/modules/all/MariaDB/10.1.14-intel-2016a

If you don't understand the warning or error, contact the helpdesk at hpc@ugent.be

Note that the resetting of $LD_LIBRARY_PATH as done by EasyBuild should be followed by properly merging the original $LD_LIBRARY_PATH with the value obtained from the ml load statement; if not, lots of commands will be broken because $LD_PRELOAD lists a library that needs other libraries which are no longer available via $LD_LIBRARY_PATH.
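
For illustration only, the merge described above could look roughly like the sketch below. The helper name merge_paths is hypothetical (it is not part of EasyBuild or Lmod), and putting the module-provided entries before the original ones is an assumption, not something this PR prescribes.

```shell
# Hypothetical helper (not part of EasyBuild or Lmod): merge two
# colon-separated path lists, keeping first-seen order and dropping
# duplicates and empty entries.
merge_paths() {
    local merged="" entry
    local IFS=':'                 # split the unquoted arguments on ':'
    for entry in $1 $2; do
        [ -z "$entry" ] && continue
        case ":$merged:" in
            *":$entry:"*) ;;      # already present, skip it
            *) merged="${merged:+$merged:}$entry" ;;
        esac
    done
    printf '%s\n' "$merged"
}

# Intended use (assumption: module-provided paths take precedence):
#   orig_llp="$LD_LIBRARY_PATH"
#   ... reset $LD_LIBRARY_PATH and load the modules ...
#   export LD_LIBRARY_PATH="$(merge_paths "$LD_LIBRARY_PATH" "$orig_llp")"
```

With the original entries merged back in, a library listed in $LD_PRELOAD can still resolve its own dependencies (libiomp5.so in the failure shown above).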

boegel commented 7 years ago

This issue arose while testing https://github.com/hpcugent/easybuild-easyconfigs/pull/3511, see (last) failing test reports.

boegel commented 7 years ago

@wpoely86 do _NOT_ merge this yet, WIP because:

boegel commented 7 years ago

Lmod 6.5.7 is now tagged, and works like a charm:

[20:17:04] vsc40023@test2802:~/vsc-testing/module $ ./run_all_tests.sh 
Lmod-6.5.7-1.ug.el6.noarch
vsc-cluster-modules-0.19-1.noarch
vsc-cluster-modules-tier2-0.19-1.noarch

> module --version

Modules based on Lua: Version 6.5.7  2016-09-02 12:54 -05:00 (CDT)
$MODULEPATH: /apps/gent/SL6/westmere/modules/all:/etc/modulefiles/vsc

*** 001_list.sh ***

> module list

>>> 001_list.sh: PASS

*** 002_avail.sh ***

> module avail 
> module avail GCC
> module avail GCC/4.9.3

>>> 002_avail.sh: PASS

*** 003_load.sh ***

> module load GCC
> module load GCC/4.9.3
> module load intel
> module load foss
> module load Python/2.7.11-intel-2016a
> module load GCC/4.9.3-2.25 OpenMPI/1.10.2-GCC-4.9.3-2.25 OpenBLAS/0.2.15-GCC-4.9.3-2.25-LAPACK-3.6.0 FFTW/3.3.4-gompi-2016a

>>> 003_load.sh: PASS

*** 004_purge.sh ***

> module load Python/2.7.11-intel-2016a
> module purge
> module load cluster
> module purge -force
> module load cluster
> module purge -force
> module load cluster/.banette

>>> 004_purge.sh: PASS

*** 005_swap.sh ***

> module load GCC/4.7.2
> module swap GCC/4.9.3
> module swap GCC GCC/4.7.2
> module swap GCC/4.7.2 GCC/4.9.3

>>> 005_swap.sh: PASS

*** 006_unload.sh ***

> module load GCC/4.9.3
> module unload GCC
> module load GCC/4.9.3
> module unload GCC/4.9.3

>>> 006_unload.sh: PASS

*** 007_spider.sh ***

> module spider intel
> module spider intel/2016a
> module --show-hidden spider intel/2016a

>>> 007_spider.sh: PASS

*** 010_stdout_stderr.sh ***

> module list
> module avail

>>> 010_stdout_stderr.sh: PASS

*** 050_ml.sh ***

> ml av GCC/4.9.3
> ml
> ml GCC/4.9.3
> ml
> ml -GCC/4.9.3
> ml

>>> 050_ml.sh: PASS

*** 051_collections.sh ***

> ml foss/2016a
> ml Python/2.7.11-intel-2016a
> ml save this_is_just_a_test_collection_for_module_integration_test_051
> ml describe this_is_just_a_test_collection_for_module_integration_test_051
> ml purge
> ml restore this_is_just_a_test_collection_for_module_integration_test_051
> ml purge
> module swap cluster/.banette cluster/delcatty
> ml purge
> ml restore this_is_just_a_test_collection_for_module_integration_test_051
> ml purge

>>> 051_collections.sh: PASS

*** 100_lmod_cache.sh ***

> module avail  # 2s time limit

>>> 100_lmod_cache.sh: PASS

*** 101_LD_LIBRARY_PATH.sh ***

> module load GCC/4.9.3-2.25
> module load OpenMPI/1.10.2-GCC-4.9.3-2.25
> checking $LD_LIBRARY_PATH...

>>> 101_LD_LIBRARY_PATH.sh: PASS

*** 102_symlink_modulepath.sh ***

> module use /tmp/vsc40023/PmsOLM/symlinked_modules
> module avail test/1.2.3

>>> 102_symlink_modulepath.sh: PASS

*** 103_load-via-list.sh ***

>>> 103_load-via-list.sh: PASS

*** 103_tcl2lua_LD_PRELOAD.sh ***

> module load jemalloc
> module show jemalloc

>>> 103_tcl2lua_LD_PRELOAD.sh: PASS

TEST RESULT: all 15 passed!

boegel commented 7 years ago

@wpoely86 please merge? ;)

boegel commented 7 years ago

@wpoely86 all is good with Lmod 6.5.8

I will look into an extra test for the $LD_PRELOAD crap that was fixed with Lmod 6.5.7 though

[10:26:17] vsc40023@test2802:~/vsc-testing/module $ ./run_all_tests.sh 
Lmod-6.5.8-1.ug.el6.noarch
vsc-cluster-modules-0.19-1.noarch
vsc-cluster-modules-tier2-0.19-1.noarch

> module --version

Modules based on Lua: Version 6.5.8  2016-09-03 13:41 -05:00 (CDT)
$MODULEPATH: /apps/gent/SL6/westmere/modules/all:/etc/modulefiles/vsc

*** 001_list.sh ***

> module list

>>> 001_list.sh: PASS

*** 002_avail.sh ***

> module avail 
> module avail GCC
> module avail GCC/4.9.3

>>> 002_avail.sh: PASS

*** 003_load.sh ***

> module load GCC
> module load GCC/4.9.3
> module load intel
> module load foss
> module load Python/2.7.11-intel-2016a
> module load GCC/4.9.3-2.25 OpenMPI/1.10.2-GCC-4.9.3-2.25 OpenBLAS/0.2.15-GCC-4.9.3-2.25-LAPACK-3.6.0 FFTW/3.3.4-gompi-2016a

>>> 003_load.sh: PASS

*** 004_purge.sh ***

> module load Python/2.7.11-intel-2016a
> module purge
> module load cluster
> module purge -force
> module load cluster
> module purge -force
> module load cluster/.banette

>>> 004_purge.sh: PASS

*** 005_swap.sh ***

> module load GCC/4.7.2
> module swap GCC/4.9.3
> module swap GCC GCC/4.7.2
> module swap GCC/4.7.2 GCC/4.9.3

>>> 005_swap.sh: PASS

*** 006_unload.sh ***

> module load GCC/4.9.3
> module unload GCC
> module load GCC/4.9.3
> module unload GCC/4.9.3

>>> 006_unload.sh: PASS

*** 007_spider.sh ***

> module spider intel
> module spider intel/2016a
> module --show-hidden spider intel/2016a

>>> 007_spider.sh: PASS

*** 010_stdout_stderr.sh ***

> module list
> module avail

>>> 010_stdout_stderr.sh: PASS

*** 050_ml.sh ***

> ml av GCC/4.9.3
> ml
> ml GCC/4.9.3
> ml
> ml -GCC/4.9.3
> ml

>>> 050_ml.sh: PASS

*** 051_collections.sh ***

> ml foss/2016a
> ml Python/2.7.11-intel-2016a
> ml save this_is_just_a_test_collection_for_module_integration_test_051
> ml describe this_is_just_a_test_collection_for_module_integration_test_051
> ml purge
> ml restore this_is_just_a_test_collection_for_module_integration_test_051
> ml purge
> module swap cluster/.banette cluster/delcatty
> ml purge
> ml restore this_is_just_a_test_collection_for_module_integration_test_051
> ml purge

>>> 051_collections.sh: PASS

*** 100_lmod_cache.sh ***

> module avail  # 2s time limit

>>> 100_lmod_cache.sh: PASS

*** 101_LD_LIBRARY_PATH.sh ***

> module load GCC/4.9.3-2.25
> module load OpenMPI/1.10.2-GCC-4.9.3-2.25
> checking $LD_LIBRARY_PATH...

>>> 101_LD_LIBRARY_PATH.sh: PASS

*** 102_symlink_modulepath.sh ***

> module use /tmp/vsc40023/GI7X3V/symlinked_modules
> module avail test/1.2.3

>>> 102_symlink_modulepath.sh: PASS

*** 103_load-via-list.sh ***

>>> 103_load-via-list.sh: PASS

*** 103_tcl2lua_LD_PRELOAD.sh ***

> module load jemalloc
> module show jemalloc

>>> 103_tcl2lua_LD_PRELOAD.sh: PASS

TEST RESULT: all 15 passed!
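
A possible starting point for the extra $LD_PRELOAD test mentioned above, as a rough sketch only: the helper name check_no_capture_error is hypothetical, and the layout of the actual test scripts in vsc-testing is not shown in this thread, so how it would be wired in is an assumption. The helper only scans captured output for the Lmod error message from the original report.

```shell
# Hypothetical helper: read command output on stdin and fail (non-zero
# exit status) if it contains the Lmod error seen in the original report.
check_no_capture_error() {
    ! grep -q "Failed to find 'Lmod Capture Exit Code'"
}

# Intended use, based on the reproducer from the first comment
# (stderr is included because the error may be printed there):
#   { ml intel/2016a && LD_LIBRARY_PATH='' LD_PRELOAD='' \
#       ml MariaDB/10.1.14-intel-2016a; } 2>&1 | check_no_capture_error
```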