Bright-Computing / bic

Bright-Illumina collaboration
GNU General Public License v2.0

Deliver/verify that EasyBuild works on a Bright cluster with Lmod, in default rollout #6

Open fgeorgatos opened 8 years ago

fgeorgatos commented 8 years ago

This requires #5

Related question @plabrop :

EB can be delivered via a module created by the bootstrap process. Would that be the preferred way, or is an RPM delivery more appropriate? If so, who creates the RPM, and how?

boegel commented 8 years ago

Strictly speaking, this does not require #5 (EasyBuild also works with Tmod), but it makes sense to tackle #5 too.

If you need a .spec file for EasyBuild, go shop at https://github.com/openhpc/ohpc/blob/obs/OpenHPC_1.0.1_Factory/components/dev-tools/easybuild/SPECS/easybuild.spec

boegel commented 8 years ago

@fgeorgatos: not sure why you assigned this to me, what's expected from me here exactly? A test script for EasyBuild?

I suggest you run this:

#!/bin/bash

# stop on any error
set -e
# simply check on whether 'eb' command is available, and whether it runs
eb --version
# run EasyBuild framework test suite, using Lmod as modules tool
# (this takes a while, i.e. ~10-20m)
export TEST_EASYBUILD_MODULES_TOOL=Lmod
python -O -m test.framework.suite
# check if easyblocks are available
eb --list-easyblocks | grep ConfigureMake
# check if easyconfigs are available
eb --search ^intel | grep intel-2016a.eb
echo SUCCESS

It'll take a while, but if it passes, eb works (and there's no cruft left behind).
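
If it helps to see which check failed, rather than having `set -e` stop at the first error without a label, the checks above could be wrapped in a small helper. This is only a sketch: `run_step` is a hypothetical name and not part of EasyBuild, and the commented usage lines just reuse commands from the script above.

```bash
#!/bin/bash
# Hypothetical helper (not part of EasyBuild): run one check, hide its output,
# and report PASS or FAIL together with a human-readable label.
run_step() {
    local label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: ${label}"
    else
        echo "FAIL: ${label}"
        return 1
    fi
}

# Example usage, reusing checks from the script above:
#   run_step "eb is available" eb --version
#   run_step "easyblocks are available" bash -c 'eb --list-easyblocks | grep ConfigureMake'
```

Because the helper returns non-zero on failure, it still works together with `set -e` if stopping at the first failed check is preferred.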

fgeorgatos commented 8 years ago

yes, that's what was needed (I guess if that runs OK, anybody could say it's a green light for this issue!)

Perhaps let's add a couple of complicated builds involving hierarchical namespaces (not sure whether setting the bar that high is appropriate for this particular issue; let's decide that a bit later).

boegel commented 8 years ago

This should do it:

export EASYBUILD_MODULES_TOOL=Lmod
export EASYBUILD_MODULE_NAMING_SCHEME=HierarchicalMNS
eb GCC-4.9.3-2.25.eb --robot
eb HPL-2.1-foss-2016a.eb --robot
eb Python-2.7.11-foss-2016a.eb --robot
eb OpenFOAM-3.0.0-foss-2016a.eb --robot

You could add this to the 'quick' test:

eb OpenFOAM-3.0.0-foss-2016a.eb --dry-run

fgeorgatos commented 8 years ago

thanks for the recommendations! now we have concrete targets to work against

fgeorgatos commented 7 years ago

That is, the goal is to pass the following test. It's looking good so far (the OpenFOAM dry run was a piece of cake too):

$ export EASYBUILD_MODULE_NAMING_SCHEME=HierarchicalMNS
$ export EASYBUILD_INSTALLPATH=/tmp/ebtesting
$ echo GCC-4.9.3-2.25.eb HPL-2.1-foss-2016a.eb Python-2.7.11-foss-2016a.eb OpenFOAM-3.0.0-foss-2016a.eb |xargs -n1|xargs --replace -n1 echo time eb {} --robot|bash
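
The `xargs` pipeline above can also be written as a plain loop, which may be easier to read. This is a sketch rather than an exact equivalent: it stops at the first failed build instead of continuing, and it reports elapsed seconds with `date` instead of the shell's `time`. `build_each` and the `EB_CMD` override are hypothetical names introduced here for illustration.

```bash
#!/bin/bash
# Build each easyconfig in turn and report how long every build took.
# EB_CMD is a hypothetical override used only for illustration (for example,
# EB_CMD=echo prints the arguments instead of building); by default 'eb' is called.
build_each() {
    local ec start
    for ec in "$@"; do
        start=$(date +%s)
        # Stop at the first build that fails.
        ${EB_CMD:-eb} "${ec}" --robot || return 1
        # Timing goes to stderr so that it does not mix with eb's own output.
        echo "== ${ec} took $(( $(date +%s) - start ))s" >&2
    done
}

# Example usage, with the same settings as in the test above:
#   export EASYBUILD_MODULE_NAMING_SCHEME=HierarchicalMNS
#   export EASYBUILD_INSTALLPATH=/tmp/ebtesting
#   build_each GCC-4.9.3-2.25.eb HPL-2.1-foss-2016a.eb Python-2.7.11-foss-2016a.eb OpenFOAM-3.0.0-foss-2016a.eb
```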

fgeorgatos commented 7 years ago

I've tried that last one, and it seems we get errors like:

[...]
== patching...
== preparing...
== FAILED: Installation ended unsuccessfully (build directory: /dev/shm/Automake/1.15/dummy-): build failed (first 300 chars): Changing environment as dictated by module failed: name 'false' is not defined (stdout:
false
, stderr: Lmod has detected the following error: These module(s) exist but cannot be
loaded as requested: "Autoconf/2.69"
   Try: "module spider Autoconf/2.69" to see how to load the module(s).

)
== Results of the build can be found in the log file(s) /tmp/eb-dawML5/easybuild-Automake-1.15-20170505.112047.ltpjs.log
ERROR: Build of /dev/shm/eb2.9.0/software/EasyBuild/2.9.0/lib/python2.7/site-packages/easybuild_easyconfigs-2.9.0-py2.7.egg/easybuild/easyconfigs/a/Automake/Automake-1.15.eb failed (err: 'build failed (first 300 chars): Changing environment as dictated by module failed: name \'false\' is not defined (stdout: \nfalse\n, stderr: Lmod has detected the following error: These module(s) exist but cannot be\nloaded as requested: "Autoconf/2.69"\n   Try: "module spider Autoconf/2.69" to see how to load the module(s).\n\n\n\n)')

real    2m4.144s
user    0m39.553s
sys     0m16.607s

To be continued; I need to dig up more information on this one.

fgeorgatos commented 7 years ago

ok, I've retried this last one with version 3.2.1 and it went fine - after a 2nd run we got:

[fgeorgatos@node002 ~]$ time eb GCC-4.9.3-2.25.eb HPL-2.1-foss-2016a.eb Python-2.7.11-foss-2016a.eb OpenFOAM-3.0.0-foss-2016a.eb -r
== temporary log file in case of crash /tmp/eb-E8UhpC/easybuild-BQB1yp.log
== Core/GCC/4.9.3-2.25 is already installed (module found), skipping
== MPI/GCC/4.9.3-2.25/OpenMPI/1.10.2/HPL/2.1 is already installed (module found), skipping
== MPI/GCC/4.9.3-2.25/OpenMPI/1.10.2/Python/2.7.11 is already installed (module found), skipping
== MPI/GCC/4.9.3-2.25/OpenMPI/1.10.2/OpenFOAM/3.0.0 is already installed (module found), skipping
== No easyconfigs left to be built.
== Build succeeded for 0 out of 0
== Temporary log file(s) /tmp/eb-E8UhpC/easybuild-BQB1yp.log* have been removed.
== Temporary directory /tmp/eb-E8UhpC has been removed.

real    2m0.300s
user    0m58.214s
sys     0m19.910s
[fgeorgatos@node002 ~]$

However, I still had the following failures in the test framework, everything else has worked:

This last one seems to have failed because it was trying to modify a read-only filesystem, which is indeed where EasyBuild was loaded from. If so, it is correct that it failed! Is that intended behavior? :-(
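
To confirm the read-only suspicion before a test round, the location can be checked from the shell. The following is a minimal sketch; `is_writable_dir` is a hypothetical helper, and the path in the usage comment is a placeholder, not the actual install location on this cluster.

```bash
#!/bin/bash
# Succeed only if the argument is an existing directory that the current user
# can write to. On a read-only mount the -w test should fail even when the
# permission bits would otherwise allow writing.
is_writable_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Example usage (placeholder path, substitute the real install location):
#   is_writable_dir /path/to/easybuild/installation || echo "read-only or missing" >&2
```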

fgeorgatos commented 7 years ago

OK: EasyBuild testing went reasonably well, too:

[fotis@demo2 Lmod_test_suite]$ cat eb_runme.sh
#!/bin/bash

# stop on any error
set -e
# simply check on whether 'eb' command is available, and whether it runs
eb --version
# run EasyBuild framework test suite, using Lmod as modules tool
# (this takes a while, i.e. ~10-20m)
export TEST_EASYBUILD_MODULES_TOOL=Lmod
python -O -m test.framework.suite
# check if easyblocks are available
eb --list-easyblocks | grep ConfigureMake
# check if easyconfigs are available
eb --search ^intel | grep intel-2016a.eb
[fotis@demo2 Lmod_test_suite]$
[fotis@demo2 Lmod_test_suite]$ ml EasyBuild
[fotis@demo2 Lmod_test_suite]$ eb --version
This is EasyBuild 3.2.1 (framework: 3.2.1, easyblocks: 3.2.1) on host demo2.
[fotis@demo2 Lmod_test_suite]$ time ./eb_runme.sh
This is EasyBuild 3.2.1 (framework: 3.2.1, easyblocks: 3.2.1) on host demo2.
INFO: This is (based on) vsc.install.shared_setup 0.10.26
WARNING: xmlrunner module not available, falling back to using unittest...

......Deprecated functionality, will no longer work in v10000000000000: almost kaput; see http://easybuild.readthedocs.org/en/latest/Deprecated-functionality.html for more information
Deprecated functionality, will no longer work in v10000000000000: almost kaput; see http://easybuild.readthedocs.org/en/latest/Deprecated-functionality.html for more information
Deprecated functionality, will no longer work in v10000000000000: almost kaput; see http://easybuild.readthedocs.org/en/latest/Deprecated-functionality.html for more information
Deprecated functionality, will no longer work in v10000000000000: almost kaput; see http://easybuild.readthedocs.org/en/latest/Deprecated-functionality.html for more information
Deprecated functionality, will no longer work in v10000000000000: almost kaput; see http://easybuild.readthedocs.org/en/latest/Deprecated-functionality.html for more information
........Skipping test_check_style, since pep8 is not available
...........Skipping test_empty_pr, no GitHub token available?
......Skipping test_from_pr, no GitHub token available?
.Skipping test_from_pr, no GitHub token available?
.Skipping test_from_pr_x, no GitHub token available?
...................Skipping test_new_pr_delete, no GitHub token available?
.Skipping test_new_pr_dependencies, no GitHub token available?
.Skipping test_new_update_pr, no GitHub token available?
........Skipping test_review_pr, no GitHub token available?
.....................(skipping GitRepository test)
.(skipping HgRepository test)
..(skipping SvnRepository test)
................................Skipping test_dep_graph, since pygraph is not available
....Skipping test_dump_autopep8, since autopep8 is not available
..............Deprecated functionality, will no longer work in v4.0: Named argument 'default_fallback' for get_easyblock_class is deprecated, use 'error_on_missing_easyblock' instead; see http://easybuild.readthedocs.org/en/latest/Deprecated-functionality.html for more information
......................................................................................................................................Skipping test_from_pr, no GitHub token available?
..............== installing extension ext1  (1/1)...
.................== installing extension ext1  (1/1)...
...Skipping test_download_repo, no GitHub token available?
.Skipping test_fetch_easyconfigs_from_pr, no GitHub token available?
.Skipping test_fetch_latest_commit_sha, no GitHub token available?
.Skipping test_find_easybuild_easyconfig, no GitHub token available?
..Skipping test_install_github_token, no GitHub token available?
.Skipping test_read, no GitHub token available?
.Skipping test_read_api, no GitHub token available?
.Skipping test_validate_github_token, no GitHub token available?
.Skipping test_walk, no GitHub token available?
...........................................................................................................................................GC3Pie not available, skipping test
......................................Skipping trailing whitespace checks (no pycodestyle or pep8 available)
.Skipping style checks (no pycodestyle or pep8 available)
.
----------------------------------------------------------------------
Ran 492 tests in 1748.390s

OK
|-- ConfigureMake
|   |-- ConfigureMakePythonPackage
|   |   |-- ConfigureMakePythonPackage
|   |   |-- ConfigureMakePythonPackage
 * /home/fotis/.local/easybuild/software/EasyBuild/3.2.1/lib/python2.7/site-packages/easybuild_easyconfigs-3.2.1-py2.7.egg/easybuild/easyconfigs/i/intel/intel-2016a.eb

real    29m11.506s
user    21m31.697s
sys     7m3.800s
[fotis@demo2 Lmod_test_suite]$ date
Tue Jun 20 23:40:11 CEST 2017
[fotis@demo2 Lmod_test_suite]$

[fotis@demo2 Lmod_test_suite]$ time tm .
TM Version: 1.7

Starting Tests:

     Started : 23:46:15 tst: 1/5 P/F: 0:0, rt/avail/does_avail_work/t1
      passed : 23:46:15 tst: 1/5 P/F: 1:0, rt/avail/does_avail_work/t1

     Started : 23:46:15 tst: 2/5 P/F: 1:0, rt/int_subshell/does_an_int_subshell_work/t1
        diff : 23:46:16 tst: 2/5 P/F: 1:1, rt/int_subshell/does_an_int_subshell_work/t1

     Started : 23:46:16 tst: 3/5 P/F: 1:1, rt/load/does_load_work/t1
      passed : 23:46:16 tst: 3/5 P/F: 2:1, rt/load/does_load_work/t1

     Started : 23:46:16 tst: 4/5 P/F: 2:1, rt/subshell/does_a_subshell_work/t1
      passed : 23:46:17 tst: 4/5 P/F: 3:1, rt/subshell/does_a_subshell_work/t1

     Started : 23:46:17 tst: 5/5 P/F: 3:1, rt/exist/does_module_exist/t1
      passed : 23:46:18 tst: 5/5 P/F: 4:1, rt/exist/does_module_exist/t1

Finished Tests

**************************************************************************************************************************************
*** Test Results                                                                                                                   ***
**************************************************************************************************************************************

Date:             Tue Jun 20 23:46:15 2017
TARGET:
Tag:              2017_06_20
TM Version:       1.7
Hermes Version:   2.6
Lua Version:      Lua 5.1
Total Test Time:  00:00:02.93

**************************************************************************************************************************************
*** Test Summary                                                                                                                   ***
**************************************************************************************************************************************

Total:   5
diff:    1
passed:  4

*******  *  ****   *********                                     ***************
Results  R  Time   Test Name                                     version/message
*******  *  ****   *********                                     ***************
passed   R  0.454  rt/avail/does_avail_work/t1
passed   R  0.373  rt/exist/does_module_exist/t1
passed   R  0.54   rt/load/does_load_work/t1
passed   R  0.881  rt/subshell/does_a_subshell_work/t1
diff     R  0.681  rt/int_subshell/does_an_int_subshell_work/t1

*******  ****************
Results  Output Directory
*******  ****************
diff     /home/fotis/Lmod_test_suite/rt/int_subshell/t1/2017_06_20_23_46_15-Linux-x86_64-does_a_subshell_work

real    0m3.048s
user    0m1.889s
sys     0m1.198s
[fotis@demo2 Lmod_test_suite]$
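
The framework suite output above ends in "OK", but a fair number of tests were skipped (no GitHub token, no pep8, and so on). If the output is saved to a file, the skips can be counted quickly. This is a rough sketch: `count_skipped` and the `suite.log` file name are hypothetical, and the count is per matching line, not per test.

```bash
#!/bin/bash
# Count the lines in saved suite output that mention a skipped test.
# The match is case-insensitive because the output uses both "Skipping"
# and "(skipping ..." forms.
count_skipped() {
    grep -ci 'skipping' "$1"
}

# Example usage (suite.log is a placeholder file name):
#   python -O -m test.framework.suite 2>&1 | tee suite.log
#   count_skipped suite.log
```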

boegel commented 7 years ago

@fgeorgatos So, now all tests passed? What's different from before, only the read-only filesystem?

fgeorgatos commented 7 years ago

Wait, this is a different system.

One more testing round is still missing, ideally with your 3.3.0dev branch (I'm sure you've done something there) on a parallel read-only filesystem.

boegel commented 7 years ago

@fgeorgatos Haven't had time to try and fix the broken tests you reported, still hoping to find time for that...

fgeorgatos commented 7 years ago

OK, at least I hope we will have fixed the one complaining about the read-only filesystem. Testing should never attempt to write to a third location, and even if it does, it should not fail!

F.