Closed mhoemmen closed 3 years ago
@mhoemmen, the checkin-test.py script is not dead. I use it every day to test ATDM Trilinos builds.
Can you attach the generated file MPI_DEBUG_EX/configure.out?
Thanks @bartlettroscoe ! Here are the relevant bits of that file. This may be a CMake options issue.
-- Found TPL 'Boost' include dirs '/projects/sems/install/rhel6-x86_64/sems/tpl/boost/1.59.0/gcc/4.9.3/base/include'
-- TPL_Boost_INCLUDE_DIRS='/projects/sems/install/rhel6-x86_64/sems/tpl/boost/1.59.0/gcc/4.9.3/base/include'
Processing enabled TPL: ParMETIS (enabled explicitly, disable with -DTPL_ENABLE_ParMETIS=OFF)
-- ParMETIS_LIBRARY_NAMES='parmetis;metis'
-- Searching for libs in ParMETIS_LIBRARY_DIRS='/projects/sems/install/rhel6-x86_64/sems/tpl/parmetis/4.0.3/gcc/4.9.3/openmpi/1.10.1/lib'
-- Searching for a lib in the set "parmetis":
-- Searching for lib 'parmetis' ...
-- NOTE: Did not find a lib in the lib set "parmetis" for the TPL 'ParMETIS'!
-- ERROR: Could not find the libraries for the TPL 'ParMETIS'!
-- TIP: If the TPL 'ParMETIS' is on your system then you can set:
-DParMETIS_LIBRARY_DIRS='<dir0>;<dir1>;...'
to point to the directories where these libraries may be found.
Or, just set:
-DTPL_ParMETIS_LIBRARIES='<path-to-libs0>;<path-to-libs1>;...'
to point to the full paths for the libraries which will
bypass any search for libraries and these libraries will be used without
question in the build. (But this will result in a build-time error
if not all of the necessary symbols are found.)
-- ERROR: Failed finding all of the parts of TPL 'ParMETIS' (see above), Aborting!
-- Performing Test HAVE_PARMETIS_VERSION_4_0_3
-- Performing Test HAVE_PARMETIS_VERSION_4_0_3 - Success
-- NOTE: The find module file for this failed TPL 'ParMETIS' is:
/scratch/prj/Trilinos/Trilinos/cmake/TPLs/FindTPLParMETIS.cmake
which is pointed to in the file:
/scratch/prj/Trilinos/Trilinos/TPLsList.cmake
TIP: Even though the TPL 'ParMETIS' was explicitly enabled in input,
it can be disabled with:
-DTPL_ENABLE_ParMETIS=OFF
which will disable it and will recursively disable all of the
downstream packages that have required dependencies on it.
When you reconfigure, just grep the cmake stdout for 'ParMETIS'
and then follow the disables that occur as a result to see what impact
this TPL disable has on the configuration of Trilinos.
CMake Error at cmake/tribits/core/package_arch/TribitsProcessEnabledTpl.cmake:144 (MESSAGE):
ERROR: TPL_ParMETIS_NOT_FOUND=TRUE, aborting!
Call Stack (most recent call first):
cmake/tribits/core/package_arch/TribitsGlobalMacros.cmake:1711 (TRIBITS_PROCESS_ENABLED_TPL)
cmake/tribits/core/package_arch/TribitsProjectImpl.cmake:202 (TRIBITS_PROCESS_ENABLED_TPLS)
cmake/tribits/core/package_arch/TribitsProject.cmake:93 (TRIBITS_PROJECT_IMPL)
CMakeLists.txt:90 (TRIBITS_PROJECT)
-- Configuring incomplete, errors occurred!
@mhoemmen can you please provide exact instructions to reproduce this error?
Using the standard checkin-test-sems.sh script, I was not able to reproduce this problem. For Trilinos 'develop' version 72985ec:
72985ec "Merge Pull Request #3621 from bartlettroscoe/Trilinos/3611-remove-sundance-inserted-package"
Author: trilinos-autotester <trilinos-autotester@trilinos.org>
Date: Sun Oct 14 19:46:11 2018 -0600 (22 hours ago)
I ran:
$ ./checkin-test-sems.sh --enable-packages=Teuchos --no-enable-fwd-packages --local-do-all
and the configure output showed:
Processing enabled TPL: ParMETIS (enabled explicitly, disable with -DTPL_ENABLE_ParMETIS=OFF)
-- ParMETIS_LIBRARY_NAMES='parmetis;metis'
-- Searching for libs in ParMETIS_LIBRARY_DIRS='/projects/sems/install/rhel6-x86_64/sems/tpl/parmetis/4.0.3/gcc/4.8.4/openmpi/1.10.1/parallel/lib'
-- Searching for a lib in the set "parmetis":
-- Searching for lib 'parmetis' ...
-- Found lib '/projects/sems/install/rhel6-x86_64/sems/tpl/parmetis/4.0.3/gcc/4.8.4/openmpi/1.10.1/parallel/lib/libparmetis.a'
-- Searching for a lib in the set "metis":
-- Searching for lib 'metis' ...
-- Found lib '/projects/sems/install/rhel6-x86_64/sems/tpl/parmetis/4.0.3/gcc/4.8.4/openmpi/1.10.1/parallel/lib/libmetis.a'
-- TPL_ParMETIS_LIBRARIES='/projects/sems/install/rhel6-x86_64/sems/tpl/parmetis/4.0.3/gcc/4.8.4/openmpi/1.10.1/parallel/lib/libparmetis.a;/projects/sems/install/rhel6-x86_64/sems/tpl/parmetis/4.0.3/gcc/4.8.4/openmpi/1.10.1/parallel/lib/libmetis.a'
-- TPL_ParMETIS_INCLUDE_DIRS='/projects/sems/install/rhel6-x86_64/sems/tpl/parmetis/4.0.3/gcc/4.8.4/openmpi/1.10.1/parallel/include'
-- Performing Test HAVE_PARMETIS_VERSION_4_0_3
-- Performing Test HAVE_PARMETIS_VERSION_4_0_3 - Success
Processing enabled TPL: Zlib (enabled explicitly, disable with -DTPL_ENABLE_Zlib=OFF)
...
Finished configuring Trilinos!
-- Configuring done
-- Generating done
-- Build files have been written to: /home/rabartl/Trilinos.base/BUILDS/CHECKIN/MPI_RELEASE_DEBUG_SHARED_PT_OPENMP
@bartlettroscoe Let me see if I need to fix my CMake options -- thanks!
@mhoemmen, note that there is no automated testing that I know of with sems-cmake/3.12.2, so there might be a defect in that version of CMake (or a change in behavior that is causing this problem). Not sure, but I could try to reproduce if you give me exact reproducibility instructions.
I can confirm that the latest changes have broken the script, although this may have to do with the default sems environments and not just cmake. I get a bunch of errors like:
sems-openmpi/1.10.1(34):ERROR:102: Tcl command execution failed: if {[module-info mode switch]} {
set local_compiler_version $env(SEMS_OPENMPI_LOCAL_COMPILER_VERSION)
} elseif {[module-info mode remove]} {
set local_compiler_version $env(SEMS_OPENMPI_LOCAL_COMPILER_VERSION)
unsetenv SEMS_OPENMPI_LOCAL_COMPILER_VERSION
} else {
set local_compiler_version [semsModuleSupport::getCurrentVersion gcc]
setenv SEMS_OPENMPI_LOCAL_COMPILER_VERSION $local_compiler_version
}
for every loaded sems module. Have the requirements changed in terms of how to load modules prior to running these scripts? It all used to be automatic. Is there a module purge needed somewhere?
@bartlettroscoe @mhoemmen @trilinos/framework OK, this is actually pretty bad. After running the script, 'module purge' in my terminal window fails. Is there a new requirement on how a bashrc file must be set up if we want to use the checkin script? Mine simply contains
module load sems-devpack-gcc/6.1.0
module load sems-gdb
module load sems-doxygen
module load sems-git
module load sems-cmake
module load sems-tex
module load sems-subversion
After I run the script, and run module purge, I get a bunch of errors. Then, module list returns:
Currently Loaded Modulefiles:
1) /projects/sems/modulefiles/rhel6-x86_64/sems/compiler/sems-gcc/6.1.0
2) /projects/sems/modulefiles/rhel6-x86_64/sems/compiler/sems-openmpi/1.10.1
3) /projects/sems/modulefiles/rhel6-x86_64/sems/compiler/sems-python/2.7.9
4) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-boost/1.63.0/base
5) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-hdf5/1.8.12/parallel
6) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-netcdf/4.4.1/exo_parallel
7) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-parmetis/4.0.3/64bit_parallel
8) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-scotch/6.0.3/nopthread_64bit_parallel
9) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-superlu/5.2.1/base
10) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-yaml_cpp/0.5.3/base
11) /projects/sems/modulefiles/rhel6-x86_64/sems/tpl/sems-zlib/1.2.8/base
Contrast that with running module list in the terminal before the script is run:
Currently Loaded Modulefiles:
1) sems-env 12) sems-gcc/6.1.0
2) sems-devpack-gcc/6.1.0 13) sems-openmpi/1.10.1
3) sems-gdb/7.9.1 14) sems-python/2.7.9
4) sems-doxygen/1.8.8 15) sems-boost/1.63.0/base
5) sems-git/2.10.1 16) sems-hdf5/1.8.12/parallel
6) sems-cmake/3.10.3 17) sems-netcdf/4.4.1/exo_parallel
7) sems-tex/2015 18) sems-parmetis/4.0.3/64bit_parallel
8) sems-apr/1.5.2 19) sems-scotch/6.0.3/nopthread_64bit_parallel
9) sems-apr_util/1.5.4 20) sems-superlu/5.2.1/base
10) sems-serf/1.3.8 21) sems-yaml_cpp/0.5.3/base
11) sems-subversion/1.7.19 22) sems-zlib/1.2.8/base
Note the differences in the presence/absence of directory prefixes. Something doesn't add up ...
Another piece of information: module purge fails in the terminal window regardless of the checkin script, with a bunch of messages of the type
Tcl command execution failed
Is this a SEMS issue?
@dridzal, let me take a look and see what is happening by running the checkin-test-sems.sh script. Stay tuned.
@bartlettroscoe , before you do that, just try running module purge in your terminal window. I get a bunch of errors. I load modules through bashrc. This started happening after I logged out and logged back into the system (so you may have to do the same). My modules are:
module load sems-devpack-gcc/6.1.0
module load sems-gdb
module load sems-doxygen
module load sems-git
module load sems-cmake
module load sems-tex
module load sems-subversion
I have filed an issue with SEMS.
@dridzal,
On my CEE LAN RHEL6 machine 'ceerws1113' that loads the SEMS NFS env, I just ran:
$ module purge
[rabartl@ceerws1113 Trilinos (develop)]$ . cmake/load_sems_dev_env.sh
[rabartl@ceerws1113 Trilinos (develop)]$ module list
Currently Loaded Modulefiles:
1) sems-env 6) atdm-ninja_fortran/1.7.2 11) sems-hdf5/1.8.12/parallel
2) atdm-env 7) sems-gcc/4.8.4 12) sems-netcdf/4.4.1/exo_parallel
3) sems-python/2.7.9 8) sems-openmpi/1.10.1 13) sems-parmetis/4.0.3/parallel
4) atdm-cmake/3.11.1 9) sems-boost/1.63.0/base 14) sems-scotch/6.0.3/nopthread_64bit_parallel
5) sems-git/2.10.1 10) sems-zlib/1.2.8/base 15) sems-superlu/4.3/base
And then I did:
. cmake/load_sems_dev_env.sh sems-gcc/6.1.0
[rabartl@ceerws1113 Trilinos (develop)]$ module list
Currently Loaded Modulefiles:
1) sems-env 6) atdm-ninja_fortran/1.7.2 11) sems-hdf5/1.8.12/parallel
2) atdm-env 7) sems-gcc/6.1.0 12) sems-netcdf/4.4.1/exo_parallel
3) sems-python/2.7.9 8) sems-openmpi/1.10.1 13) sems-parmetis/4.0.3/parallel
4) atdm-cmake/3.11.1 9) sems-boost/1.63.0/base 14) sems-scotch/6.0.3/nopthread_64bit_parallel
5) sems-git/2.10.1 10) sems-zlib/1.2.8/base 15) sems-superlu/4.3/base
@bartlettroscoe I think the issue may be in loading the modules automatically through bashrc. Do you do this, or do you always load them manually? If I comment out the modules in bashrc, log out, and log back in, and then manually load the modules, it all seems to work. I don't know why this behavior has changed in the last few weeks.
Hum, looks like something is wrong with the SEMS modules on the CEE LAN. For the standard GCC 4.8.4 build for Trilinos with:
$ ./checkin-test-sems.sh --enable-packages=Kokkos --no-enable-fwd-packages \
--local-do-all --wipe-clean
I get all failing tests:
Configure: Passed (0.25 min)
Build: Passed (2.45 min)
Test: FAILED (0.01 min)
0% tests passed, 27 tests failed out of 27
Subproject Time Summary:
Kokkos = 3.96 sec*proc (27 tests)
Total Test time (real) = 0.46 sec
The following tests FAILED:
1 - KokkosCore_UnitTest_Serial_MPI_1 (Failed)
2 - KokkosCore_UnitTest_OpenMP_MPI_1 (Failed)
3 - KokkosCore_UnitTest_OpenMPInterOp_MPI_1 (Failed)
4 - KokkosCore_UnitTest_Default_MPI_1 (Failed)
5 - KokkosCore_UnitTest_PushFinalizeHook_MPI_1 (Failed)
6 - KokkosCore_UnitTest_PushFinalizeHook_terminate (Failed)
7 - KokkosCore_UnitTest_DefaultInit_1_MPI_1 (Failed)
8 - KokkosCore_UnitTest_DefaultInit_2_MPI_1 (Failed)
9 - KokkosCore_UnitTest_DefaultInit_3_MPI_1 (Failed)
10 - KokkosCore_UnitTest_DefaultInit_4_MPI_1 (Failed)
11 - KokkosCore_UnitTest_DefaultInit_5_MPI_1 (Failed)
12 - KokkosCore_UnitTest_DefaultInit_6_MPI_1 (Failed)
13 - KokkosCore_UnitTest_DefaultInit_7_MPI_1 (Failed)
14 - KokkosCore_UnitTest_DefaultInit_8_MPI_1 (Failed)
15 - KokkosCore_UnitTest_DefaultInit_9_MPI_1 (Failed)
16 - KokkosCore_UnitTest_DefaultInit_10_MPI_1 (Failed)
17 - KokkosCore_UnitTest_DefaultInit_11_MPI_1 (Failed)
18 - KokkosCore_UnitTest_DefaultInit_12_MPI_1 (Failed)
19 - KokkosCore_UnitTest_DefaultInit_13_MPI_1 (Failed)
20 - KokkosCore_UnitTest_DefaultInit_14_MPI_1 (Failed)
21 - KokkosCore_UnitTest_DefaultInit_15_MPI_1 (Failed)
22 - KokkosCore_UnitTest_DefaultInit_16_MPI_1 (Failed)
23 - KokkosCore_UnitTest_HWLOC_MPI_1 (Failed)
24 - KokkosCore_UnitTest_HostBarrier_MPI_1 (Failed)
25 - KokkosContainers_UnitTest_Serial_MPI_1 (Failed)
26 - KokkosContainers_UnitTest_OpenMP_MPI_1 (Failed)
27 - KokkosAlgorithms_UnitTest_MPI_1 (Failed)
Errors while running CTest
The runtime error shows:
1: /scratch/rabartl/Trilinos.base/BUILDS/CHECKIN/MPI_RELEASE_DEBUG_SHARED_PT_OPENMP/packages/kokkos/core/unit_test/KokkosCore_UnitTest_Serial.exe: /projects/sems/install/rhel6-x86_64/sems/compiler/gcc/4.8.4/base/lib64/libgomp.so.1: version `GOMP_4.0' not found (required by /scratch/rabartl/Trilinos.base/BUILDS/CHECKIN/MPI_RELEASE_DEBUG_SHARED_PT_OPENMP/packages/kokkos/core/unit_test/KokkosCore_UnitTest_Serial.exe)
I am guessing that they updated the CEE LAN RHEL6 machines and some of the SEMS envs are now broken on these machines?
How is it that the Trilinos PR builds are not showing problems like this?
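The `GOMP_4.0' not found message means that the libgomp.so.1 found at run time does not define the GOMP_4.0 symbol version that the executable was linked against. A minimal sketch for checking which library the loader resolves, assuming a Linux system with ldd; the function name resolved_lib is hypothetical and it only parses ldd-style text on stdin:

```shell
# Hypothetical helper: print the path that ldd reports for one shared library.
# Reads ldd-style output on stdin, for example:
#   libgomp.so.1 => /some/path/libgomp.so.1 (0x...)
resolved_lib() {
  awk -v name="$1" '$1 == name && $2 == "=>" { print $3 }'
}
```

One possible use is `ldd ./KokkosCore_UnitTest_Serial.exe | resolved_lib libgomp.so.1`. Running `strings` (from binutils, if installed) on the reported path and filtering for `GOMP_` should then list the symbol versions that library provides.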
@dridzal, on my CEE LAN machine 'ceerws1113', I don't do anything to get the SEMS modules defined in my .bashrc file or my .bash_profile file. As soon as I log into that machine I have:
$ module list
No Modulefiles Currently Loaded.
$ module load sems-env
$ module load sems-git/2.10.1
$ module list
Currently Loaded Modulefiles:
1) sems-env 2) sems-git/2.10.1
What happens for you?
FYI: The sems-gcc/7.3.0 env seems to be fine on CEE RHEL6 machines. I just did:
$ . /scratch/rabartl/Trilinos.base/Trilinos/cmake/load_sems_dev_env.sh sems-gcc/7.3.0
$ module list
Currently Loaded Modulefiles:
1) sems-env 6) atdm-ninja_fortran/1.7.2 11) sems-hdf5/1.8.12/parallel
2) atdm-env 7) sems-gcc/7.3.0 12) sems-netcdf/4.4.1/exo_parallel
3) sems-python/2.7.9 8) sems-openmpi/1.10.1 13) sems-parmetis/4.0.3/parallel
4) atdm-cmake/3.11.1 9) sems-boost/1.63.0/base 14) sems-scotch/6.0.3/nopthread_64bit_parallel
5) sems-git/2.10.1 10) sems-zlib/1.2.8/base 15) sems-superlu/4.3/base
$ env TRILINOS_CHECKIN_TEST_SEMS_SKIP_MODULE_LOAD=1 \
./checkin-test-sems.sh --enable-packages=Kokkos --no-enable-fwd-packages --local-do-all \
--wipe-clean
and it returned:
Configure: Passed (0.22 min)
Build: Passed (2.71 min)
Test: Passed (0.61 min)
100% tests passed, 0 tests failed out of 27
Subproject Time Summary:
Kokkos = 106.76 sec*proc (27 tests)
Total Test time (real) = 36.86 sec
I don't understand how the Trilinos GCC 4.8.4 PR build is not also broken like this.
@bartlettroscoe Well, if I don't put anything into bashrc then I can load the modules manually like you did and things seem to work. But that means that every time I open a terminal I would have to load the modules. Even if I'm not working on code development, I need tex or git or subversion, etc. What is the proper way to automatically load modules?
It's somewhat to be expected that you might have to purge & reload modules now and then, no?
I logged out and logged back in again, and this time I ran:
$ ./checkin-test-sems.sh --enable-packages=Kokkos --no-enable-fwd-packages \
--local-do-all --wipe-clean
which returned:
Configure: Passed (0.18 min)
Build: Passed (2.22 min)
Test: Passed (0.71 min)
100% tests passed, 0 tests failed out of 27
Subproject Time Summary:
Kokkos = 117.39 sec*proc (27 tests)
Total Test time (real) = 42.38 sec
So everything seems to be okay.
@mhoemmen the issue is that if I load the modules in bashrc, then 'module purge' fails. I believe that the checkin script will attempt a purge when loading the sems environment -- right, @bartlettroscoe ? I guess I could write an alias to load the modules I need, and then run that alias command every time I open a terminal window, but that's still inconvenient. If I load the modules in bashrc then 'module purge' fails. 'module clear' will work, but you're prompted for confirmation, so we can't replace purge with clear in the checkin script.
the checkin script will attempt a purge when loading the sems environment -- right
Correct. It has to.
My CEE LAN setup loads the modules for .bash_profile by default and I have no problem. I can send you my .bashrc and .bash_profile files from my CEE RHEL6 machine offline for you to view and perhaps try out.
FYI:
After I re-synced the SEMS env to my CSRI RHEL6 machine, I logged back in again and did:
$ ./checkin-test-sems.sh --enable-packages=Kokkos --no-enable-fwd-packages \
--local-do-all --wipe-clean
and it passed 100% with:
Configure: Passed (0.11 min)
Build: Passed (0.89 min)
Test: Passed (0.67 min)
100% tests passed, 0 tests failed out of 27
Subproject Time Summary:
Kokkos = 108.49 sec*proc (27 tests)
Total Test time (real) = 40.31 sec
So that is it. You just need to make sure your SEMS env is up to date and you need to log out and log back in again and everything should work on a CEE RHEL6 machine and a CSRI (COE) RHEL6 machine using the SEMS modules.
@bartlettroscoe thanks for your input; following your example, I moved the module loads to the .bash_profile script (from the .bashrc script). After logging out, this seems to have done the trick, i.e., 'module purge' works again and the checkin script can proceed. The question remains why this behavior changed on my RHEL6 machine, and why it coincided with the move to cmake 3.10.0. In any case, this is resolved.
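The arrangement that worked here can be sketched as follows. This is a minimal sketch, assuming an Environment Modules setup in which the module command is already defined for login shells; the function name load_dev_modules is hypothetical, and the module list is the one quoted earlier in this thread.

```shell
# Intended for ~/.bash_profile (read by login shells), not ~/.bashrc.
# Hypothetical helper; the module names are the ones listed earlier in the thread.
load_dev_modules() {
  # Do nothing if the module command has not been set up in this shell.
  if ! command -v module >/dev/null 2>&1; then
    echo "module command not available; skipping module loads" >&2
    return 1
  fi
  for m in sems-devpack-gcc/6.1.0 sems-gdb sems-doxygen sems-git \
           sems-cmake sems-tex sems-subversion; do
    module load "$m" || return 1
  done
}
```

Calling load_dev_modules at the end of ~/.bash_profile and then logging out and back in would reproduce the setup described above. The thread does not establish why loading from .bashrc interfered with 'module purge', so this only mirrors the reported fix.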
@dridzal, glad to hear this is resolved for you. Is this issue ready to close?
Yes, closed.
@dridzal, are there problems related to this that are not yet resolved?
@bartlettroscoe I'm reopening this issue and adding @fryeguy52 . There seems to be a fundamental difference in how the environment variables are processed by cmake or FindTPL after the cmake upgrade. The issue has been confirmed under RHEL6 by me and under RHEL7 by @fryeguy52 . For example, before this upgrade, I would load the modules, and cmake (and/or FindTPL) would find all TPLs. Now, I have to add the following lines to my configure script or my bash_profile:
export NetCDF_ROOT=$SEMS_NETCDF_ROOT
export PNetCDF_ROOT=$SEMS_NETCDF_ROOT
export HDF5_ROOT=$SEMS_HDF5_ROOT
export Boost_ROOT=$SEMS_BOOST_ROOT
export BOOST_ROOT=$SEMS_BOOST_ROOT
Obviously, doing this for every TPL and maintaining the script is error-prone. Just figuring out the capitalization of the variables is a huge pain. The question here is why NetCDF_ROOT, etc., were found prior to the cmake upgrade, and why they now must be manually set to the corresponding SEMS variable (SEMS_NETCDF_ROOT, etc.). On a related note, this doesn't seem to be an issue for check-in testing (!!), so I wonder how you got around manually setting the expected TPL variables to their SEMS equivalents. Further, when you load the SEMS modules, does
echo $NetCDF_ROOT
give you anything?
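For reference, the manual mapping described above could be collected in one place. This is a workaround sketch only, not something Trilinos or SEMS provides: the function name map_sems_roots is hypothetical, and the variable pairs are just the ones listed in this comment.

```shell
# Hypothetical helper: export each <TPL>_ROOT from its SEMS_<TPL>_ROOT counterpart.
# The pairs below are the ones quoted above; extend the list as needed.
map_sems_roots() {
  for pair in \
      NetCDF_ROOT:SEMS_NETCDF_ROOT \
      PNetCDF_ROOT:SEMS_NETCDF_ROOT \
      HDF5_ROOT:SEMS_HDF5_ROOT \
      Boost_ROOT:SEMS_BOOST_ROOT \
      BOOST_ROOT:SEMS_BOOST_ROOT; do
    tpl_var=${pair%%:*}    # name before the colon
    sems_var=${pair##*:}   # name after the colon
    eval "value=\${$sems_var:-}"   # read the variable whose name is in sems_var
    if [ -n "$value" ]; then
      export "$tpl_var=$value"
    else
      echo "warning: $sems_var is not set; skipping $tpl_var" >&2
    fi
  done
}
```

Calling map_sems_roots after the SEMS modules are loaded exports only the variables whose SEMS counterpart exists and warns about the rest, which at least makes a missing module visible before cmake runs.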
@dridzal, my advice is to follow the example of the Trilinos/cmake/std/sems/SEMSDevEnv.cmake file and just explicitly set the include directories and libraries. That removes any dependence on changes in find behavior between different versions of CMake or the find modules. Changes in auto-find behavior are the number-one cause of porting problems with the configuration and building of software.
Can't you just use the SEMS env modules for your work?
Now, I have to add the following lines to my configure script or my bash_profile:
export NetCDF_ROOT=$SEMS_NETCDF_ROOT
export PNetCDF_ROOT=$SEMS_NETCDF_ROOT
export HDF5_ROOT=$SEMS_HDF5_ROOT
export Boost_ROOT=$SEMS_BOOST_ROOT
export BOOST_ROOT=$SEMS_BOOST_ROOT
@dridzal Fwiw, I've always specified TPLs locations in a base configure script:
-D SuperLU_LIBRARY_DIRS:PATH="${SEMS_SUPERLU_LIBRARY_PATH}"
-D SuperLU_INCLUDE_DIRS:PATH="${SEMS_SUPERLU_INCLUDE_PATH}"
All my other configure scripts call this base script. Whether a TPL is actually enabled is handled in the upper scripts, e.g., -D TPL_ENABLE_SuperLU:BOOL=ON.
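The base-script pattern described above might look roughly like this. It is a sketch under assumptions, not the actual scripts: the function name base_configure is hypothetical, a shell function stands in for a separate base script, and only the two SuperLU options quoted above are shown.

```shell
# Hypothetical base configure helper: TPL locations are set here, once.
# Callers pass TPL enables, package enables, and the source directory as arguments.
base_configure() {
  cmake \
    -D SuperLU_LIBRARY_DIRS:PATH="${SEMS_SUPERLU_LIBRARY_PATH:-}" \
    -D SuperLU_INCLUDE_DIRS:PATH="${SEMS_SUPERLU_INCLUDE_PATH:-}" \
    "$@"
}
```

An upper script would then call something like `base_configure -D TPL_ENABLE_SuperLU:BOOL=ON <path-to-Trilinos>`, so the location of each TPL is written down in exactly one place.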
my advice is to follow the example of the Trilinos/cmake/std/sems/SEMSDevEnv.cmake file and just explicitly set the include directories and libraries.
@bartlettroscoe Looking at this file, you set, for example,
SEMS_SELECT_TPL_ROOT_DIR(NETCDF Netcdf_ROOT
PARALLEL_EXT "exo_parallel" SERIAL_EXT "exo")
#PRINT_VAR(Netcdf_ROOT)
SET(TPL_Netcdf_INCLUDE_DIRS "${Netcdf_ROOT}/include;${TPL_HDF5_INCLUDE_DIRS}"
CACHE PATH "Set in SEMSDevEnv.cmake")
SET(Netcdf_LIBRARY_DIRS "${Netcdf_ROOT}/lib;${HDF5_LIBRARY_DIRS}"
CACHE PATH "Set in SEMSDevEnv.cmake")
Who gives you Netcdf_ROOT? This is not available when you load the sems modules (it may have been at some point, but it definitely no longer is). Currently only SEMS_NETCDF_ROOT is available to query.
@jhux2 it seems like this (trivial) mapping of tpl_path to SEMS_tpl_path should be automated. This used to work; I never had to specify the mapping explicitly. So, somehow, in the recent upgrades we lost some of this functionality.
Who gives you Netcdf_ROOT? This is not available when you load the sems modules (it may have been at some point, but it definitely no longer is). Currently only SEMS_NETCDF_ROOT is available to query.
@dridzal, if you dig into SEMS_SELECT_TPL_ROOT_DIR(), you will see it grabs the location of Netcdf from the env var SEMS_NETCDF_ROOT. The var Netcdf_ROOT is the returned CMake var, not an env var. That function SEMS_SELECT_TPL_ROOT_DIR() works for MPI and non-MPI builds. This approach is a little more complex but it gives you both MPI and non-MPI builds with one *.cmake file.
@bartlettroscoe , I see, thanks. So, how would I load the full SEMS env without having to worry about setting the paths manually? I basically want the full SEMS checkin test environment.
So, how would I load the full SEMS env without having to worry about setting the paths manually? I basically want the full SEMS checkin test environment.
@dridzal, everything you should need to know about the SEMS env should be explained in:
The SEMS_<TPLNAME>_ROOT vars are automatically set by the SEMS modules. So if you just source Trilinos/cmake/load_sems_dev_env.sh with the right arguments and then pass in -C $TRILINOS_DIR/cmake/std/SEMSDevEnv.cmake or (better in my opinion) -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/SEMSDevEnv.cmake then that should find all of the SEMS-supported TPLs correctly.
NOTE: We will be adding SuperLUDist and other TPLs as they are supported by SEMS (or installed in the 'trilinos' project space).
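Since the options file relies on the SEMS_<TPLNAME>_ROOT env vars, a quick check before configuring can catch an incomplete environment. This is a minimal sketch; the function name check_env_vars is hypothetical, and which variables to check is an assumption based on the names mentioned in this thread.

```shell
# Hypothetical pre-flight check: report every named environment variable
# that is unset or empty, and return nonzero if any is missing.
check_env_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"   # read the variable whose name is in $name
    if [ -z "$value" ]; then
      echo "not set: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, after sourcing load_sems_dev_env.sh, `check_env_vars SEMS_NETCDF_ROOT SEMS_HDF5_ROOT SEMS_BOOST_ROOT` would print the name of any variable that the loaded modules did not set.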
@bartlettroscoe , to recap, adding
-D Trilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/SEMSDevEnv.cmake \
to my config script should do the trick? And I wouldn't have to add any TPLs explicitly?
to recap, adding
-D Trilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/SEMSDevEnv.cmake \
to my config script should do the trick? And I wouldn't have to add any TPLs explicitly?
@dridzal, if you load the right SEMS modules, then yes. But note that you still have to enable the TPLs that you want to use such as with:
...
-D TPL_ENABLE_HDF5=ON \
-D TPL_ENABLE_Netcdf=ON \
...
If you use Trilinos/cmake/load_sems_dev_env.sh to load the SEMS modules, that should load all of the correct modules. The post-push CI build that I run and that posts to:
uses these files, so if they should ever break we would know it right away.
That is all there is to it. Let me know if that does not work.
This no longer works. I have included the configure options file line as you suggested. Netcdf is enabled because SEACAS is enabled. I get the error:
Processing enabled TPL: Netcdf (enabled by SEACASExodus, disable with -DTPL_ENABLE_Netcdf=OFF)
-- Using FIND_PACKAGE(Netcdf ...) ...
CMake Error at cmake/tribits/common_tpls/find_modules/FindNetCDF.cmake:163 (message):
Can not locate NetCDF include directory
Call Stack (most recent call first):
cmake/tribits/common_tpls/FindTPLNetcdf.cmake:66 (find_package)
cmake/tribits/core/package_arch/TribitsProcessEnabledTpl.cmake:106 (INCLUDE)
cmake/tribits/core/package_arch/TribitsGlobalMacros.cmake:1711 (TRIBITS_PROCESS_ENABLED_TPL)
cmake/tribits/core/package_arch/TribitsProjectImpl.cmake:202 (TRIBITS_PROCESS_ENABLED_TPLS)
cmake/tribits/core/package_arch/TribitsProject.cmake:93 (TRIBITS_PROJECT_IMPL)
CMakeLists.txt:90 (TRIBITS_PROJECT)
CMake Error at cmake/tribits/common_tpls/find_modules/FindNetCDF.cmake:274 (message):
Can not locate NetCDF C library
Call Stack (most recent call first):
cmake/tribits/common_tpls/FindTPLNetcdf.cmake:66 (find_package)
cmake/tribits/core/package_arch/TribitsProcessEnabledTpl.cmake:106 (INCLUDE)
cmake/tribits/core/package_arch/TribitsGlobalMacros.cmake:1711 (TRIBITS_PROCESS_ENABLED_TPL)
cmake/tribits/core/package_arch/TribitsProjectImpl.cmake:202 (TRIBITS_PROCESS_ENABLED_TPLS)
cmake/tribits/core/package_arch/TribitsProject.cmake:93 (TRIBITS_PROJECT_IMPL)
CMakeLists.txt:90 (TRIBITS_PROJECT)
-- NetCDF does not require HDF5
-- NetCDF does not require PNetCDF
-- Could NOT find NetCDF (missing: NetCDF_LIBRARIES NetCDF_INCLUDE_DIRS)
-- NetCDF Version:
-- NetCDF_NEEDS_HDF5 =
-- NetCDF_NEEDS_PNetCDF =
-- NetCDF_PARALLEL =
-- NetCDF_INCLUDE_DIRS = NetCDF_INCLUDE_DIR-NOTFOUND
-- NetCDF_LIBRARIES = NetCDF_C_LIBRARY-NOTFOUND
-- NetCDF_BINARIES = ncdump;ncgen;nccopy
-- Netcdf_LIBRARY_NAMES='netcdf'
-- Searching for libs in Netcdf_LIBRARY_DIRS=''
-- Searching for a lib in the set "netcdf":
-- Searching for lib 'netcdf' ...
-- NOTE: Did not find a lib in the lib set "netcdf" for the TPL 'Netcdf'!
-- ERROR: Could not find the libraries for the TPL 'Netcdf'!
So, it seems like this script doesn't work.
I've now tried:
source ../../cmake/load_sems_dev_env.sh sems-gcc/6.1.0
from my build directory, and I get the same error as above (missing Netcdf). My guess is that the checkin test script will no longer work either, or that if it does, it's a matter of auto-find luck.
Here is my config script:
EXTRA_ARGS=$@
cmake \
-D Trilinos_ENABLE_Fortran=OFF \
-D CMAKE_BUILD_TYPE:STRING=RELEASE \
-D TPL_ENABLE_MPI:BOOL=ON \
-D BUILD_SHARED_LIBS:BOOL=ON \
-D Trilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=ON \
-D TPL_ENABLE_Boost=ON \
-D Trilinos_ENABLE_Panzer:BOOL=ON \
-D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON \
-D Trilinos_ENABLE_TESTS:BOOL=OFF \
-D Trilinos_ENABLE_EXAMPLES:BOOL=OFF \
-D Panzer_ENABLE_TESTS:BOOL=ON \
-D Panzer_ENABLE_EXAMPLES:BOOL=ON \
-D TPL_BLAS_LIBRARIES:STRING=/usr/lib64/libblas.so.3 \
-D TPL_LAPACK_LIBRARIES:STRING=/usr/lib64/liblapack.so.3 \
$EXTRA_ARGS \
../../../Trilinos
Just try it. Load the SEMS env with
source ../../cmake/load_sems_dev_env.sh sems-gcc/6.1.0
then run the config script, and you'll immediately run into an issue with boost not being found. If you specify boost includes explicitly, and you get past them, then Netcdf won't be found. Etc.
The problem is that after running load_sems_dev_env.sh the variables Boost_INCLUDE_DIRS, NetCDF_INCLUDE_DIRS, etc., are not present in the shell. I'll try this on a CEE machine now.
@dridzal,
Sorry, there was a path typo at:
(I fixed it.)
You want to add:
-D Trilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/sems/SEMSDevEnv.cmake \
to your cmake configure line. So that would give the configure script:
#!/bin/bash
cmake \
-D Trilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/sems/SEMSDevEnv.cmake \
-D Trilinos_ENABLE_Fortran=OFF \
-D CMAKE_BUILD_TYPE:STRING=RELEASE \
-D TPL_ENABLE_MPI:BOOL=ON \
-D BUILD_SHARED_LIBS:BOOL=ON \
-D Trilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=ON \
-D TPL_ENABLE_Boost=ON \
-D Trilinos_ENABLE_Panzer:BOOL=ON \
-D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON \
-D Trilinos_ENABLE_TESTS:BOOL=OFF \
-D Trilinos_ENABLE_EXAMPLES:BOOL=OFF \
-D Panzer_ENABLE_TESTS:BOOL=ON \
-D Panzer_ENABLE_EXAMPLES:BOOL=ON \
"$@" \
../../../Trilinos
I put that in the executable script dridzal-configure and did:
$ cd <build-dir>/
$ source ../../../Trilinos/cmake/load_sems_dev_env.sh sems-gcc/6.1.0
$ time ./dridzal-configure &> configure.out
real 0m32.138s
user 0m18.816s
sys 0m12.103s
That configured successfully for me. I am running the build now but I think that has it.
Let me know if this does not work for you.
Sorry again for the typo.
@bartlettroscoe that worked, thanks! I'm still amazed that the previous builds, with cmake-3.5.2, worked and that they found all the right SEMS libraries. I see two long-term solutions for these issues. One, we provide a mapping of TPL variables to SEMS variables like @jhux2 suggested, with all possible options relevant to Trilinos, and then have one cmake script load another "base" script. Two, we just use load_sems_dev_env, with the caveat that the selection of modules would have to be expanded.
Suggestions?
FYI: The checkin-test-sems.sh script should be pretty well protected by the post-push CI build which uses these core files. Any errors in checkin-test-sems.sh not tested by that post-push CI build would be trivial to fix (as was the case previously).
Suggestions?
If you are using SEMS modules, always include cmake/std/sems/SEMSDevEnv.cmake to pull in TPLs, period. As stated at:
at the bottom in the "NOTES" section:
So just load the module any way you would like. But honestly if people can not figure out how to type module list after sourcing load_sems_dev_env.sh (which that page shows you explicitly) then I don't know if we can help them.
@bartlettroscoe btw, how do we get the latest SEMS modules on a workstation or blade? Is there an FAQ somewhere? Thanks! :-)
@bartlettroscoe this is clear; what I meant by "the selection of modules would have to be expanded" is that the load_sems_dev_env.sh script only loads a few TPLs currently. Can we expand it to load additional TPLs? There shouldn't be any harm in loading more than needed for any particular build. Is this planned already?
@mhoemmen does your question go beyond
module load sems-env
and then module-loading what you need? For a generic Trilinos build, you can bypass the manual module-loading as described above, through load_sems_dev_env.sh and by including -D Trilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/sems/SEMSDevEnv.cmake in your config script.
@mhoemmen there is also this
and a bunch of related articles.
Can we expand it to load additional TPLs? There shouldn't be any harm in loading more than needed for any particular build. Is this planned already?
@dridzal, we should update the file load_sems_dev_env.sh to load every TPL that SEMS provides that works with Trilinos and then we should update the file SEMSDevEnv.cmake to pull in the info for all of those TPLs. The only TPL that SEMS provides that is not already handled is SuperLUDist. Are there others as well? We need to expand the set of TPLs that we test Trilinos with in PR, CI and Nightly testing to include all that we can that Trilinos customers are using.
btw, how do we get the latest SEMS modules on a workstation or blade? Is there an FAQ somewhere? Thanks! :-)
@mhoemmen, you mean a CEE LAN blade? It already does. I use this SEMS Dev Env stuff all the time on my CEE LAN RHEL6 machine 'ceerws1113'.
I'll try the sems article above; thanks!
The only TPL that SEMS provides that is not already handled is SuperLUDist. Are there others as well?
I think that's it.
@trilinos/framework @bartlettroscoe
The latest changes that require CMake 3.10.0 seem to have broken the check-in test script. I invoked the script like this:
with the following modules loaded:
I get the following output:
Should I consider the check-in test script dead? It was a useful tool & I'm sad to see it go.