Open burlen opened 4 years ago
@burlen should we have this as an independent travis-ci test for the TECA_superbuild repo? or add it as an additional test in the TECA repo?
TECA_superbuild repo
@taobrienlbl can you authorize Travis-CI to integrate TECA_superbuild?
Please use a Docker image based on the latest available Ubuntu release for this.
These are the extra libraries I needed to install for the superbuild to build (on Fedora):
expat-devel (for udunits)
libffi-devel (for Python)
pcre-devel and zlib-devel (for SWIG; the zlib installed by the superbuild was not enough, zlib-devel was needed)
libtool (for MPI)
@burlen Should I add these libs to the superbuild?
I think not for libtool: it is part of the GNU OS and is better done via package managers for compatibility with other critical OS-level dependencies (i.e. compilers).
However, expat, libffi, zlib, pcre all seem reasonable additions to me.
see also #256
I had to install:
setuptools_scm (for python-dateutil)
six (for cycler)
m4 (for NetCDF on ubuntu:20.04)
There's a problem with SWIG not populating LDFLAGS in TECA_superbuild/build/SWIG-prefix/src/SWIG-build/CCache/Makefile.
Makefile:
CC=/usr/bin/cc
CFLAGS=-I/app/TECA_superbuild/build/include -O3 -march=native -mtune=native -DNDEBUG -Wall -W -I.
SWIG=swig
SWIG_LIB=../$(srcdir)/../Lib
EXEEXT=
LIBS= -lz
OBJS= ccache.o mdfour.o hash.o execute.o util.o args.o stats.o \
cleanup.o snprintf.o unify.o
HEADERS = ccache.h mdfour.h config.h config_win32.h
...
$(PACKAGE_NAME)$(EXEEXT): $(OBJS) $(HEADERS)
$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(OBJS) $(LIBS)
SWIG-build-err:
/usr/bin/ld: cannot find -lz
Installing zlib1g-dev (Ubuntu) or zlib-devel (Fedora) via the package manager fixed the problem.
Please install m4 using the package manager. It is part of the build system and should not be included in the superbuild.
Please ping me w/ a branch name when you have this pushed
zlib is already being installed by the superbuild and has been for some time. Perhaps SWIG needs a newer version of zlib. Either way we'll need to update all of the dependencies to the latest versions. Would you please do this?
I think we are using the latest version of zlib, 1.2.11, released on January 15, 2017.
Please update all the dependencies, not just zlib, to the newest version.
Sounds good
SWIG depends on pcre, not pcre2.
Okay will fix it
When you add a new package to the build, make sure you print a status message both when it is enabled and when it is disabled.
Oh I only added the enabled message. Will add the disabled as well.
test_binary_stream_mpi is failing because it's the only test that has ${MPIEXEC} -n 2 hard-coded.
After investigating the available resources on Travis-CI, I found out that it has only one core with 2 hyperthreads. That's why it's failing as OpenMPI assigns a slot per core (1 slot). To allow hyperthreading we can use mpirun --use-hwthread-cpus ...
lscpu output:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) CPU
Stepping: 7
CPU MHz: 2800.184
BogoMIPS: 5600.36
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32 KiB
L1i cache: 32 KiB
L2 cache: 1 MiB
L3 cache: 33 MiB
NUMA node0 CPU(s): 0,1
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; Clear CPU buffers; SMT Host state unknown
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
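As a rough illustration of the slot arithmetic behind this failure, here is a minimal Python sketch that reads the relevant lscpu fields and counts slots the way the Open MPI message describes: one per processor core by default, one per hardware thread with --use-hwthread-cpus. The function names parse_lscpu and default_slots are made up for this example, and the calculation is a simplification of the stated rule, not how Open MPI itself detects hardware.

```python
def parse_lscpu(text):
    """Return a dict of the 'Key: value' fields printed by lscpu."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons or semicolons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def default_slots(fields, use_hwthreads=False):
    """Approximate the slot count: cores by default, hardware threads if requested."""
    cores = int(fields["Socket(s)"]) * int(fields["Core(s) per socket"])
    if use_hwthreads:
        return cores * int(fields["Thread(s) per core"])
    return cores


# Subset of the lscpu output reported above for the Travis-CI worker.
SAMPLE = """\
CPU(s): 2
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
"""

fields = parse_lscpu(SAMPLE)
print("slots by core:", default_slots(fields))
print("slots by hardware thread:", default_slots(fields, use_hwthreads=True))
```

With the values above this gives one slot by core and two by hardware thread, which matches a request for 2 ranks being rejected unless hardware threads are counted.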
Error:
44/144 Test #44: test_binary_stream_mpi ...........................***Failed 0.02 sec
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:
test_binary_stream
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
I think the fix that makes sense is to change ${MPIEXEC} -n 2 to ${MPIEXEC} -n ${TEST_CORES}.
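To make the effect of that change concrete, here is a small Python sketch comparing the hard-coded launch line with one driven by a configurable rank count. It is only an illustration: mpi_test_command and fits_in_slots are hypothetical helpers, and the value 1 used for TEST_CORES is an assumption about how it would be set on the Travis-CI worker, not something stated in this thread.

```python
def mpi_test_command(mpiexec, ranks, test_exe):
    """Build the argument list for launching an MPI test with `ranks` processes."""
    return [mpiexec, "-n", str(ranks), test_exe]


def fits_in_slots(ranks, slots):
    """True when the requested ranks do not exceed the available slots."""
    return ranks <= slots


slots = 1        # one physical core on the worker, so one default slot
test_cores = 1   # assumed value of TEST_CORES for this environment

hard_coded = mpi_test_command("mpiexec", 2, "test_binary_stream")
configured = mpi_test_command("mpiexec", test_cores, "test_binary_stream")

print(" ".join(hard_coded), "-> fits:", fits_in_slots(2, slots))
print(" ".join(configured), "-> fits:", fits_in_slots(test_cores, slots))
```

Under these assumptions the hard-coded request for 2 ranks exceeds the single slot, while the configurable form stays within it.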
I've set up GitHub Actions on the superbuild repo on Ubuntu 20.04. This took about 4 hours of work. GitHub Actions seems to have limited capabilities compared to Travis CI. Travis CI would still be useful, especially for testing on macOS.
The goal is to have testing for the superbuild similar to what we do for TECA itself. This would be accomplished in a similar way, e.g. build a minimal Docker image, run the superbuild, and send the result to a CDash site. We also want runs on Apple macOS.
It is important because the superbuild continues to be a convenient way to install TECA on specialized systems such as Cori.