ECP-VeloC / VELOC

Very-Low Overhead Checkpointing System
http://veloc.rtfd.io
MIT License

Check in changes needed to build using the IBM BB API #12

Closed. tonyhutter closed this 5 years ago

tonyhutter commented 5 years ago

Checking in fixes needed to use AXL with the BB API.

tonyhutter commented 5 years ago

I added in the changes because I was getting this:

$ make 
Scanning dependencies of target veloc-modules
[  5%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/module_manager.cpp.o
[ 10%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/client_watchdog.cpp.o
[ 15%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/transfer_module.cpp.o
[ 20%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/client_aggregator.cpp.o
[ 25%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/ec_module.cpp.o
[ 30%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/__/common/config.cpp.o
[ 35%] Linking CXX shared library libveloc-modules.so
[ 35%] Built target veloc-modules
Scanning dependencies of target veloc-backend
[ 40%] Building CXX object src/backend/CMakeFiles/veloc-backend.dir/main.cpp.o
[ 45%] Building CXX object src/backend/CMakeFiles/veloc-backend.dir/__/common/config.cpp.o
[ 50%] Linking CXX executable veloc-backend
../modules/libveloc-modules.so: undefined reference to `BB_InitLibrary'
../modules/libveloc-modules.so: undefined reference to `BB_GetTransferInfo'
../modules/libveloc-modules.so: undefined reference to `BB_AddFiles'
../modules/libveloc-modules.so: undefined reference to `BB_GetTransferHandle'
../modules/libveloc-modules.so: undefined reference to `BB_CancelTransfer'
../modules/libveloc-modules.so: undefined reference to `BB_CreateTransferDef'
../modules/libveloc-modules.so: undefined reference to `BB_GetLastErrorDetails'
../modules/libveloc-modules.so: undefined reference to `BB_StartTransfer'
../modules/libveloc-modules.so: undefined reference to `BB_TerminateLibrary'
../modules/libveloc-modules.so: undefined reference to `BB_FreeTransferDef'
collect2: error: ld returned 1 exit status
make[2]: *** [src/backend/CMakeFiles/veloc-backend.dir/build.make:130: src/backend/veloc-backend] Error 1
make[1]: *** [CMakeFiles/Makefile2:165: src/backend/CMakeFiles/veloc-backend.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

I'll need to dig a little more to see why VeloC would need these dependencies directly.

bnicolae commented 5 years ago

That's because VeloC is trying to link with the static version of AXL. I've updated it to prefer the dynamic version. Please link AXL directly against any libraries it needs.
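
The difference matters because a static libaxl.a carries no dependency information, so anything linking against it must also name the BB library itself, whereas a shared libaxl.so records its own dependencies as NEEDED entries. A minimal sketch for listing those entries follows; the helper name needed_libs is hypothetical, and it assumes the usual output format of readelf -d from binutils.

```shell
# Hypothetical helper: read `readelf -d` output on stdin and print only the
# library names from the (NEEDED) entries, one per line.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Example use (path taken from the logs in this thread; adjust as needed):
#   readelf -d install/lib64/libaxl.so | needed_libs
```

If libbbAPI.so shows up in that list, AXL has been linked against the BB API directly and VeloC should not need to name it on its own link line.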

tonyhutter commented 5 years ago

https://github.com/ECP-VeloC/VELOC/commit/8c2bd4ca692288ee6775abe298a2e78b8ea99f5a doesn't seem to fix it for me.

bash-4.2$ make
[ 35%] Built target veloc-modules
[ 40%] Linking CXX executable veloc-backend
/usr/tce/packages/gcc/gcc-7.3.1/rh/usr/bin/../libexec/gcc/ppc64le-redhat-linux/7/ld: warning: libbbAPI.so, needed by ../../install/lib64/libaxl.so, not found (try using -rpath or -rpath-link)
../../install/lib64/libaxl.so: undefined reference to `BB_InitLibrary'
../../install/lib64/libaxl.so: undefined reference to `BB_GetTransferInfo'
../../install/lib64/libaxl.so: undefined reference to `BB_AddFiles'
../../install/lib64/libaxl.so: undefined reference to `BB_GetTransferHandle'
../../install/lib64/libaxl.so: undefined reference to `BB_CancelTransfer'
../../install/lib64/libaxl.so: undefined reference to `BB_CreateTransferDef'
../../install/lib64/libaxl.so: undefined reference to `BB_GetLastErrorDetails'
../../install/lib64/libaxl.so: undefined reference to `BB_StartTransfer'
../../install/lib64/libaxl.so: undefined reference to `BB_TerminateLibrary'
../../install/lib64/libaxl.so: undefined reference to `BB_FreeTransferDef'
collect2: error: ld returned 1 exit status
make[2]: *** [src/backend/CMakeFiles/veloc-backend.dir/build.make:130: src/backend/veloc-backend] Error 1
make[1]: *** [CMakeFiles/Makefile2:165: src/backend/CMakeFiles/veloc-backend.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

bash-4.2$ git show
commit 8c2bd4ca692288ee6775abe298a2e78b8ea99f5a
Author: Bogdan Nicolae <bogdan.nicolae@acm.org>
Date:   Fri Mar 22 17:30:47 2019 -0500

    fixed AXL and ER dependencies

bnicolae commented 5 years ago

You are running make directly. Don't do that. Always run "auto-install.py --no-deps --no-boost". Running make directly works only when the source code has changed but the CMake configuration has not (and in this case it has).
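
One reason a plain make can keep using the old setup is that CMake stores the result of a library search in CMakeCache.txt and reuses it on later runs. A rough sketch for checking what the cache currently records for AXL follows; the helper name show_cached_axl is hypothetical, and matching on "axl" is an assumption, since the exact variable names depend on the project's find module.

```shell
# Hypothetical helper: print the CMake cache entries that mention AXL.
# The first argument is the cache file (defaults to ./CMakeCache.txt).
show_cached_axl() {
    cache="${1:-CMakeCache.txt}"
    if [ ! -f "$cache" ]; then
        echo "no cache at $cache" >&2
        return 1
    fi
    # Case-insensitive match; this is a heuristic, not a fixed variable name.
    grep -i 'axl' "$cache"
}
```

If the printed path still points at a static archive after the CMake files have changed, removing the cache (or the whole build tree) before reconfiguring forces a fresh search.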

tonyhutter commented 5 years ago

Right, I should have mentioned that I deleted CMakeCache.txt before running my make. Were you able to get it to build? I still get the same error:

[hutter2@lassen709:VELOC]$   cmake -DCMAKE_BUILD_TYPE=Debug -DWITH_AXL_PREFIX=`pwd`/install -DWITH_ER_PREFIX=`pwd`/install -DCMAKE_INSTALL_PREFIX=`pwd`/install .
-- The C compiler identification is GNU 4.9.3
-- The CXX compiler identification is GNU 4.9.3
-- Check for working C compiler: /usr/tcetmp/bin/cc
-- Check for working C compiler: /usr/tcetmp/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/tcetmp/bin/c++
-- Check for working CXX compiler: /usr/tcetmp/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Boost version: 1.53.0
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found MPI_C: /usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib/libmpiprofilesupport.so;/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib/libmpi_ibm.so  
-- Found MPI_CXX: /usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib/libmpiprofilesupport.so;/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib/libmpi_ibm.so  
-- Found AXL: /g/g0/hutter2/VELOC/install/lib64/libaxl.so  
-- Found ER: /g/g0/hutter2/VELOC/install/lib64/liber.so;/g/g0/hutter2/VELOC/install/lib64/librankstr.so  
-- Configuring done
-- Generating done
-- Build files have been written to: /g/g0/hutter2/VELOC

[hutter2@lassen709:VELOC]$ make
Scanning dependencies of target veloc-modules
[  4%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/module_manager.cpp.o
[  9%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/client_watchdog.cpp.o
[ 13%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/transfer_module.cpp.o
[ 18%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/client_aggregator.cpp.o
[ 22%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/ec_module.cpp.o
[ 27%] Building CXX object src/modules/CMakeFiles/veloc-modules.dir/__/common/config.cpp.o
[ 31%] Linking CXX shared library libveloc-modules.so
[ 31%] Built target veloc-modules
Scanning dependencies of target veloc-backend
[ 36%] Building CXX object src/backend/CMakeFiles/veloc-backend.dir/main.cpp.o
[ 40%] Building CXX object src/backend/CMakeFiles/veloc-backend.dir/__/common/config.cpp.o
[ 45%] Linking CXX executable veloc-backend
/usr/bin/ld: warning: libbbAPI.so, needed by ../../install/lib64/libaxl.so, not found (try using -rpath or -rpath-link)
../../install/lib64/libaxl.so: undefined reference to `BB_InitLibrary'
../../install/lib64/libaxl.so: undefined reference to `BB_GetTransferInfo'
../../install/lib64/libaxl.so: undefined reference to `BB_AddFiles'
../../install/lib64/libaxl.so: undefined reference to `BB_GetTransferHandle'
../../install/lib64/libaxl.so: undefined reference to `BB_CancelTransfer'
../../install/lib64/libaxl.so: undefined reference to `BB_CreateTransferDef'
../../install/lib64/libaxl.so: undefined reference to `BB_GetLastErrorDetails'
../../install/lib64/libaxl.so: undefined reference to `BB_StartTransfer'
../../install/lib64/libaxl.so: undefined reference to `BB_TerminateLibrary'
../../install/lib64/libaxl.so: undefined reference to `BB_FreeTransferDef'
collect2: error: ld returned 1 exit status

[hutter2@lassen709:VELOC]$ git show
commit 55da483f60c6fa60481fbb8679dfa98d6261253b
Author: Bogdan Nicolae <bogdan.nicolae@gmail.com>
Date:   Sat Mar 23 15:58:41 2019 -0500

    Update .travis.yml

    Install OpenMPI binaries for Travis testing

diff --git a/.travis.yml b/.travis.yml
index 782ed3c..ab50600 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -2,7 +2,7 @@ language: c++

 before_install:
 - sudo apt-get update
-- sudo apt-get install -y libopenmpi-dev zlib1g-dev
+- sudo apt-get install -y libopenmpi-dev openmpi-bin zlib1g-dev

 script:
 - rm -rf $HOME/deploy

bnicolae commented 5 years ago

Tony, we are not testing VeloC with your latest AXL development code. You are supposed to do that. Once you are confident it is working, we will release a new AXL version and will have VeloC link against that AXL version. Before running make, set LD_LIBRARY_PATH to point to the location where your IBM BB libraries are installed.

tonyhutter commented 5 years ago

Apologies, I forgot to set LD_LIBRARY_PATH to include /opt/ibm/bb/lib. It's building now - sorry to bug you!
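
The fix amounts to making the BB library directory visible before the link step. GNU ld's documentation lists LD_LIBRARY_PATH among the places a native linker searches for libraries needed by other shared libraries, which is why the "libbbAPI.so, needed by ... libaxl.so, not found" warning goes away. A minimal sketch, with a hypothetical helper name and the directory taken from the comment above:

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH without
# leaving a stray colon when the variable is empty or unset.
prepend_ld_library_path() {
    dir="$1"
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="$dir:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$dir"
    fi
    export LD_LIBRARY_PATH
}

# /opt/ibm/bb/lib is the location mentioned in this thread; other systems
# may install the BB API elsewhere.
prepend_ld_library_path /opt/ibm/bb/lib
```

After this, rerunning the configure and build steps in the same shell should let the linker resolve the BB_* symbols through libaxl.so.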

bnicolae commented 5 years ago

No worries, happy to help.