atlarge-research / graphalytics-platforms-powergraph

Graphalytics implementation for PowerGraph
Apache License 2.0

run_benchmark.sh unable to link files correctly #3

Closed sampollard closed 7 years ago

sampollard commented 8 years ago

After packaging with mvn package -DskipTests, running ./run-benchmark.sh yields:

[ 50%] Linking CXX executable main
/home/users/spollard/graphalytics/PowerGraph/release/src/graphlab/libgraphlab.a(hdfs.cpp.o): In function `graphlab::hdfs::hdfs(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned short)':
/home/users/spollard/graphalytics/PowerGraph/src/graphlab/util/hdfs.hpp:110: undefined reference to `hdfsConnect'
/home/users/spollard/graphalytics/PowerGraph/release/src/graphlab/libgraphlab.a(hdfs.cpp.o): In function `graphlab::hdfs::~hdfs()':
/home/users/spollard/graphalytics/PowerGraph/src/graphlab/util/hdfs.hpp:115: undefined reference to `hdfsDisconnect'
collect2: error: ld returned 1 exit status
CMakeFiles/main.dir/build.make:96: recipe for target 'main' failed
make[2]: *** [main] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/main.dir/all' failed
make[1]: *** [CMakeFiles/main.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

This can be (temporarily) patched by editing bin/standard/CMakeFiles/main.dir/link.txt to include the -lhdfs flag. However, running the benchmark again yields:

$ ./run-benchmark.sh 
grep: /disks/large/home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/config//granula.properties: No such file or directory
-- Configuring done
-- Generating done
-- Build files have been written to: /home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/bin/standard
[ 50%] Linking CXX executable main
/home/users/spollard/graphalytics/PowerGraph/deps/local/lib/libhdfs.a(hdfsJniHelper.o): In function `getJNIEnv':
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:404: undefined reference to `JNI_GetCreatedJavaVMs'
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:458: undefined reference to `JNI_CreateJavaVM'
collect2: error: ld returned 1 exit status
CMakeFiles/main.dir/build.make:96: recipe for target 'main' failed
make[2]: *** [main] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/main.dir/all' failed
make[1]: *** [CMakeFiles/main.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
spollard@arya:~/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1🐺 vim bin/standard/CMakeFiles/main.dir/link.txt 
spollard@arya:~/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1🐺 ./run-benchmark.sh 
grep: /disks/large/home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/config//granula.properties: No such file or directory
-- Configuring done
-- Generating done
-- Build files have been written to: /home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/bin/standard
[ 50%] Linking CXX executable main
/home/users/spollard/graphalytics/PowerGraph/deps/local/lib/libhdfs.a(hdfsJniHelper.o): In function `getJNIEnv':
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:404: undefined reference to `JNI_GetCreatedJavaVMs'
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:458: undefined reference to `JNI_CreateJavaVM'
collect2: error: ld returned 1 exit status
CMakeFiles/main.dir/build.make:96: recipe for target 'main' failed
make[2]: *** [main] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/main.dir/all' failed
make[1]: *** [CMakeFiles/main.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Note that this is with my LD_LIBRARY_PATH set to /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/
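
For anyone who wants to script the temporary link.txt patch described above instead of editing the file by hand, here is a minimal sketch. The function name append_link_flag is made up for illustration, and it assumes link.txt holds the whole link command on its first line. CMake may regenerate that file, so this remains a stopgap rather than a fix.

```shell
# append_link_flag FILE FLAG
# Hypothetical helper: appends FLAG to the end of the first line of FILE
# (assumed to be the single link command in CMake's link.txt) unless the
# flag text already appears somewhere in the file.
append_link_flag() {
    file="$1"
    flag="$2"
    # -F: fixed string, --: stop option parsing because FLAG starts with '-'
    if grep -qF -- "$flag" "$file"; then
        return 0
    fi
    # Appending at the end keeps the flag after libgraphlab.a, which matters
    # for static link order. A temp file is used instead of sed -i for portability.
    sed "1s|\$| $flag|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Usage would look like: append_link_flag bin/standard/CMakeFiles/main.dir/link.txt -lhdfs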

thegeman commented 8 years ago

Looking through the source code of the Graphalytics/PowerGraph implementation, HDFS is not used or supported as a data source when running Graphalytics. Instead, the graph data is read directly from the directory given by graphs.root-directory on the machine running the Graphalytics benchmark process. PowerGraph then distributes the data internally.
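
To double-check which local directory the benchmark will read from, something like the sketch below could be used. The helper name get_property is hypothetical, and which properties file actually holds graphs.root-directory is an assumption here, so the file is passed in explicitly.

```shell
# get_property KEY FILE...
# Hypothetical helper: prints the value of the last "KEY = value" line found
# in the given Java-style properties files. Lines starting with '#' do not match.
get_property() {
    # Escape dots so that "graphs.root-directory" is matched literally
    key_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    shift
    sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$@" | tail -n 1
}
```

For example, get_property graphs.root-directory config/benchmark.properties (file name assumed) should print the directory, which can then be checked with ls on the machine running the benchmark.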

To fix the compilation issues, I think PowerGraph needs to be compiled with --no_jvm.

@stijnh As the main author of this implementation, can you confirm the above? If so, I will update the README to reflect this.

sampollard commented 8 years ago

Got it. I compiled PowerGraph with the following commands:

git clone https://github.com/sampollard/PowerGraph
cd PowerGraph
./configure --no_jvm
cd release/toolkits
make

After this works,

cd graphalytics-platforms-powergraph
mvn package
tar -xf graphalytics-0.3-powergraph-0.1
cd graphalytics-0.3-powergraph-0.1
# Assuming you have already configured graphalytics
cp -r /path/to/ldbc_graphalytics/config/ config
cp config-template/powergraph.properties config/
# Configure powergraph.properties as per the template
./run-benchmark.sh

I think benchmark.properties also needs to be in the same directory as powergraph.properties.
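
A quick way to confirm both files are in place before running the benchmark, as a rough sketch (check_config_dir is a made-up name, and only the two files mentioned above are checked):

```shell
# check_config_dir DIR
# Hypothetical helper: returns 0 if DIR contains both benchmark.properties
# and powergraph.properties, otherwise reports each missing file and returns 1.
check_config_dir() {
    dir="$1"
    missing=0
    for f in benchmark.properties powergraph.properties; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_config_dir config from inside graphalytics-0.3-powergraph-0.1 before ./run-benchmark.sh would flag a missing file early.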