Closed by kmu-leeky 7 years ago
Environment: bd-2, container-id: matrix, Spark 2.0.2, Hadoop 2.6.0
Source: https://github.com/PasaLab/marlin/tree/matrix-analysis-spark2.0
http://spark.apache.org/docs/2.0.2/building-spark.html#buildmvn
$ git clone https://github.com/PasaLab/marlin
$ cd marlin
$ git checkout matrix-analysis-spark2.0
$ cd spark-2.0.2-src/build/
$ chmod 755 mvn && cd ..
# Apache Hadoop 2.6.X
$ ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package
[error] /root/marlin/spark-2.0.2-src/graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala:24: object lib is not a member of package org.apache.spark.graphx
In the existing spark-shell, import org.apache.spark.graphx.lib._ works without any problem.
$ git clone https://github.com/apache/spark
$ cd spark
$ git checkout branch-2.0
$ cd graphx/src/main/scala/org/apache/spark/graphx
$ cp -r lib ~/marlin/spark-2.0.2-src/graphx/src/main/scala/org/apache/spark/graphx
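The copy above can be guarded so that it only runs when the directory is actually missing from the Marlin tree. This is a minimal sketch, assuming the two checkouts are laid out as in the commands above; the function name `copy_graphx_lib` and its two arguments are hypothetical and not part of either repository.

```shell
#!/bin/sh
# Hypothetical helper: copy graphx/lib from an upstream Spark checkout
# into Marlin's Spark source tree, but only if Marlin's copy is missing.
#   $1 = upstream Spark checkout (e.g. ~/spark on branch-2.0)
#   $2 = Marlin's Spark source directory (e.g. ~/marlin/spark-2.0.2-src)
copy_graphx_lib() {
    rel=graphx/src/main/scala/org/apache/spark/graphx
    src="$1/$rel/lib"
    dst="$2/$rel"
    if [ ! -d "$src" ]; then
        echo "upstream lib directory not found: $src" >&2
        return 1
    fi
    if [ -d "$dst/lib" ]; then
        echo "lib already present in $dst, nothing to do"
        return 0
    fi
    # Create the target package directory if needed, then copy lib into it.
    mkdir -p "$dst" && cp -r "$src" "$dst/"
}
```

With the paths used in this thread, the call would look like `copy_graphx_lib ~/spark ~/marlin/spark-2.0.2-src`.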
spark-catalyst : Could not resolve dependencies for project org.apache.spark:spark-catalyst_2.11
$ ./dev/change-scala-version.sh 2.11
$ ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests -Dscala-2.11 clean package
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:testCompile (scala-test-compile-first) on project spark-sql_2.11: Execution scala-test-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:testCompile failed. CompileFailed -> [Help 1]
[ERROR] /root/marlin/spark-2.0.2-src/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:352: object creation impossible, since: it has 2 unimplemented members.
http://spark.apache.org/docs/2.0.2/building-spark.html#speeding-up-compilation-with-zinc
$ ./build/zinc-0.3.9/bin/zinc -shutdown # build still failed (X)
$ ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests -Dscala-2.11 -DrecompileMode=all clean package
$ vi pom.xml
$ ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests -Dscala-2.11 -DrecompileMode=all -X -rf :spark-sql_2.11 clean package
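The `-rf :spark-sql_2.11` argument (short for `--resume-from`) restarts the Maven reactor at the module that failed, so the modules that already built are not rebuilt. Below is a minimal sketch of pulling that module name out of a `[ERROR] Failed to execute goal ... on project <module>:` line like the one shown earlier. The function name `failed_module` is hypothetical, and the pattern assumes Maven's usual wording for this message.

```shell
#!/bin/sh
# Hypothetical helper: read a Maven log on stdin and print the name of the
# first module reported in a "Failed to execute goal ... on project X:" line.
failed_module() {
    sed -n 's/^\[ERROR\] Failed to execute goal .* on project \([^: ]*\):.*/\1/p' | head -n 1
}
```

Assuming the build output was saved to a file (here a hypothetical `build.log`), the resume argument could then be formed as `-rf ":$(failed_module < build.log)"`.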
$ chmod 755 bin/*
$ ./bin/spark-shell
Ran https://github.com/kmu-bigdata/distributed-matrixcompletion/blob/master/spark_square_matrix_matmul.scala in this spark-shell to check the result.
Let's run MatrixMultiply.scala from the example folder and then close this.
Rather than measuring Marlin's performance on its own, a comparison with MatFast seems to be needed - http://ieeexplore.ieee.org/document/7930046/ . Closing.
https://github.com/PasaLab/marlin