dccspeed / fractal

Apache License 2.0

Problem running sample batch on a remote machine #8

Closed rmurphy2718 closed 5 years ago

rmurphy2718 commented 5 years ago

Dear Fractal,

I am also having problems running your code remotely. In particular, I ran

steps=2 inputgraph=$FRACTAL_HOME/data/citeseer-single-label.graph alg=cliques ./bin/fractal.sh

on a remote machine with Spark version 2.4.3 (Scala version 2.11.12, OpenJDK 64-Bit Server VM, Java 1.8.0_222).

I am attaching the log

remotelog.log

Thank you!

viniciusvdias commented 5 years ago

Ok, I see two problems here.

See this command:

/opt/spark//bin/spark-submit --master local[1] \
  --deploy-mode client \
  --driver-memory 2g \
  --num-executors 1 \
  --executor-cores 1 \
  --executor-memory 2g \
  --class br.ufmg.cs.systems.fractal.FractalSparkRunner \
  --jars /homes/murph213/vinic_fractal/fractal/fractal-core/build/libs/fractal-core-SPARK-2.2.0.jar \
  /homes/murph213/vinic_fractal/fractal/fractal-apps/build/libs/fractal-apps-SPARK-2.2.0.jar \
  al /homes/murph213/vinic_fractal/fractal/data/citeseer-single-label.graph cliques scratch 1 2 info

  1. You are trying to execute remotely on Spark, but this execution is actually local because of the --master local[1] option. To fix this, add the Spark master URL to your submission. Something like:
spark_master=spark://sparkhost:7077 steps=2 inputgraph=$FRACTAL_HOME/data/citeseer-single-label.graph alg=cliques ./bin/fractal.sh 

If you are running on YARN, it should be spark_master=yarn, and so on. This link may help clarify the role of the master URL in Spark.

  2. Your execution is crashing because the input file cannot be found. This may be happening because your Spark configuration may have HDFS as the default file system. In this case, to make sure Spark reads the file from the local file system and not from HDFS, you can prepend file:// to your file path. Something like:
steps=2 inputgraph=file://$FRACTAL_HOME/data/citeseer-single-label.graph alg=cliques ./bin/fractal.sh 
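
The two fixes can be sketched as small shell helpers. This is only an illustration, not code from bin/fractal.sh: the function names resolve_master and to_local_uri are hypothetical, the local[1] fallback is inferred from the logged command above, and the example host and paths are placeholders.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they are not part of Fractal.

# Print the master URL to use: the given value, or local[1] when it is empty.
# (The local[1] default is an assumption based on the logged spark-submit command.)
resolve_master() {
    printf '%s\n' "${1:-local[1]}"
}

# Print the path with an explicit file:// scheme, unless it already
# carries a scheme of the form scheme://...
to_local_uri() {
    case "$1" in
        *://*) printf '%s\n' "$1" ;;                  # scheme already present
        /*)    printf 'file://%s\n' "$1" ;;           # absolute path
        *)     printf 'file://%s/%s\n' "$PWD" "$1" ;; # relative path
    esac
}

# Placeholder values, shown only to exercise the helpers.
resolve_master ""
resolve_master "spark://sparkhost:7077"
to_local_uri "/opt/fractal/data/citeseer-single-label.graph"
to_local_uri "hdfs://namenode/data/citeseer-single-label.graph"
```

With helpers like these, the combined invocation would pass both spark_master and a file:// input path, so the two issues are addressed in one submission.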

Again, these seem to be two unrelated issues.