dccspeed / fractal

Apache License 2.0

Readme and Walkthrough #2

Closed rmurphy2718 closed 5 years ago

rmurphy2718 commented 5 years ago

Hi fractal,

Thank you for your contribution to the graph mining community. However, it may be safe to say that not everybody in the community is familiar with Java, Spark, etc. I am one of them. I have read your readme, but I am not sure how to use your application.

In step 3, you have code to build an Object. Where should I put that code? Is it a standalone file that I compile? What do I use to compile it?

Regarding step 4, I simply don't know what that means.

Would it be possible to add to the documentation/readme/tutorial to help a broader audience?

Thanks so much. Again, I will be indebted to your contribution once I know how to use it :)

Best, -Ryan Murphy

viniciusvdias commented 5 years ago

Hey Ryan, of course. I am working on better documentation, as you suggested. It will be available soon. Best, Vinicius Dias

rmurphy2718 commented 5 years ago

Thank you!

rmurphy2718 commented 5 years ago

What import statements do I use for FractalContext, FractalGraph, vfractoid, etc.?

viniciusvdias commented 5 years ago

Please check the README in the pull request for this issue (#4). The working branch has partial information that may help you.

btw, the import statement is:
import br.ufmg.cs.systems.fractal.FractalContext

Others you may need:
import br.ufmg.cs.systems.fractal.FractalGraph
import br.ufmg.cs.systems.fractal.Fractoid
import br.ufmg.cs.systems.fractal.pattern.Pattern
import br.ufmg.cs.systems.fractal.util.Logging

rmurphy2718 commented 5 years ago

Oh, great, I will take a look now and see if I have any questions

rmurphy2718 commented 5 years ago

One thing I noticed is that citeseer is not in your data directory, so the built-in application example does not work. Or am I missing something?

rmurphy2718 commented 5 years ago

I tried to run your batch example using cube.graph and ran into a problem. Let me know if I should open this as a different issue, by the way. I'm not sure what you would prefer and it's fine for me to open another issue.

I guess the jar does not exist in the fractal build? Thanks.

steps=2 inputgraph=$FRACTAL_HOME/data/cube.graph alg=cliques ./bin/fractal.sh

I got this output

FRACTAL_HOME is set to /home/murph213/Downloads/Installers/fractal
SPARK_HOME is set to /usr/local/spark
alg is set to 'cliques'
inputgraph is set to '/home/murph213/Downloads/Installers/fractal/data/cube.graph'
steps is set to '2'
spark-submit --master local[1] --deploy-mode client \
   --driver-memory 2g \
   --num-executors 1 \
   --executor-cores 1 \
   --executor-memory 2g \
   --class br.ufmg.cs.systems.fractal.FractalSparkRunner \
   --jars /home/murph213/Downloads/Installers/fractal/build/libs/fractal-SPARK-2.2.0-all.jar \
   /home/murph213/Downloads/Installers/fractal/build/libs/fractal-SPARK-2.2.0-all.jar \
   al /home/murph213/Downloads/Installers/fractal/data/cube.graph cliques scratch 1 2 info
19/09/13 12:53:06 WARN Utils: Your hostname, RM-Satellite resolves to a loopback address: 127.0.1.1; using 192.168.2.3 instead (on interface eth0)
19/09/13 12:53:06 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19/09/13 12:53:06 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/09/13 12:53:07 WARN DependencyUtils: Local jar /home/murph213/Downloads/Installers/fractal/build/libs/fractal-SPARK-2.2.0-all.jar does not exist, skipping.
19/09/13 12:53:07 WARN DependencyUtils: Local jar /home/murph213/Downloads/Installers/fractal/build/libs/fractal-SPARK-2.2.0-all.jar does not exist, skipping.
19/09/13 12:53:07 WARN SparkSubmit$$anon$2: Failed to load br.ufmg.cs.systems.fractal.FractalSparkRunner.
java.lang.ClassNotFoundException: br.ufmg.cs.systems.fractal.FractalSparkRunner
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:806)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
19/09/13 12:53:07 INFO ShutdownHookManager: Shutdown hook called
19/09/13 12:53:07 INFO ShutdownHookManager: Deleting directory /tmp/spark-b89a1b4f-b7a3-4ade-a289-7b78d2d92c22
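
The two DependencyUtils warnings in this output report that the jar under build/libs does not exist, which would explain why the runner class cannot be loaded afterwards. Below is a minimal sketch of a pre-flight check that could be run before the launch script. It assumes the jar file name shown in the log (other Spark versions may produce a different name), and `jar_path` and `check_fractal_jar` are hypothetical helper names, not part of Fractal.

```shell
#!/usr/bin/env bash
# Sketch only: confirm the assembled jar exists before invoking spark-submit.
# The jar file name is copied from the log output; it is an assumption that
# your build produces the same name.

# Print the expected jar location for a given Fractal checkout directory.
jar_path() {
  printf '%s/build/libs/fractal-SPARK-2.2.0-all.jar\n' "$1"
}

# Return 0 and report the path if the jar exists; otherwise warn on stderr
# and return 1.
check_fractal_jar() {
  local jar
  jar="$(jar_path "$1")"
  if [ -f "$jar" ]; then
    echo "found: $jar"
    return 0
  fi
  echo "missing: $jar (build the project before running bin/fractal.sh)" >&2
  return 1
}
```

One possible use is to chain it in front of the original command, so the job only starts when the jar is present: `check_fractal_jar "$FRACTAL_HOME" && steps=2 inputgraph=$FRACTAL_HOME/data/cube.graph alg=cliques ./bin/fractal.sh`.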

viniciusvdias commented 5 years ago

OK, let's move this to another issue, please.

viniciusvdias commented 5 years ago

About the citeseer dataset: it is missing from the master branch. Check the working pull request #4. The file is here: https://github.com/dccspeed/fractal/tree/vd_doc_project_structure/data

rmurphy2718 commented 5 years ago

Sounds great, I'll move that into another issue and then take a look at the working pull request!

rmurphy2718 commented 5 years ago

In your MyMotifsApp, I think you should have object MyMotifsApp extends Logging { instead of object MyFractalApp extends Logging {

Am I right?

If nothing else, the directory structure already includes a MyFractalApp, which breaks the build.

viniciusvdias commented 5 years ago

Yes, sure. My intent is for MyFractalApp to be a general template for applications. You can add your app file anywhere and pass the correct class name to the script, and that should work.

rmurphy2718 commented 5 years ago

OK, got it, thank you.

I think the usage line
app=fsm|motifs|cliques|cliquesopt|gquerying|gqueryingnaive|kws
should start with alg:
alg=fsm|motifs|cliques|cliquesopt|gquerying|gqueryingnaive|kws

Is that right?

viniciusvdias commented 5 years ago

Yep, my bad.
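
Since the variable name is confirmed to be `alg`, a small guard can catch a mistyped value before a job is submitted. The following is only a sketch: `validate_alg` is a hypothetical helper, not part of Fractal, and the accepted names are taken from the usage string quoted in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: reject unknown values of `alg` before launching a job.
# The list of names mirrors the usage string
# fsm|motifs|cliques|cliquesopt|gquerying|gqueryingnaive|kws.
validate_alg() {
  case "$1" in
    fsm|motifs|cliques|cliquesopt|gquerying|gqueryingnaive|kws)
      return 0
      ;;
    *)
      echo "unknown alg '$1'; expected fsm|motifs|cliques|cliquesopt|gquerying|gqueryingnaive|kws" >&2
      return 1
      ;;
  esac
}
```

For example, `validate_alg "$alg" && ./bin/fractal.sh` would run the script only when `alg` holds one of the listed names, assuming `alg` is exported or otherwise visible to the script.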