soundcloud / spark-pagerank

PageRank in Spark
https://soundcloud.github.io/spark-pagerank
MIT License

Working example/generic usecase #46

Open MaximusPrimus opened 5 years ago

MaximusPrimus commented 5 years ago

This looks great. Could you provide a few lines of code as a working example, so it becomes easier to use for first-timers?

Thanks for the help!

joshdevins commented 5 years ago

What exactly are you trying to do? There are "apps" that have example code in them or you can use those drivers directly on your data. E.g. https://github.com/soundcloud/spark-pagerank/blob/master/src/main/scala/com.soundcloud.spark.pagerank/PageRankApp.scala#L68-L91

Do you want to build a graph or just run PageRank on an existing graph? Check out the drivers.

joshdevins commented 5 years ago

See: https://github.com/soundcloud/spark-pagerank#usage

Feel free to do a PR to improve the documentation if you think there is something that can be clearer. I prefer not putting code in the README as it can drift from the implementation since it's not under test.

MaximusPrimus commented 5 years ago

Thank you for your reply. I went through the usage section but still had doubts, so I opened this issue. I have a TSV file with (src, dst, weight) rows and have downloaded the .jar file. My aim is to build a graph and run weighted PageRank on it. I followed the documentation and went through some of the code, but this call gives me an error:

GraphBuilderApp.run(Array("--input=/temp/followers1.txt", "--output=/tmp"), spark)

error: org.kohsuke.args4j.CmdLineException: "--input /temp/followers1.txt" is not a valid option
  at org.kohsuke.args4j.CmdLineParser.parseArgument(CmdLineParser.java:419)
  at com.soundcloud.spark.pagerank.GraphBuilderApp$.run(GraphBuilderApp.scala:30)
  ... 49 elided

Just wanted to see an example which could help me understand how to run the apps on my data.

Many thanks!

joshdevins commented 5 years ago

You need to submit a Spark job with the GraphBuilderApp as the driver/application. See: https://spark.apache.org/docs/latest/submitting-applications.html
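For first-timers, submitting GraphBuilderApp as the driver might look roughly like the sketch below. It is not taken from the project's documentation. The jar path `spark-pagerank.jar` and the `local[*]` master are placeholders. The `--input` and `--output` option names and paths are copied from the attempt earlier in this thread, and whether the app expects `--input value` or `--input=value` is not confirmed here, so check the usage section of the README before relying on it. The script only prints the command unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch only: jar path, master, and the app's option syntax are assumptions.
set -eu

APP_JAR="${APP_JAR:-spark-pagerank.jar}"   # hypothetical path to the downloaded jar
INPUT="${INPUT:-/temp/followers1.txt}"     # TSV of (src, dst, weight), path from the question
OUTPUT="${OUTPUT:-/tmp}"                   # output location, path from the question

# Assemble the spark-submit invocation as positional parameters.
# Everything after the jar is passed to the application itself.
set -- spark-submit \
  --class com.soundcloud.spark.pagerank.GraphBuilderApp \
  --master "local[*]" \
  "$APP_JAR" \
  --input "$INPUT" \
  --output "$OUTPUT"

if [ "${RUN:-0}" = "1" ]; then
  "$@"          # actually submit the job (requires spark-submit on PATH)
else
  echo "$*"     # dry run: print the command that would be submitted
fi
```

Running the script as-is prints the assembled command, which makes it easy to inspect before submitting. Set `RUN=1` and point `APP_JAR`, `INPUT`, and `OUTPUT` at real locations to submit the job.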