Open richwhitjr opened 8 years ago
Sounds interesting.
I wonder if a distributed version of this: https://oncue.github.io/quiver/
could be interesting.
It is based on work on functional APIs for graphs in haskell: http://web.engr.oregonstate.edu/~erwig/papers/InductiveGraphs_JFP01.pdf
Way back when I used a simple label propagation using HLL to count second followers and see the people who had the number of top 2nd follower (really unique 1st + 2nd). At the time, it was Obama, even though he was something like #5 in the top most followed.
On Wed, Jun 29, 2016 at 1:00 PM, Richard Whitcomb notifications@github.com wrote:
Anyone interested in a Graph Library for Scalding? I have an internal project that for the most part is based on the api of GraphX in Spark( https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala). I was thinking of pulling the generic bits into scalding itself as a new project folder.
The general idea is to be able to work on vertices and edges and allow for really quick intersection of neighbors. I have been using it to do triangle counting and conductance measurement on the Twitter follow graph.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/twitter/scalding/issues/1577, or mute the thread https://github.com/notifications/unsubscribe/AAEJdpgVG-2wVwvp-TyQFc_SdY3oCRS8ks5qQvkCgaJpZM4JBpqX .
P. Oscar Boykin, Ph.D. | http://twitter.com/posco | http://pobox.com/~boykin
Yep my current thinking is to add an API at first to make these types of Graph walking jobs easier to reason about. Quiver looks interesting also but a very different API path then my current thinking.
Here is my current working example which can be generally though of as an abstracted version of GraphX.
https://gist.github.com/richwhitjr/ec27b4ee0405a48a5276016622371ce1
yeah, actually, I think making a distributed version of quiver is probably a whole separate project.
Would be cool to see yours based on spark.
Created a PR, still needs work(Undirected graphs are completely unsupported) but I wanted to open the idea up for discussion.
Anyone interested in a Graph Library for Scalding? I have an internal project that for the most part is based on the api of GraphX in Spark(https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala). I was thinking of pulling the generic bits into scalding itself as a new project folder.
The general idea is to be able to work on vertices and edges and allow for really quick intersection of neighbors. I have been using it to do triangle counting and conductance measurement on the Twitter follow graph.