Closed by jrwiebe 8 years ago
This is great. Baking PageRank as per these docs into warcbase would be perfect, as a way to extract an ordered list of relevant resources.
Probably worth merging with #183. The fork of @aliceranzhou's link-structure repo is at https://github.com/shamrt/link-structure. Are we getting close to being able to incorporate this into warcbase, whether as this repo or as docs?
Just pinging again. How close is this branch to being ready to incorporate into main? (It would be nice to include in the write-up of warcbase we're doing!)
I will take care of this shortly.
Once we obtain a graph representation of our site link structure within Spark/Warcbase, we will be able to further simplify operations that currently depend on other tools (e.g. Gephi, for PageRank).
https://spark.apache.org/docs/latest/graphx-programming-guide.html#pagerank
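For anyone following along, here is a rough sketch of what "an ordered list of relevant resources" could look like once the link structure is available as (source, target) pairs. It is plain-Python power iteration, not the GraphX implementation linked above and not warcbase code. The function names, the damping factor of 0.85, the fixed iteration count, and the example hostnames are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over (source, target) link pairs.

    Illustrative only: `damping` and `iterations` are assumed defaults,
    not values taken from warcbase or GraphX.
    """
    nodes = sorted({node for pair in links for node in pair})
    out_links = {node: [] for node in nodes}
    for src, dst in links:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            # Each page passes an equal share of its rank to every target.
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


def ranked_resources(links, **kwargs):
    """Return (node, score) pairs sorted from highest to lowest score."""
    scores = pagerank(links, **kwargs)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical site-level links, standing in for extracted link structure.
example_links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("c.example", "a.example"),
]

for node, score in ranked_resources(example_links):
    print(f"{node}\t{score:.4f}")
```

In Spark the same idea would run over a distributed graph rather than Python dicts, so treat this only as a way to sanity-check output on a small sample.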