Open crazyyanchao opened 5 years ago
A PageRank value of 0.15000000000000002 is the default value for nodes with no incoming relationships. It seems that no relationships get projected into the graph, which is weird given that you set NULL for the relationship type, which should load all of them.
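To illustrate why exactly 0.15000000000000002 shows up, here is a minimal Python sketch of the textbook unnormalized PageRank update, PR(n) = (1 - d) + d * sum(PR(m) / outdeg(m)). It assumes the library uses this formulation; the `pagerank` function and the toy node names are hypothetical and are not the library's code. With no relationships loaded, every node keeps the base score 1 - 0.85, which in floating point is 0.15000000000000002.

```python
def pagerank(nodes, edges, damping=0.85, iterations=20):
    """Unnormalized PageRank sketch (assumed formulation, not the library's code)."""
    out_degree = {n: 0 for n in nodes}
    incoming = {n: [] for n in nodes}
    for src, dst in edges:
        out_degree[src] += 1
        incoming[dst].append(src)

    base = 1 - damping  # score of a node that receives nothing from neighbours
    scores = {n: base for n in nodes}
    for _ in range(iterations):
        scores = {
            n: base + damping * sum(scores[m] / out_degree[m] for m in incoming[n])
            for n in nodes
        }
    return scores


nodes = ["a", "b", "c"]  # hypothetical toy graph

# No relationships projected: every node stays at the base score.
no_edges = pagerank(nodes, [])

# Two relationships pointing at "b": only "b" rises above the base score.
with_edges = pagerank(nodes, [("a", "b"), ("c", "b")])

print(no_edges)
print(with_edges)
```

So if every node comes back with the same base score, the first thing to check is whether any relationships were loaded at all.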
Hi @crazyyanchao,
Would you be able to share a small sample dataset that we can recreate this problem with? As @tomasonjo says, it's weird that all the nodes have the initial PageRank value.
@mneedham @tomasonjo If I run:
CALL algo.pageRank.stream(NULL,NULL,{iterations:20, dampingFactor:0.85}) YIELD node, score RETURN node.name, score ORDER BY score DESC
the nodes with the label '专题' get values that look reasonable. Sorry, I may not be able to share the dataset. Thanks for your reply!
@mneedham @tomasonjo @jexp @akollegger I executed two Cypher queries on the same LinkedIn dataset, but the results vary enormously!
1. The first way
CALL algo.pageRank('LinkedinID', NULL, {iterations:20, dampingFactor:0.85, write: true,writeProperty:'pagerank'}) YIELD nodes, iterations, loadMillis, computeMillis, writeMillis, dampingFactor, write, writeProperty
MATCH (n:LinkedinID) RETURN n.name,n.pagerank ORDER BY n.pagerank DESC LIMIT 10
n.name | n.pagerank |
---|---|
"Dr. Imani Ma'at_29489954" | 238797044.98089278 |
"Kristina Tanasichuk_21342877" | 205712106.4265581 |
"Andy Jabbour_408109800" | 175523863.48403177 |
"Kim Proctor_2794998" | 170649994.17900914 |
"Michael Jacobs_3967109" | 142688564.25065896 |
"Adele Canetti11160947" | 105116298.79254237 |
"Marcia Stepanek_14481523n" | 90105381.10887711 |
"Christy Riccardi_11084249" | 78076928.37071984 |
"Gregg H._3628386" | 78046192.97161181 |
"Hollis Thomases_245341" | 75175480.38489856 |
"Jeff Molter_1411602" | 73882728.68542062 |
"Terezie Mosby_119305546" | 73044631.96094015 |
"Troy Stiner_91210468" | 71168889.09655812 |
"John Robitscher, MPH_8334935" | 70542084.97194709 |
2. The second way
CALL apoc.algo.pageRankWithCypher({iterations:20, write:true})
MATCH (n:LinkedinID) RETURN n.name,n.pagerank ORDER BY n.pagerank DESC LIMIT 10
n.name | n.pagerank |
---|---|
"Bill Gates_0" | 118.07033 |
"Richard Branson_0" | 101.64432 |
"Pete Brownell_18332101" | 77.71179 |
"Chuck Brooks_4888851" | 74.96686 |
"Dr. Nicholas R. Scheidt, PsyD, AADP_26394892" | 72.11293 |
"Mark Cuban_0" | 67.6066 |
"Frank T. Mitchell_14176906" | 67.3042 |
"Arianna Huffington_0" | 66.41209 |
"Jack Welch_0" | 63.05521 |
"Tarek Sobh_1564329" | 62.50482 |
Finally
I think the second way is more reasonable! But why did that happen in the first way? I don't understand! Can you explain it? Thanks :)
Can you share this linkedin dataset?
I have installed two extension packages: apoc-3.4.0.1-all.jar and graph-algorithms-algo-3.4.7.0.jar.
Running PageRank on the same dataset gives hugely different results. (Database version: neo4j-community-3.4.7)
2.1. apoc-3.4.0.1-all.jar
2.2. Procedures present in graph-algorithms-algo-3.4.7.0.jar (all scores are 0.15000000000000002)
The results were quite different! Please tell me why. Thanks!