Open gh0stwheel opened 4 years ago
Hey @andrewpdickson !
I will crosscheck with @MeBrei once she's back, but I suspect you are spot on. Even if we consider the example graph in the paper:
Supposing we run the TrustRank phase and we realise that P1
and A1
need to be pruned, we will now end up in a situation like this:
This is clearly incorrect, because now the outgoing edges from P3 do not sum all to 1.
I suspect that doing this properly with our current graph implementation might not be trivial, but I will start thinking about this in preparation for Merle to be back 😉
Thanks for catching this!
For sure @adinapoli-mndc! :-)
Currently edge probabilities are "normalized" (made to sum to 1) as a pre-processing stage to running Osrank. However if a trusted seed set is used, then the final graph that Osrank runs on will be a subgraph of the original graph.
In that case we need to add a new step to our Osrank implementation where we re-normalize the edge probabilities in the graph after applying the Trustrank-style filter and before simulating the graph traversals.