Edge Probabilities Must be Re-Normalized in Graph for Subgraph Based on Trusted Seeds

gh0stwheel commented 4 years ago

Currently edge probabilities are "normalized" (made to sum to 1) as a pre-processing stage to running Osrank. However if a trusted seed set is used, then the final graph that Osrank runs on will be a subgraph of the original graph.

In that case we need to add a new step to our Osrank implementation where we re-normalize the edge probabilities in the graph after applying the Trustrank-style filter and before simulating the graph traversals.

adinapoli-mndc commented 4 years ago

Hey @andrewpdickson !

I will crosscheck with @MeBrei once she's back, but I suspect you are spot on. Even if we consider the example graph in the paper:

Supposing we run the TrustRank phase and we realise that P1 and A1 need to be pruned, we will now end up in a situation like this:

This is clearly incorrect, because now the outgoing edges from P3 do not sum all to 1.

I suspect that doing this properly with our current graph implementation might not be trivial, but I will start thinking about this in preparation for Merle to be back 😉

Thanks for catching this!

gh0stwheel commented 4 years ago

For sure @adinapoli-mndc! :-)

oscoin / osrank-rs

Edge Probabilities Must be Re-Normalized in Graph for Subgraph Based on Trusted Seeds #83