rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library
https://docs.rapids.ai/api/cugraph/stable/
Apache License 2.0

[ENH] Personalized Pagerank performed with float as weight edges #1313

Closed barondra closed 3 years ago

barondra commented 3 years ago

Hello, I want to run Personalized PageRank (PPR) with float edge weights in cuGraph. Please add this feature. I used NetworkX for this, and the result is good and as expected. However, running PPR in cuGraph is far faster than on the CPU. My graph has about seven thousand edges, but since not all vertices are important, I want to decrease the weight on some links.

Thank you

afender commented 3 years ago

Hi @barondra,

Thank you for your feedback. The feature you need should already be there. I'd recommend trying to set personalization['values']. If you believe there is a bug, please file a repro.

Link to doc: https://docs.rapids.ai/api/cugraph/nightly/api.html?highlight=personalized#module-cugraph.link_analysis.pagerank

Relevant part:

personalization : cudf.Dataframe
     GPU Dataframe containing the personalization information.
     personalization['vertex'] : cudf.Series
          Subset of vertices of graph for personalization
     personalization['values'] : cudf.Series
          Personalization values for vertices

barondra commented 3 years ago

Dear @afender, thank you for your kind reply. With all respect, I think you are mistaking the personalization values of PPR for edge weights. When the personalization value is 1 for one node and 0 for all others, the result is known as Random Walk with Restart (RWR). Although RWR can be performed in cuGraph, the graph edges are still unweighted.
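To make the distinction concrete, here is the usual textbook formulation of personalized PageRank. The symbols are chosen for illustration and are not taken from the cuGraph documentation; dangling vertices are ignored for simplicity.

$$
r = \alpha P^{\top} r + (1 - \alpha)\, p,
\qquad
P_{ij} = \frac{w_{ij}}{\sum_{k} w_{ik}}
$$

Here $r$ is the rank vector, $\alpha$ is the damping factor, $p$ is the personalization vector normalized to sum to 1, and $w_{ij}$ is the weight of the edge from vertex $i$ to vertex $j$ (taken as 1 for every edge in an unweighted graph). The personalization values only affect $p$, while edge weights only affect $P$. RWR from a single vertex $s$ is the special case where $p$ is 1 at $s$ and 0 elsewhere.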

Please see this Stack Overflow thread to understand how NetworkX performs RWR on graphs with and without edge weights: https://stackoverflow.com/questions/61337245/how-to-implement-random-walks-with-restarts-in-python

I have also attached screenshots comparing the two cases. RWR without edge weights: [image]. RWR with edge weights: [image].

afender commented 3 years ago

Of course! Thanks for clarifying; I did indeed have personalization weights in mind. This version of PageRank does not support edge weights, independently of personalization.

We know it is an important use case and have been working on upgrading it. We already have an experimental replacement supporting weights at the CUDA level (@seunghwak can provide more details on status and availability). It should be connected to the Python API in future releases and will support distributed execution on multiple GPUs.

seunghwak commented 3 years ago

As @afender said, we have two PageRank implementations. The first one does not use edge weights, and the second one uses edge weights if the input graph is weighted. The second version is currently used only in the multi-GPU setting, but once we do some additional performance tuning it will eventually replace the first one in the single-GPU setting as well. This should happen in the not-too-distant future.
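For readers who want to see what "uses edge weights if the input graph is weighted" means in practice, below is a minimal CPU-only sketch in plain Python. It is not cuGraph's implementation and does not use the cuGraph API. The function name `personalized_pagerank`, its parameters, and the toy graph are all hypothetical, and the handling of dangling vertices (redistributing their mass according to the personalization vector) is an assumption of this sketch.

```python
from collections import defaultdict


def personalized_pagerank(edges, personalization, alpha=0.85,
                          max_iter=200, tol=1e-12):
    """Power iteration for personalized PageRank (illustrative sketch).

    edges: iterable of (src, dst) or (src, dst, weight) tuples.
           Edges without a weight are treated as weight 1.0.
    personalization: dict mapping vertex -> non-negative restart value.
    """
    out = defaultdict(dict)  # out[src][dst] = accumulated edge weight
    vertices = set()
    for edge in edges:
        src, dst = edge[0], edge[1]
        weight = float(edge[2]) if len(edge) == 3 else 1.0
        out[src][dst] = out[src].get(dst, 0.0) + weight
        vertices.update((src, dst))

    # Normalize the personalization values into a restart distribution.
    total = sum(personalization.values())
    restart = {v: personalization.get(v, 0.0) / total for v in vertices}
    out_sum = {v: sum(out[v].values()) for v in vertices}

    rank = dict(restart)
    for _ in range(max_iter):
        # Assumption: mass on dangling vertices returns via the restart vector.
        dangling = sum(rank[v] for v in vertices if out_sum[v] == 0.0)
        new_rank = {v: (1.0 - alpha + alpha * dangling) * restart[v]
                    for v in vertices}
        for src in vertices:
            if out_sum[src] == 0.0:
                continue
            for dst, weight in out[src].items():
                # Transition probability is proportional to the edge weight.
                new_rank[dst] += alpha * rank[src] * weight / out_sum[src]
        delta = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy graph: the walk restarts at "a" only (RWR from "a").
restart_at_a = {"a": 1.0}
unweighted = personalized_pagerank(
    [("a", "b"), ("a", "c"), ("b", "a"), ("c", "a")], restart_at_a)
weighted = personalized_pagerank(
    [("a", "b", 0.9), ("a", "c", 0.1), ("b", "a", 1.0), ("c", "a", 1.0)],
    restart_at_a)
```

In the unweighted run, "b" and "c" are symmetric and receive the same score. In the weighted run, "b" scores higher than "c" because the edge from "a" to "b" carries more weight, even though the personalization values are identical in both runs.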

barondra commented 3 years ago

Thank you so much for the help and efforts, everyone. I haven't tried the multi-GPU setting, but I would rather use the other GPUs for ablation experiments. And to be fair, I typed "weight edges" when I meant "weighted edges" or "edge weights". Looking forward to the next release!