rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library
https://docs.rapids.ai/api/cugraph/stable/
Apache License 2.0

[QST]: Enable discussion tab on github repo? / nx-cugraph vs. cugraph benchmark #4273

Open raybellwaves opened 8 months ago

raybellwaves commented 8 months ago

What is your question?

Is there any interest in opening the discussion tab up in this repo? (https://docs.github.com/en/discussions/quickstart)

I find discussions a good place to host user questions that aren't suited to a traditional issue (bug or feature request).

If the discussion tab opens up, I'll be happy to move this over and close this issue.

I watched the GTC talk by @MridulS and @rlratzel (https://register.nvidia.com/flow/nvidia/gtcs24/attendeeportaldigital/page/sessioncatalog?search=S61674&tab.allsessions=1700692987788001F1cG ; very cool talk, thanks a lot!)

On the benchmark slide (slide 16 in the provided slides) there were three rows: NetworkX, NetworkX + nx-cugraph (cold), and NetworkX + nx-cugraph. I was curious about the speed of pure cugraph, i.e., what the upper limit is and what the cost of converting from the NetworkX API is. I'm thinking along the lines of the cudf.pandas profiler, to identify when dispatching happens to the CPU.

I created a Colab notebook, which you can find here, where I timed betweenness_centrality for NetworkX, NetworkX + nx-cugraph (cold), NetworkX + nx-cugraph (warm), and cugraph (a rough sketch of the timing code follows the results). If interested, the results are below:

| Setup (k=10) | Time |
| --- | --- |
| NetworkX | 2 min 15 s |
| NetworkX + nx-cugraph (cold) | 17.6 s |
| NetworkX + nx-cugraph (warm) | 13.1 s |
| cugraph | 323 ms |
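For reference, here's a minimal sketch of this kind of comparison (not the notebook's actual code: the Erdős–Rényi graph is a stand-in for the real dataset, the `backend="cugraph"` keyword requires NetworkX ≥ 3.2 dispatching, and the cuGraph edge-list calls are the standard API as I understand it):

```python
import time

import networkx as nx
import pandas as pd

G = nx.erdos_renyi_graph(10_000, 0.001, seed=42)  # placeholder graph

def timed(label, fn, *args, **kwargs):
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.perf_counter() - start:.3f} s")
    return result

# Pure NetworkX on the CPU
timed("NetworkX", nx.betweenness_centrality, G, k=10, seed=42)

# NetworkX dispatching to the nx-cugraph backend; the first call pays
# import/context/conversion costs ("cold"), the second is "warm" in the
# notebook's sense
timed("nx-cugraph (cold)", nx.betweenness_centrality, G, k=10, seed=42,
      backend="cugraph")
timed("nx-cugraph (warm)", nx.betweenness_centrality, G, k=10, seed=42,
      backend="cugraph")

# Pure cuGraph: build the graph directly from a cuDF edge list
import cudf
import cugraph

edges = cudf.from_pandas(pd.DataFrame(list(G.edges()), columns=["src", "dst"]))
cG = cugraph.Graph()
cG.from_cudf_edgelist(edges, source="src", destination="dst")
timed("cugraph", cugraph.betweenness_centrality, cG, k=10)
```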


eriknw commented 8 months ago

Hi @raybellwaves, thanks for the questions, suggestions, positive comments, and giving nx-cugraph a try!

Regarding "cold" vs. "warm": "warm" means something different in the slide than in your notebook (where "warm" means the libraries have been imported and the GPU context has been created, which is a reasonable interpretation). @rlratzel ran these benchmarks (and made that slide), and your question is exactly the "what if...?" he was worried about. I believe "warm" in the benchmarking slide means the graph has already been converted to nx-cugraph and resides on the GPU. That requires either passing an nx-cugraph Graph to NetworkX, or enabling caching of backend graph conversions and having a cached nx-cugraph graph. We expect caching to land in NetworkX 3.3 (the PR is open and ready to go in), which should be released soon.
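A minimal sketch of the two "warm" paths described above. This assumes nx_cugraph exposes `from_networkx()` and that the NetworkX 3.3 caching switch is spelled `nx.config.cache_converted_graphs`; both are worth verifying against the current docs:

```python
import networkx as nx
import nx_cugraph as nxcg

G = nx.karate_club_graph()

# Path 1: convert once up front; dispatching sees an nx-cugraph graph
# and runs on the GPU with no per-call conversion.
G_gpu = nxcg.from_networkx(G)
bc = nx.betweenness_centrality(G_gpu, k=10)

# Path 2 (NetworkX 3.3+): cache the converted graph on G itself so
# repeated calls reuse the GPU copy instead of reconverting.
nx.config.cache_converted_graphs = True  # assumed config name
bc = nx.betweenness_centrality(G, k=10, backend="cugraph")  # converts + caches
bc = nx.betweenness_centrality(G, k=10, backend="cugraph")  # reuses cached copy
```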

Without caching, graph conversion can take a non-trivial amount of time for large graphs. We made it as fast as we reasonably could (for example, it's much faster than nx.to_scipy_sparse_array), but it still has to handle a lot of pure Python objects. NetworkX 3.3 is also adding "should_run", which lets NetworkX ask backends whether it's worth converting a graph to them to run an algorithm. We don't use this yet, but we plan to soon.
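As a rough illustration (the exact hook names and return conventions should be checked against the NetworkX 3.3 dispatch docs, so treat this as a sketch, not a verified API), a backend's "should_run" hook might look like this:

```python
# Illustrative only: roughly what a backend's dispatch hooks could look
# like under the NetworkX 3.3 interface.
class SketchBackend:
    @staticmethod
    def can_run(name, args, kwargs):
        # "Can this backend run the algorithm at all?"
        return name == "betweenness_centrality"

    @staticmethod
    def should_run(name, args, kwargs):
        # "Is it worth converting the graph to run it here?"
        # (My understanding: returning a string means "no, and here's why".)
        G = args[0] if args else kwargs.get("G")
        if G is not None and G.number_of_nodes() < 1_000:
            return "graph is too small to benefit from the GPU"
        return True
```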

I really like the profiling idea! I know that's pretty slick with cudf.pandas. I bet we can do something similar, preferably in a generic way that supports other backends too.
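Nothing like this ships today, but as a purely hypothetical sketch, one could approximate a cudf.pandas-style report by trying the GPU backend explicitly and noting fallbacks (this assumes NetworkX raises NotImplementedError when an explicitly requested backend can't run the function):

```python
import time

import networkx as nx

def profile_dispatch(func, /, *args, **kwargs):
    # Hypothetical helper: time a dispatchable NetworkX call and report
    # whether it ran on the GPU backend or fell back to the CPU.
    start = time.perf_counter()
    try:
        result = func(*args, backend="cugraph", **kwargs)
        where = "GPU (nx-cugraph)"
    except NotImplementedError:
        result = func(*args, **kwargs)  # CPU fallback
        where = "CPU (NetworkX)"
    print(f"{func.__name__}: {where}, {time.perf_counter() - start:.3f} s")
    return result

G = nx.karate_club_graph()
bc = profile_dispatch(nx.betweenness_centrality, G, k=10)
```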

Finally, I don't have a strong opinion on GH issues vs GH discussions. For now, it's fine to ask questions via issues.

CC @rlratzel, do you want to add anything?