rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library
https://docs.rapids.ai/api/cugraph/stable/
Apache License 2.0
1.77k stars 304 forks source link

[QST]: Should nx-cugraph GPU graph caching be opt-out rather than opt-in? #4467

Closed beckernick closed 3 months ago

beckernick commented 5 months ago

What is your question?

Converting a CPU NetworkX graph to a GPU graph can be slow, so nx-cuGraph provides a configuration option that caches the converted graph so users only need to pay this cost once regardless of how many algorithms they run on the graph (NETWORKX_CACHE_CONVERTED_GRAPHS=True).

This feels high-impact if users frequently run multiple algorithms per graph. Is there any reason we wouldn't want to turn this on by default when someone is using nx-cuGraph? E.g., are the graphs people frequently use with NetworkX algorithms particularly large in terms of memory? Or, do we expect people to be manually mutating the graphs?

Code of Conduct

eriknw commented 5 months ago

This is a fair question, thanks @beckernick.

The default configuration is chosen by NetworkX, and I think there is appetite for having the default be True. As I recall, the default was chosen to be False at first for safety reasons, because it is possible for users to modify the NetworkX graph in ways that don't invalidate the cache. We discussed changing the default to True (since, indeed, if you're using a backend you most likely want to cache conversions) if no UX issues come up. I'll bring this up as a thing to try to change for the next NetworkX release.

Additionally, we can override the config so that caching is enabled by default if nx-cugraph is installed in the environment. This would be easy to do, and we can update our docs to inform our users that we do this. @rlratzel what do you think of automatically enabling caching if nx-cugraph is installed? Alternatively, should this be a per-backend config, and each backend can choose its default in the backend_info entry point?

eriknw commented 5 months ago

We discussed this issue during our nx dispatch meeting this week and I think there was general agreement that we could enable caching by default. I opened a PR here: https://github.com/networkx/networkx/pull/7498

rlratzel commented 3 months ago

Yep, it's opt-out now, and the associated warning will also be made to be easily suppressed by users that are aware of the edge cases (https://github.com/networkx/networkx/pull/7497).