networkx / nx-parallel

A networkx backend that uses joblib to run graph algorithms in parallel.
BSD 3-Clause "New" or "Revised" License
34 stars 21 forks source link

WIP: Adding config to nx-parallel #68

Closed Schefflera-Arboricola closed 3 months ago

Schefflera-Arboricola commented 5 months ago

This PR adds the ability to modify the default joblib's parallel configs by integrating nx-parallel with networkx's config manager.

relevant config PRs in networkx: https://github.com/networkx/networkx/pull/7363 , https://github.com/networkx/networkx/pull/7225 ref. Config.md file in this PR for usage. ref. GSoC'24 Blog 6 for more context.

Thank you :)

Schefflera-Arboricola commented 4 months ago

For proper integration, I will need to change a few lines of code in all the algorithms. So far, I've modified square_clustering to work well with NetworkX's config, which included renaming cpu_count to get_n_jobs. However, I've reverted get_n_jobs back to cpu_count because the tests were failing, as other algorithms still use cpu_count. I'll proceed with updating the other algorithms once I receive the first review. I think changing the name across all algorithms right now will over-crowd this PR and will make it challenging to review. LMK if that seems good to you @dschult .

Thank you :)

dschult commented 3 months ago

After reading the blog about adding config I am more in favor of avoiding the use of the networkx config system for controlling the nx-parallel use of joblib. Said another way, I think it would be better to keep the user in the joblib config system and support no joblib related parameters in nx.config.

Duplicating the joblib parameters doesn't make sense to me. And maintaining two (or creating a third) config system seems to ask for confusion and duplication. Can we just rely on users using the joblib config system?

Schefflera-Arboricola commented 3 months ago

After reading the blog about adding config I am more in favor of avoiding the use of the networkx config system for controlling the nx-parallel use of joblib. Said another way, I think it would be better to keep the user in the joblib config system and support no joblib related parameters in nx.config.

Duplicating the joblib parameters doesn't make sense to me. And maintaining two (or creating a third) config system seems to ask for confusion and duplication. Can we just rely on users using the joblib config system?

nx-parallel is a networkx backend that uses joblib(right now), not a joblib backend for graphs algorithms. So, it makes sense that we recommend our users to use the networkx's config context manager over the joblib's. Because, if a networkx user wants to use nx-parallel with some other backend(s)(like, with nx-cugraphs -- like backend_priority=[cugraph, parallel]) then they would have to set the nx-parallel's configs in the nx.config because this user would not want to run their code with a joblib's context manager when running with nx-cugraphs' implementation(because it will probably fail). And doing what you are suggesting will deprive nx-parallel users of all those features and functionalities that lets(or will let) them use multiple networkx backends together and this would probably exclude nx-parallel from the networkx backend ecosystem.

And we only have one config system i.e. NetworkX's config, which can also be used as a context manager. And joblib's context manager will not work as expected with the config system implemented in this PR(as outlined in the blog you mentioned above).

The only maintenance needed will be to update the ParallelConfig class, if joblib ever changes the args of the parallel_config class, which I think is very unlikely to happen very frequently. Also if it does happen, joblib has an appropriately long deprecation period.

Also, the main concern of mine(also mentioned in the blog) with the current implementation in this PR, is that the fetching of the global configs in every function call is redundant but it is not very expensive. It's like converting ParallelGraph instance into a nx.Graph instance in every nx-parallel algorithm.

Thank you for your comments. I hope I addressed them all.

dschult commented 3 months ago

nx-parallel is a networkx backend that uses joblib(right now), not a joblib backend for graphs algorithms. So, it makes sense that we recommend our users to use the networkx's config context manager over the joblib's. Because, if a networkx user wants to use nx-parallel with some other backend(s)(like, with nx-cugraphs -- like backend_priority=[cugraph, parallel]) then they would have to set the nx-parallel's configs in the nx.config because this user would not want to run their code with a joblib's context manager when running with nx-cugraphs' implementation(because it will probably fail). And doing what you are suggesting will deprive nx-parallel users of all those features and functionalities that lets(or will let) them use multiple networkx backends together and this would probably exclude nx-parallel from the networkx backend ecosystem.

Can you help me understand this better? Are we expecting every networkx backend to support running in an environment with every other networkx backend's configs? Said another way, why do we need to worry about users having two nx backends in their priority order when making nx-parallel look for joblib config values? It seems like the nx-parallel backend will look for joblib config settings and the nx-cugraph backend will look for cugraph config settings. How would these break each other?

Schefflera-Arboricola commented 3 months ago

Can you help me understand this better? Are we expecting every networkx backend to support running in an environment with every other networkx backend's configs? Said another way, why do we need to worry about users having two nx backends in their priority order when making nx-parallel look for joblib config values? It seems like the nx-parallel backend will look for joblib config settings and the nx-cugraph backend will look for cugraph config settings. How would these break each other?

It looks like I should expand more on what I wrote in blog about how NetworkX's config works and how nx-parallel integrates with the NetworkX configuration system and why it's essential to have its configs in nx.config instead of relying solely on joblib's context manager.

The unified configuration management system within NetworkX (nx.config) ensures consistency and ease of use when dealing with different backends. Each backend, such as nx-parallel and nx-cugraph, will have its configurations managed within this unified system.

If each backend looked for its configurations separately (e.g., nx-parallel using joblib's context manager and nx-cugraph using its own system), it could lead to conflicts and difficulties in managing multiple backends. By having all backend configurations within nx.config, we ensure that they can coexist and be prioritized without breaking each other. Also this is easier for the user also to specify all the configs for all the backends at one place.

Addressing your questions

So, to summarise, keeping all backend configs within nx.config ensures consistency and ease of management, prevents conflicts that could arise from using multiple backends with separate configuration systems. By having nx-parallel configurations within nx.config, we ensure that it can work seamlessly with other NetworkX backends. And that's why we use nx.config over joblib's context manager. Also joblib's context manager cannot fix global configs like nx.config.backends.parallel.n_jobs = 8, it can only set configs in a context.

If you need further clarification or have additional questions, please let me know.

dschult commented 3 months ago

Thanks for that description of the advantages of using nx.config.backends over relying on other configuration systems. I still have questions/concerns and I'll try to describe them here.

Thanks for this discussion as I think I'm learning about config systems generally as part of it. :)

Schefflera-Arboricola commented 3 months ago

if each backend relied ..... complicated to me -- not simpler.

Sure, backends can have a different configs system if they want. But, in case of nx-parallel, if we only use joblib's config system, there are certain issues, like as mentioned above, a user won't be able to define global configs as joblib only offers context manager parallel_config to specify/update the configs, because for joblib these are not config options but parameters. So, then if the user only uses joblib’s context manager and puts all their code in a joblib.parallel_config context then while running the nx-cugraph implementation the joblib’s context manager will not be ignored and joblib will be used, which is not what a user would expect.

And one does not need to “put all the config options for all the backends in a single line of code”. These are the default configs:

NetworkXConfig(
    backend_priority=[cugraph, parallel],
    backends=Config(
        cugraph=<configs_for_cugraphs>,
        parallel=ParallelConfig(
            backend='loky',
            n_jobs=None,
            verbose=0,
            temp_folder=None,
            max_nbytes='1M',
            mmap_mode='r',
            prefer=None,
            require=None,
            inner_max_num_threads=None,
            backend_params={}
        )
    ),
    cache_converted_graphs=True
)

And one can modify the ones they want like this:

nxp_configs = nx.config.backends.parallel
nxp_configs.backend = “threading”
nxp_configs.n_jobs = 7

nxc_configs = nx.config.backends.cugraph
…

I hope this looks less complicated to you.

When we make all .... all the backends).

As mentioned in the Config.md, one can use the nxp_configs(from above) just like they use joblib.parallel_config context manger, so a user does not need to learn a third config system. Also, someone using a networkx backend is expected to understand networkx's config, I think, and then they would learn config systems specific to the backend(which right now should be very easy for the nx-parallel backend). And I prefer it this way, because nxp_configs can be used as a global variable as well as a context.

In what way does... we are trying to avoid?

I’m not sure if I understand this one properly... but most of the code for managing different backends and their configs is in the main networkx repo not in nx-parallel. And the conflict is that to set the joblib’s configs we need to use the joblib’s context manager which won’t be ignored in nx-cugraphs implementation and also we cannot set global configs in joblib.

If I ....in nx-parallel"?

Yes, afaik joblib's context manager is the only way to set the config for joblib.Parallel. No, right now a user cannot use joblib’s context to set joblib’s parameters in nx-parallel. They can but it won’t work as expected. nx-parallel configs and joblib’s configs are independent but same. Also, in joblib n_jobs is a parameter not a config, so when I say ”globally“ or “global config” I mean it in nx-parallel not joblib.

Thank you :)

Schefflera-Arboricola commented 3 months ago

Made default n_jobs=-1 instead of None because I think this will be beneficial for the new users, as it would allow them to run their algorithms on all CPU cores(with the default backend="loky") without needing to understand or configure the parallel options in detail. LMK what you think. Thanks!

Schefflera-Arboricola commented 3 months ago

closing this PR. PR https://github.com/networkx/nx-parallel/pull/75 documents configuring using joblib and integrates networkx's config in nx-parallel. This PR attempted to create a unified config system but it's not perfect. Refer Issue https://github.com/networkx/nx-parallel/issues/76 for more. Thanks!