networkx / nx-parallel

A networkx backend that uses joblib to run graph algorithms in parallel.
BSD 3-Clause "New" or "Revised" License

WIP: Refactor Parallel Graph Algorithms to Use a Centralized Parallel Configuration with Flexible Iterators #86

Open dPys opened 1 month ago

dPys commented 1 month ago

Summary:

This PR introduces a new execute_parallel function that centralizes the parallelization logic, so each algorithm no longer needs its own joblib call, and gives both developers and users more flexibility.

Key Updates:

  1. Add execute_parallel:

    • Centralizes logic for parallel execution across data chunks.
    • Uses iterator_func to customize how data (nodes, edges, etc.) is iterated over.
    • Supports both default and custom chunking via get_chunks.
  2. Remove create_iterables:

    • Simplified by moving to iterator_func within execute_parallel.
  3. Thread-safe Joblib config:

    • Adds a parallel_config context manager that uses thread-local storage to manage Joblib settings (such as backend and verbose) so that concurrent runs do not interfere with one another.
  4. Refactor betweenness_centrality & edge_betweenness_centrality to use execute_parallel as a POC.

Why This Matters:

TODO:

dschult commented 1 month ago

These changes to the centralized processing of chunks seem good. I like the general iterator_func capability, and it still allows inclusion of common cases like "nodes".

So +1 on the interface from me, though I'd like to hear from @Schefflera-Arboricola too. :)

Thanks for this!