Open dPys opened 1 month ago
These changes to the centralized processing of chunks seem good. I like the general iterator_func
capability, and it still allows inclusion of common cases like "nodes".
So +1 on the interface from me, though I'd like to hear from @Schefflera-Arboricola too. :)
Thanks for this!
Summary:
This PR introduces a new `execute_parallel` function that simplifies algorithm parallelization logic by obviating the need for separate joblib calls for each algorithm while enabling much greater flexibility (for developers and the user).

Key Updates:
- Add `execute_parallel`:
  - `iterator_func` to customize how data (nodes, edges, etc.) is iterated over.
  - `get_chunks`.
- Remove `create_iterables`:
  - `iterator_func` within `execute_parallel`.
- Thread-safe Joblib config:
  - `parallel_config` context manager that uses thread-local storage to manage Joblib settings (like backend/verbose) without interference during concurrent runs.
- Refactor `betweenness_centrality` & `edge_betweenness_centrality` to use `execute_parallel` as a POC.

Why This Matters:
- `execute_parallel` and `iterator_func` make this adaptable for any graph algs that need parallelism.
- `create_iterables`.

TODO:
- Add unit test for `execute_parallel` once the interface gets the green-light
- Perform regression test to ensure equivalent functionality
- `execute_parallel` in place of existing separate joblib calls
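
For anyone reviewing the interface, here is a rough, dependency-free sketch of how the pieces described under Key Updates could fit together. It is not the code in this PR. The signatures of `execute_parallel` and `parallel_config`, the helpers `get_config` and `default_chunks`, and the `"chunks"` default for `get_chunks` are all assumptions made for illustration. It also swaps joblib for the standard-library `ThreadPoolExecutor` and uses a plain adjacency dict instead of a NetworkX graph, so that it runs without extra packages.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager

# Thread-local storage: each thread sees only its own overrides.
_local = threading.local()
_DEFAULTS = {"n_jobs": 2, "verbose": 0}


def get_config():
    """Return the settings active in the calling thread (hypothetical helper)."""
    return getattr(_local, "config", _DEFAULTS)


@contextmanager
def parallel_config(**overrides):
    """Temporarily override settings for the current thread only."""
    previous = get_config()
    _local.config = {**previous, **overrides}
    try:
        yield
    finally:
        _local.config = previous  # restore on exit, even after an error


def default_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when items are scarce
            chunks.append(items[start:end])
        start = end
    return chunks


def execute_parallel(G, process_func, iterator_func, get_chunks="chunks"):
    """Run process_func on each chunk of iterator_func(G); return results in order."""
    n_jobs = get_config()["n_jobs"]  # read in the caller's thread
    data = list(iterator_func(G))
    if get_chunks == "chunks":
        chunks = default_chunks(data, n_jobs)
    else:
        chunks = list(get_chunks(data))  # user-supplied chunking
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(process_func, chunks))


# Usage: the same entry point iterates over nodes or over edges.
G = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}


def degrees(chunk):
    return {node: len(G[node]) for node in chunk}


with parallel_config(n_jobs=3):
    node_parts = execute_parallel(G, degrees, iterator_func=lambda g: g.keys())

merged = {}
for part in node_parts:
    merged.update(part)

edge_counts = execute_parallel(
    G, len, iterator_func=lambda g: [(u, v) for u in g for v in g[u]]
)
```

The point of the sketch is that only `iterator_func` changes between the node-based and edge-based calls, and that settings applied through the context manager are read in the calling thread and restored on exit, so concurrent callers in other threads would not see each other's overrides.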