tomalrussell / snkit

spatial networks toolkit (python)
MIT License

perf/parallel_split_edges_at_nodes #60

Closed thomas-fred closed 1 year ago

thomas-fred commented 1 year ago

When running link_nodes_to_nearest_edge on a large network (gridfinder, with 1.6M nodes and 3.6M edges), most of the processing time is spent in split_edges_at_nodes. This PR parallelises that step across processes, which cut the runtime from ~days to an hour or so.

tomalrussell commented 1 year ago

Thanks for this @thomas-fred - clearly a useful speed-up, and I think we should merge it, even if it's the lone use of multiprocessing in the codebase for now.

Let's catch up to think through what a more performance-oriented design could look like.