kuanb / peartree

peartree: A library for converting transit data into a directed graph for sketch network analysis.
MIT License
201 stars 23 forks

[performance] WIP Parallelization of route edge and wait costing iteration #51

Closed kuanb closed 6 years ago

kuanb commented 6 years ago

Partially (incrementally) addressing issue https://github.com/kuanb/peartree/issues/12

Parallelizes the target route processing operation, process_route_edges_and_wait_times, via dask distributed. This allows for a modular parallelization architecture that could, in the future, leverage external resources (useful for large graphs, tethering together whole regions, etc.).
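A minimal sketch of the fan-out pattern described above, not the code in this PR. It uses the standard library's concurrent.futures as a stand-in for dask distributed (whose Client offers a similar submit interface). The names process_route and process_all_routes, the headway inputs, and the half-mean-headway wait cost are all hypothetical, chosen only for illustration.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical per-route result: (route_id, edge_costs, wait_time)
RouteResult = Tuple[str, List[float], float]


def process_route(route_id: str, headways: List[float]) -> RouteResult:
    """Hypothetical stand-in for costing one route's edges and wait time."""
    # Illustrative edge costs: here simply a copy of the input values.
    edge_costs = list(headways)
    # Illustrative wait cost: half of the mean headway (an assumption).
    wait_time = (sum(headways) / len(headways)) / 2
    return route_id, edge_costs, wait_time


def process_all_routes(
    routes: Dict[str, List[float]], executor: Executor
) -> List[RouteResult]:
    """Submit one task per route, then gather results in submission order."""
    futures = [
        executor.submit(process_route, route_id, headways)
        for route_id, headways in routes.items()
    ]
    return [future.result() for future in futures]


# Usage: each route is processed as an independent task.
routes = {"A": [10.0, 20.0], "B": [6.0]}
with ThreadPoolExecutor(max_workers=2) as pool:
    results = process_all_routes(routes, pool)
```

Because process_all_routes only depends on the executor's submit method, the local thread pool could in principle be swapped for a process pool or a remote cluster without changing the per-route function.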

TODO: Address GIL-holding pandas operations that trigger the following warning:

distributed.core - WARNING - Event loop was unresponsive in Scheduler for 1.18s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.

The solution here will be to also distribute some of the subroutines being performed within each route processor.

Ensure the dictionary roll-up (array_bag) is working as intended. Currently encountering an unintended NameError.
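For reference, one way a dictionary roll-up of per-route results could look. This is a sketch under assumptions: the function name roll_up and the shape of the partial results (dictionaries mapping a key to a list of values) are hypothetical and are not taken from the array_bag code in this PR.

```python
from typing import Dict, Iterable, List


def roll_up(partials: Iterable[Dict[str, List[float]]]) -> Dict[str, List[float]]:
    """Merge per-route partial dictionaries into one combined dictionary."""
    # The accumulator is defined before the loop so it always exists,
    # even when no partial results are returned.
    merged: Dict[str, List[float]] = {}
    for partial in partials:
        for key, values in partial.items():
            # Append to an existing entry, or start a new list for a new key.
            merged.setdefault(key, []).extend(values)
    return merged


# Usage: two workers report values, one key overlapping.
combined = roll_up([{"a": [1.0]}, {"a": [2.0], "b": [3.0]}])
```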

kuanb commented 6 years ago

Closing in favor of: https://github.com/kuanb/peartree/pull/53