nismod / open-gira

Open-data Global Infrastructure Risk/Resilience Analysis
https://nismod.github.io/open-gira/
MIT License

Poor parallelism given spatial heterogeneity of network density #84

Open thomas-fred opened 1 year ago

thomas-fred commented 1 year ago

Currently we parallelise by spatial slice. For n^2 slices, we construct an n by n grid of square cells covering the bounding box of the problem geometry. If, as is often the case, most network elements lie within a handful of the n^2 cells, those slices take a long time to run, while the remaining slices are nearly empty and complete very quickly. Total run time is then dominated by a few jobs, which delivers a poor degree of parallelism.
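To illustrate the imbalance, here is a minimal sketch of uniform grid slicing. It is not the open-gira implementation: the function name `grid_cell_counts`, the point-based representation of network elements and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def grid_cell_counts(points, bbox, n):
    """Count points falling in each cell of an n-by-n grid covering bbox.

    Hypothetical helper: points are (x, y) tuples standing in for network
    elements, and bbox is (minx, miny, maxx, maxy).
    """
    minx, miny, maxx, maxy = bbox
    width = (maxx - minx) / n
    height = (maxy - miny) / n
    counts = Counter()
    for x, y in points:
        # Clamp so that points on the upper edges land in the last cell.
        col = min(int((x - minx) / width), n - 1)
        row = min(int((y - miny) / height), n - 1)
        counts[(row, col)] += 1
    return counts


# Made-up data: 90 elements clustered in one corner, 10 scattered elsewhere.
clustered = [(0.1 + 0.001 * i, 0.1 + 0.001 * i) for i in range(90)]
sparse = [(0.3 + 0.07 * i, 0.9) for i in range(10)]
counts = grid_cell_counts(clustered + sparse, (0.0, 0.0, 1.0, 1.0), n=4)

# The busiest cell holds most of the work; most of the 16 cells are empty.
busiest_cell, busiest_count = counts.most_common(1)[0]
print(busiest_cell, busiest_count, len(counts))
```

With this data a single cell out of 16 holds 90% of the elements, so one job does nearly all of the work regardless of how many workers are available.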

One solution is to significantly increase the slice count, so that areas with many network elements are subdivided. This does improve matters, but it leads to a large number of jobs and file writes, because emptier areas are also subdivided into cells containing very few network elements.

Another solution would be to slice the problem bounding box into a variable-density mesh, with the resolution weighted by population density. I think a quadtree would fit within our current workflow without too much difficulty.
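A rough sketch of the quadtree idea follows. Note the assumptions: it splits on the number of network elements in a cell rather than on population density, it treats elements as (x, y) points, and the names `quadtree_cells`, `max_points` and `max_depth` are hypothetical. A population raster could be substituted as the weighting in the same recursion.

```python
def quadtree_cells(points, bbox, max_points, max_depth=8):
    """Recursively split bbox into quadrants until each leaf holds at most
    max_points points, or max_depth is exhausted.

    Returns a list of (bbox, point_count) leaves; empty leaves are dropped
    so that no job is created for them.
    """
    if not points:
        return []
    if len(points) <= max_points or max_depth == 0:
        return [(bbox, len(points))]

    minx, miny, maxx, maxy = bbox
    midx = (minx + maxx) / 2
    midy = (miny + maxy) / 2
    # Key is (is_east, is_north) relative to the cell midpoint.
    quadrants = {
        (False, False): (minx, miny, midx, midy),
        (True, False): (midx, miny, maxx, midy),
        (False, True): (minx, midy, midx, maxy),
        (True, True): (midx, midy, maxx, maxy),
    }
    buckets = {key: [] for key in quadrants}
    for x, y in points:
        buckets[(x >= midx, y >= midy)].append((x, y))

    cells = []
    for key, child_bbox in quadrants.items():
        cells.extend(
            quadtree_cells(buckets[key], child_bbox, max_points, max_depth - 1)
        )
    return cells


# Made-up data: 90 elements clustered in one corner, 10 scattered elsewhere.
clustered = [(0.1 + 0.001 * i, 0.1 + 0.001 * i) for i in range(90)]
sparse = [(0.3 + 0.07 * i, 0.9) for i in range(10)]
cells = quadtree_cells(clustered + sparse, (0.0, 0.0, 1.0, 1.0), max_points=20)

for cell_bbox, count in cells:
    print(cell_bbox, count)
```

Dense areas are subdivided deeply while sparse areas stay as a few large cells, so the job count tracks the amount of work instead of the area covered. Each leaf bounding box could then play the role that a grid cell plays today.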