Poor parallelism given spatial heterogeneity of network density

Currently our parallelism is by spatial slices. For n^2 slices, we construct a grid of n by n square cells to cover the bounding box of the problem geometry. If, as is often the case, most elements lie within a handful of the n^2 cells, these slices take a long time to run, while the other slices are near empty and complete very quickly, delivering a poor degree of parallelism.

One solution to this is significantly increasing the slice count, so the areas with many network elements are subdivided. This does improve matters, but leads to a large number of jobs and file writing (emptier areas are also subdivided into cells containing very few network elements).

Another solution would be to slice the problem bounding box into a variable density mesh (with the resolution weighted by population density). I think a quadtree would fit within our current workflow without too much difficulty.

nismod / open-gira

Poor parallelism given spatial heterogeneity of network density #84