## Pull request type
Please check the type of change your PR introduces:
- [ ] Bugfix
- [x] Feature
- [ ] Code style update (formatting, renaming)
- [ ] Refactoring (no functional changes, no api changes)
- [ ] Build related changes
- [ ] Documentation content changes
- [ ] Other (please describe):
## What is the current behavior?
Using `RangePartitioner` on the z-index of the tile.
## What is the new behavior?
Using `HashPartitioner`.
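For illustration, a minimal sketch of the swap, assuming features are already keyed by the tile's z-index; the key type, feature type, and method name here are placeholders, not the code in this PR:

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object TilePartitioning {
  // F stands in for whatever feature type the job carries.
  def byTileHash[F: ClassTag](features: RDD[(Long, F)], numPartitions: Int): RDD[(Long, F)] =
    // Previously: features.partitionBy(new RangePartitioner(numPartitions, features))
    // Now: hash the z-index, so placement does not depend on the ordering of keys.
    features.partitionBy(new HashPartitioner(numPartitions))
}
```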
## Does this introduce a breaking change?
- [ ] Yes
- [x] No
## Other information
This is a performance-oriented change.
In this context, the work being done is fetching raster tiles, each keyed by a z-index, and running a polygonal summary for the geometries that overlap each tile. The features are grouped per tile within a partition to avoid re-fetching tiles, which is expensive.
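As a rough sketch of that per-tile grouping (the `fetchTile` and `polygonalSummary` parameters are hypothetical stand-ins for the real calls, not names from this codebase):

```scala
// Inside one partition: group the (tileId, feature) pairs so each raster tile is
// fetched only once and the polygonal summary runs over all of its features.
def summarizePartition[F, T, S](
    rows: Iterator[(Long, F)],
    fetchTile: Long => T,
    polygonalSummary: (T, F) => S): Iterator[(Long, S)] =
  rows.toSeq
    .groupBy(_._1)                   // all features for a given tile end up together
    .iterator
    .flatMap { case (tileId, pairs) =>
      val tile = fetchTile(tileId)   // one fetch per tile per partition
      pairs.iterator.map { case (_, feature) => (tileId, polygonalSummary(tile, feature)) }
    }
```

In Spark terms something like this runs via `mapPartitions` after partitioning, so a tile is fetched at most once per partition that holds its key.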
There are two ways partitioning can become a problem here:
1. Heavy data skew on a few tiles: only a few tiles hold most of the features. Ideally those features are spread across several partitions so the work runs in parallel and no single partition lags behind.
2. Multiple partitions fetching the same tiles: each tile has relatively few features, but the work is spread over too many partitions, so we pay the tile transfer cost repeatedly for no benefit.
Thankfully, because we're working over a 1x1 degree grid, there can never be "too many" tiles overall, so it's safe to collect the full distribution of the job before deciding how to partition it. That lets us split any tile that has too many features (more than 4k) into multiple records that land on different partitions. This works out because the split indices are sequential, so we can expect the hash to put them on different partitions as long as the partition count is greater than the split count (and it always will be). In the default case, all other features gather on a single partition corresponding to their tile index. The order of tiles does not matter, since our unit of work is the tile, so we can hash over the partition count to get a better distribution.
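A minimal sketch of that splitting step, assuming the per-tile counts are collected up front. The 4k threshold comes from the description above, but the names (`splitHotTiles`, `maxFeaturesPerSplit`) and the modulo assignment of features to splits are illustrative assumptions, not the exact code in this PR:

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object SkewSplitting {
  val maxFeaturesPerSplit = 4000

  def splitHotTiles[F: ClassTag](
      features: RDD[(Long, F)],
      numPartitions: Int): RDD[((Long, Int), F)] = {
    // The 1x1 degree grid keeps the number of distinct tiles small, so collecting
    // the per-tile counts on the driver is cheap.
    val counts: Map[Long, Long] = features.countByKey().toMap

    features
      .map { case (tileId, feature) =>
        // Tiles over the threshold get split indices 0..splits-1; everything else
        // keeps a single split (index 0), i.e. one record group per tile.
        val splits = math.max(1, math.ceil(counts(tileId).toDouble / maxFeaturesPerSplit).toInt)
        val split  = Math.floorMod(feature.hashCode, splits)
        ((tileId, split), feature)
      }
      // Sequential split indices hash to different partitions as long as the
      // partition count exceeds the split count.
      .partitionBy(new HashPartitioner(numPartitions))
  }
}
```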
In testing so far this has been working well.
A change that may become desirable later is deciding how many tiles you want per partition; this would give roughly constant partition execution time.
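If that direction is pursued, the partition count could be derived from a target tiles-per-partition knob, something along these lines (hypothetical names, not in this PR):

```scala
// Pick the partition count from a target number of tiles per partition, so each
// partition does roughly the same amount of tile work.
def partitionCountFor(distinctTileCount: Long, tilesPerPartition: Int): Int =
  math.max(1, math.ceil(distinctTileCount.toDouble / tilesPerPartition).toInt)
```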