wri / gfw_forest_loss_geotrellis

Global Tree Cover Loss Analysis using Geotrellis and SPARK
MIT License

Partition by tile count during summary #132

Closed. echeipesh closed this 2 years ago.

echeipesh commented 2 years ago

Pull request type

Please check the type of change your PR introduces:

What is the current behavior?

Partition by total number of features.

What is the new behavior?

Partition by the number of tiles that need to be downloaded. At minimum, we respect the default parallelism, which gives the lowest runtime on small jobs. Otherwise, cap at 16 tiles per partition which seems.
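
The rule described above can be sketched as a small pure function. This is only an illustration, not the code from this PR: the names `TilePartitioning`, `numPartitions`, and `MaxTilesPerPartition` are hypothetical, and it assumes the tile count for the job is already known. It reads "respect default parallelism" as a lower bound on the partition count and "16 tiles per partition" as an upper bound on partition size.

```scala
object TilePartitioning {
  // Cap from the PR description: at most 16 tiles per partition.
  val MaxTilesPerPartition: Int = 16

  /** Hypothetical helper: partition count for a job that touches `tileCount` tiles.
    *
    * @param tileCount          number of tiles that need to be downloaded
    * @param defaultParallelism lower bound, e.g. the Spark context's default parallelism
    */
  def numPartitions(tileCount: Long, defaultParallelism: Int): Int = {
    require(tileCount >= 0, "tileCount must be non-negative")
    require(defaultParallelism > 0, "defaultParallelism must be positive")

    // Ceiling division: enough partitions that none holds more than the cap.
    val byTiles: Long = (tileCount + MaxTilesPerPartition - 1) / MaxTilesPerPartition

    // Never drop below default parallelism, so small jobs still use every core.
    math.max(defaultParallelism.toLong, byTiles).min(Int.MaxValue.toLong).toInt
  }
}
```

With a default parallelism of 8, a job touching 10 tiles keeps 8 partitions, while a job touching 1000 tiles gets 63. In a Spark job the lower bound would typically come from `sc.defaultParallelism`, and the result could be passed to something like `rdd.repartition(n)`; whether the PR does exactly that is an assumption here.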

Does this introduce a breaking change?