wri / gfw_forest_loss_geotrellis

Global Tree Cover Loss Analysis using Geotrellis and SPARK
MIT License

Reuse analysis results for same full-window geometries from different locations #237

Closed danscales closed 4 months ago

danscales commented 4 months ago


This is an optimization that Justin did a while ago for GFW in SummaryRDD.scala. I'm porting it to ErrorSummaryRDD.scala, which is what GFWPro uses.

If several locations each completely cover a window, then only do the analysis on that window once, and reuse the result as one of the partial results for all of those locations. This optimization is especially useful when locations overlap heavily, for example because there are many large, overlapping supply sheds, or because many related lists are batched together in the ALERTS-Batch analysis.
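To make the idea concrete, here is a minimal sketch of the reuse pattern in plain Scala. It is not the code in SummaryRDD.scala or ErrorSummaryRDD.scala: `Rect`, `Location`, `FullWindowReuse`, `analyze`, and `summarizeWindow` are all hypothetical names, geometries are simplified to axis-aligned rectangles, and the "analysis" is just an area computation standing in for the real raster summary.

```scala
// Hypothetical simplified types. The real code works on actual geometries and
// raster windows; rectangles are used here only to keep the sketch self-contained.
final case class Rect(xmin: Int, ymin: Int, xmax: Int, ymax: Int) {
  // True when this rectangle contains every point of `other`.
  def covers(other: Rect): Boolean =
    xmin <= other.xmin && ymin <= other.ymin &&
      xmax >= other.xmax && ymax >= other.ymax

  // Overlapping area with `other`, or 0 when the rectangles do not intersect.
  def intersectionArea(other: Rect): Long = {
    val w = math.min(xmax, other.xmax) - math.max(xmin, other.xmin)
    val h = math.min(ymax, other.ymax) - math.max(ymin, other.ymin)
    if (w > 0 && h > 0) w.toLong * h else 0L
  }
}

final case class Location(id: String, geom: Rect)

object FullWindowReuse {
  // Counts how often the stand-in analysis actually runs (for illustration only).
  var analysisRuns = 0

  // Stand-in for the expensive per-window analysis: the area of `geom` inside `window`.
  def analyze(window: Rect, geom: Rect): Long = {
    analysisRuns += 1
    window.intersectionArea(geom)
  }

  // Returns one partial result per location id for a single window.
  def summarizeWindow(window: Rect, locations: Seq[Location]): Map[String, Long] = {
    // Cheap check first: which locations cover the whole window?
    val (full, partial) = locations.partition(_.geom.covers(window))

    // Evaluated at most once, and only if at least one location covers the window.
    lazy val fullWindowResult = analyze(window, window)

    val shared = full.map(loc => loc.id -> fullWindowResult)
    val individual = partial.map(loc => loc.id -> analyze(window, loc.geom))
    (shared ++ individual).toMap
  }
}
```

With one window and three locations, two of which cover the window completely, the stand-in analysis runs twice rather than three times: once for the shared full-window result and once for the location that only partly overlaps.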

I believe that the extra check to see if a geometry completely covers a window should be inexpensive when it doesn't apply, so this optimization shouldn't noticeably increase the cost for very small jobs.