wri / gfw_forest_loss_geotrellis

Global Tree Cover Loss Analysis using Geotrellis and SPARK
MIT License

GTC-2824 Use raster GADM layers for pro dashboard for small # of features #246

Closed danscales closed 2 months ago

danscales commented 3 months ago

GFWPro Dashboard currently reads in the entire vector GADM file (1.9GB) in order to do a spatial join with the user input features to determine the GADM areas in the input locations and dissolved lists.

Both reading the GADM file and doing the spatial join can take many minutes, even for a small number of input features. In that case, we instead use the raster GADM layers to determine the relevant GADM areas for the locations and dissolved lists. Because we lazily read tiles for the GFWPro dashboard (see following commit), we don't read any of the GADM rasters when we are using the vector GADM.
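
A minimal sketch of the selection described above, not the code in this PR. Every name here (`GadmSourceSketch`, `chooseSource`, `Layers`, `SmallFeatureThreshold`) is hypothetical, and the cutoff of 50 features is an assumption, since the description does not state the actual threshold. The `lazy val` fields stand in for lazily read tiles: a layer is loaded only if the chosen path actually touches it.

```scala
// Hypothetical illustration; none of these names come from the repository.
object GadmSourceSketch {
  sealed trait GadmSource
  case object RasterGadm extends GadmSource
  case object VectorGadm extends GadmSource

  // Assumed cutoff for "small number of features"; the real value is not given in the PR.
  val SmallFeatureThreshold: Int = 50

  // Small inputs use the raster GADM layers; larger inputs keep the vector spatial join.
  def chooseSource(featureCount: Int, threshold: Int = SmallFeatureThreshold): GadmSource =
    if (featureCount <= threshold) RasterGadm else VectorGadm

  // Stand-in for the dashboard's layers. Each loader runs at most once,
  // and only when its field is first accessed.
  final class Layers(loadRaster: () => String, loadVector: () => String) {
    lazy val gadmRaster: String = loadRaster()
    lazy val gadmVector: String = loadVector()
  }

  // Touches only the layer that matches the chosen source.
  def gadmData(featureCount: Int, layers: Layers): String =
    chooseSource(featureCount) match {
      case RasterGadm => layers.gadmRaster
      case VectorGadm => layers.gadmVector
    }
}
```

With this shape, a job that takes the vector path never triggers the raster loader, and a small job never pays for reading the 1.9GB vector file.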

In performance runs, I have seen current batch jobs with 20-50 features that run in 6-9 minutes take only about 1 minute with the same batch configuration when using the raster GADM approach. This should help reduce the cost of the hundreds of batch jobs that update the dashboards for all the smaller customer lists every Saturday night.