project-icp / bee-pollinator-app

The web application front end for the ICP Pollinator Decision Support Tool 🐝
Apache License 2.0

Generate VRTs for updated rasters in staging #586

Closed mmcfarland closed 4 years ago

mmcfarland commented 4 years ago

UMN provided 490 rasters (48 states, 5 index rasters each). They are available in the Azavea Dropbox account (see LastPass). Generate the 5km and 3km VRTs that reference these files using the general instructions here, then upload them to the beekeepers-staging-data-us-east-1 bucket.

Given these files are ~150GB and ultimately need to go from Dropbox to S3 storage, this should be performed on an ad-hoc EC2 instance in the ICP AWS account. The gdalbuildvrt command only reads metadata, so a micro or small instance should be sufficient.

  1. Launch an appropriately configured EC2 instance with enough block storage for the files
  2. Download the rasters from Dropbox (see https://github.com/dropbox/dbxcli)
  3. Install GDAL and generate VRT
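
As a rough sketch of step 3, the snippet below assembles and runs a `gdalbuildvrt` call for one group of rasters. The directory layout, the glob pattern, and the output VRT name are all placeholders (assumptions for illustration); the real names must follow the conventions the application already expects.

```python
import shutil
import subprocess
from pathlib import Path


def build_vrt_command(vrt_path, tif_paths):
    """Return the gdalbuildvrt argument list for a single VRT."""
    if not tif_paths:
        raise ValueError("no source rasters given")
    # Sort so repeated runs list the sources in the same order.
    return ["gdalbuildvrt", str(vrt_path)] + sorted(str(p) for p in tif_paths)


def generate_vrt(raster_dir, pattern, vrt_name):
    """Build one VRT from every raster in raster_dir matching pattern.

    `pattern` and `vrt_name` are hypothetical; substitute the real
    naming scheme used by the app.
    """
    raster_dir = Path(raster_dir)
    tifs = list(raster_dir.glob(pattern))
    cmd = build_vrt_command(raster_dir / vrt_name, tifs)
    if shutil.which("gdalbuildvrt") is None:
        raise RuntimeError("gdalbuildvrt not found; install GDAL first")
    subprocess.run(cmd, check=True)
    return cmd
```

Writing the VRT next to its source rasters, as done here, keeps the sources and the VRT in one directory, which is convenient when the whole set is later copied to the bucket.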

While generating VRTs, we'll want to avoid changing source code, so continue to use the file naming conventions defined here:

https://github.com/project-icp/bee-pollinator-app/blob/3b207d0b2fad78993e2a21b263c3f4aebe1b884b/src/icp/apps/beekeepers/views.py#L112-L117

My suggested strategy, which allows cutover and rollback in case of problems, is roughly:

  1. Script the filename change. This could mean appending a v2 (or similar) to each tif file and then generating a VRT against the new names, or testing whether relative paths work if we copy the source rasters into a subdirectory v2. Testing can be done via the dev environment, which can be altered to point to a test bucket or test bucket directories.
  2. Back up the existing VRT files locally and keep the existing source rasters in the bucket
  3. Upload new rasters. They won't overwrite existing files because of step 1.
  4. Upload new VRTs, which overwrite the previous VRTs and point to the new data
  5. Confirm the new data works, notify the client for evaluation
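
The two renaming options in step 1 could be scripted along these lines. This is only a sketch: the `v2` tag comes from the suggestion above, while the example filenames and the collision check are illustrative assumptions rather than the project's actual layout.

```python
from pathlib import PurePosixPath


def versioned_name(filename, tag="v2"):
    """Option A: append the tag to the file stem, e.g. x.tif -> x_v2.tif."""
    p = PurePosixPath(filename)
    return str(p.with_name(f"{p.stem}_{tag}{p.suffix}"))


def versioned_subdir(filename, tag="v2"):
    """Option B: keep the filename but place it in a tag subdirectory."""
    p = PurePosixPath(filename)
    return str(p.parent / tag / p.name)


def check_no_collisions(existing_keys, new_keys):
    """Raise if any new key would overwrite a key already in the bucket."""
    clashes = set(existing_keys) & set(new_keys)
    if clashes:
        raise ValueError(f"would overwrite: {sorted(clashes)}")


# Hypothetical filenames, for illustration only.
old = ["AL_index.tif", "AK_index.tif"]
check_no_collisions(old, [versioned_name(f) for f in old])
check_no_collisions(old, [versioned_subdir(f) for f in old])
```

Running the collision check against a listing of the bucket before uploading gives a cheap confirmation that step 3 cannot touch the rasters the old VRTs still depend on.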

If there are significant problems, we can roll back to the old VRTs and the application can function as normal.

When there is sign-off, the cleanup tasks are:

mmcfarland commented 4 years ago

Additional thoughts or notes