smnorris / bcfishpass

Model and monitor aquatic habitat connectivity in BC. Tools to plan and prioritize the assessment and remediation of barriers.
https://smnorris.github.io/bcfishpass
Apache License 2.0

scheduled data load workflows / source data cache? #467

Open smnorris opened 5 months ago

smnorris commented 5 months ago

Current workflows should be fine for replicating to production env.

For testing, lots of options:

a. run the same scheduled loads on test as on prod
b. do not run scheduled loads; test on old data
c. run scheduled loads quarterly or similar
d. do not run scheduled loads from sources; replicate from prod on a schedule
e. do not run scheduled loads to test; access the latest data in prod via FDW

It seems easiest to start with option b, then move to d/e at a later date. The only caveat is that CABD and PSCIS should be refreshed on test whenever bcfishpass is run.
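The option b plus caveat above could be expressed as a simple per-source refresh policy. This is an illustrative sketch only; the source names and environment labels are assumptions, not the actual bcfishpass configuration:

```python
# Sketch of option (b) with the noted caveat: on the test environment,
# most sources use a static snapshot, but CABD and PSCIS are refreshed
# whenever bcfishpass runs. Source names here are illustrative.

ALWAYS_REFRESH_ON_TEST = {"cabd", "pscis"}

def needs_refresh(source: str, env: str) -> bool:
    """Return True if a source should be re-loaded before a model run."""
    if env == "prod":
        return True  # prod keeps its existing scheduled loads
    return source.lower() in ALWAYS_REFRESH_ON_TEST

# example: only CABD and PSCIS get refreshed on test
sources = ["cabd", "pscis", "fwa", "bcgw_roads"]
to_load = [s for s in sources if needs_refresh(s, "test")]
```

On prod every source refreshes as before; on test the snapshot data stays put except for the two sources the model run depends on.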

smnorris commented 1 month ago

With (at least) four different databases to replicate to, an intermediate cache of key inputs would reduce load on WFS and make data refreshes generally much faster. BC now provides an object storage bucket; we could create a workflow/job that dumps these layers to file, to be picked up as needed by the scheduled loads to the database.
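A minimal sketch of what the cache job might compute for each layer: a standard WFS 2.0 GetFeature request and a dated object-storage key for the dump. The endpoint URL and layer name below are assumptions for illustration, not the actual bcfishpass sources:

```python
# Hedged sketch of the proposed cache job: for each key layer, build a
# WFS GetFeature request and the object-storage key where the dump
# would land. Endpoint and layer names are illustrative assumptions.
from urllib.parse import urlencode

WFS_ENDPOINT = "https://openmaps.gov.bc.ca/geo/pub/wfs"  # assumed endpoint

def getfeature_url(layer: str) -> str:
    """Standard WFS 2.0 GetFeature request for one layer, as GeoJSON."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": layer,
        "outputFormat": "json",
    }
    return f"{WFS_ENDPOINT}?{urlencode(params)}"

def cache_key(layer: str, snapshot_date: str) -> str:
    """Object-storage key for a dated dump of one layer."""
    return f"wfs-cache/{snapshot_date}/{layer.replace(':', '_')}.geojson"

url = getfeature_url("WHSE_BASEMAPPING.FWA_STREAM_NETWORKS_SP")
key = cache_key("WHSE_BASEMAPPING.FWA_STREAM_NETWORKS_SP", "2024-01-01")
```

Dating the keys keeps each scheduled database load pinned to a known snapshot, so all four databases can pull the same file rather than each hitting WFS.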

In theory, we could get even more efficient: access model outputs dumped to file from the BC db could be fed into the CWF dbs rather than re-processed there.
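The reuse idea above amounts to a freshness check on each consumer side: only import the published output when it is newer than the local copy, otherwise skip. A trivial sketch under that assumption (dates and the function name are illustrative):

```python
# Sketch of the reuse idea: a consuming db imports the published model
# output dump only when it is newer than its local copy, instead of
# re-running the model itself. Dates here are illustrative.
from datetime import date
from typing import Optional

def should_import(local_output: Optional[date], published_output: date) -> bool:
    """Import the published dump if we have none, or ours is older."""
    return local_output is None or published_output > local_output

# example: local copy from January, BC db published an April refresh
stale = should_import(date(2024, 1, 1), date(2024, 4, 1))
fresh = should_import(date(2024, 4, 1), date(2024, 4, 1))
```

The same check works whether the consumer pulls from object storage or reads the prod table over FDW.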