jesteria closed 2 years ago
ETL is now documented here.
This work was begun in 2020, with the intention of using AWS Batch to execute ETL with the existing Docker image. Provisioning of the periodic job(s) would be controlled by manage command(s).
See: https://github.com/dssg/appy-reviews/compare/jsl/auto-etl
manage command(s) could:
(./manage.py ...) – (i.e. this logic should remain in the argcmdr manage.py, and the Batch job should likely be left generic).
but, at least when run in the cloud, should likely be decorated with chamber (i.e. params/secrets should be moved to AWS Parameter Store), and with the following for reporting:
    slack-report --channel appy \
      /usr/bin/time \
      annotate-output \
      ./manage.py "$@"
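As a rough sketch of the decoration described above, the wrapped command line could be assembled in Python, for example from within a manage command. The reporting prefix is copied from the command above; everything else is an assumption made for illustration: the function name `decorate`, the chamber service name `appy`, the choice to put `chamber exec` outermost, and the `etl` subcommand in the usage note are all hypothetical.

```python
import shlex

# Reporting wrappers, copied from the command above.
REPORTING_PREFIX = [
    "slack-report", "--channel", "appy",
    "/usr/bin/time",
    "annotate-output",
]


def decorate(manage_args, in_cloud=True, chamber_service="appy"):
    """Build the argv for a manage.py invocation.

    Locally the command is returned undecorated. In the cloud it is
    wrapped with the reporting prefix and with `chamber exec`, so that
    params/secrets are read from AWS Parameter Store.
    """
    command = ["./manage.py", *manage_args]
    if not in_cloud:
        return command

    command = REPORTING_PREFIX + command

    # ASSUMPTION: chamber is the outermost wrapper and the service is
    # named "appy"; neither detail is specified in this issue.
    return ["chamber", "exec", chamber_service, "--"] + command


def as_shell(argv):
    """Render an argv list as a single, safely quoted shell string."""
    return shlex.join(argv)
```

For instance, `as_shell(decorate(["etl"]))` would render the full wrapped command as one string, while `decorate(["etl"], in_cloud=False)` leaves the plain `./manage.py` invocation untouched. Keeping this in the manage command rather than in the job definition is one way to leave the Batch job generic.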
ETL should be run periodically by an automated process to (re)load data into the database.
In 2019, ETL was run during the application stage from the management CLI, with variations of the following:
…and in the review stage:
Note, however, that Appy does not ship with the repo's manage.py. The above CLI commands wrap and document the applicable underlying commands (run through docker run --rm --user webapp -eDATABASE_URL -eWUFOO_API_KEY appyreviews_web):
and: