moj-analytical-services / pydbtools

Python version of dbtools
https://moj-analytical-services.github.io/pydbtools/

CCDE-300: Add CRON / AIRFLOW job to remove temp databases #53

Closed by isichei 1 year ago

isichei commented 2 years ago

On our platform, when pydbtools creates a temp database, it names the Glue schema with the prefix mojap_de_temp_, as in mojap_de_temp_<ts>. Currently these are not cleaned up.

You should write a Python script using AWS Wrangler that gets the database names and filters them by the prefix mojap_de_temp. Any database whose TS is more than 24 hours older than the script's run TS is then deleted. Then schedule the script on Airflow as a daily run, at 4am or so.
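A rough sketch of what such a script might look like. The format of <ts> is not spelled out here, so instead of parsing it from the name, this version assumes the Glue `CreateTime` field returned for each database is an acceptable stand-in for the timestamp. The names `select_expired`, `clean_temp_databases`, `TEMP_PREFIX` and `MAX_AGE` are made up for illustration, and this is not necessarily how the linked repo does it.

```python
from datetime import datetime, timedelta, timezone

# Prefix and age limit taken from the description above.
TEMP_PREFIX = "mojap_de_temp"
MAX_AGE = timedelta(hours=24)


def select_expired(databases, now, prefix=TEMP_PREFIX, max_age=MAX_AGE):
    """Return names of prefixed databases created more than max_age before now.

    `databases` is an iterable of dicts shaped like Glue database entries,
    with a "Name" and (assumed) a timezone-aware "CreateTime".
    """
    expired = []
    for db in databases:
        name = db.get("Name", "")
        created = db.get("CreateTime")
        # Skip anything that is not a temp database or has no usable timestamp.
        if not name.startswith(prefix) or created is None:
            continue
        if now - created > max_age:
            expired.append(name)
    return expired


def clean_temp_databases():
    """Delete expired temp databases from the Glue catalog."""
    # Imported here so select_expired can be exercised without AWS access.
    import awswrangler as wr

    now = datetime.now(timezone.utc)
    for name in select_expired(wr.catalog.get_databases(), now):
        wr.catalog.delete_database(name=name)
```

An Airflow task could then call `clean_temp_databases()` on a daily schedule; the cron expression for 4am every day is `0 4 * * *`. Keeping the filtering in a plain function means the 24 hour rule can be checked locally with hand-built dicts before anything is deleted.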

mratford commented 2 years ago

https://github.com/moj-analytical-services/airflow-clean-temp-dbs/pull/1