devinit / ddw-analyst-ui

The Development Data Warehouse
http://ddw.devinit.org/
GNU General Public License v3.0

Refactor iati_transactions code into a general-purpose reload function that could create multiple tables #772

Open akmiller01 opened 2 years ago

akmiller01 commented 2 years ago

At present we have the following data update scripts:

  1. iati.sh
  2. iati_datastore.sh
  3. iati_registry_refresh.sh
  4. iati_transactions.sh
  5. iati_transactions_retry.sh

We should clean up, simplify, and refactor these scripts such that they:

  1. Run Python/iati_refresh.py, marking IATI datasets as either new, modified, or stale in the iati_registry_metadata table.
  2. Run a new script based on Python/iati_transactions.py that can modularly update multiple IATI-based tables as new data is loaded (in IATI nomenclature, this would be called iati_reload.py).
  3. Run auxiliary scripts such as Python/iati_rhfp.py which rely on the entire data structure and don't need to be run strictly on new/modified files.
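
As a rough illustration of step 1, the marking logic could look something like the sketch below. Everything here is an assumption rather than a description of what Python/iati_refresh.py does: the `classify_datasets` function, the `Status` enum, and the use of content hashes are hypothetical. It also assumes "stale" means a dataset that was recorded previously but no longer appears in the registry, and it adds an "unchanged" label for datasets whose hash has not moved.

```python
from enum import Enum


class Status(Enum):
    NEW = "new"
    MODIFIED = "modified"
    STALE = "stale"          # assumption: previously seen, now missing
    UNCHANGED = "unchanged"  # assumption: not one of the three states in the issue


def classify_datasets(previous, current):
    """Compare two {dataset_id: content_hash} mappings and label each dataset.

    `previous` stands in for what is already stored in the metadata table,
    `current` for what the registry reports now. Both are hypothetical inputs.
    """
    statuses = {}
    for dataset_id, content_hash in current.items():
        if dataset_id not in previous:
            statuses[dataset_id] = Status.NEW
        elif previous[dataset_id] != content_hash:
            statuses[dataset_id] = Status.MODIFIED
        else:
            statuses[dataset_id] = Status.UNCHANGED
    # Anything recorded before but absent now is treated as stale.
    for dataset_id in previous:
        if dataset_id not in current:
            statuses[dataset_id] = Status.STALE
    return statuses
```

The point of keeping the status as data, rather than acting on it immediately, is that later steps can still see which datasets were new or modified.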

The idea is that, instead of running iati_transactions.sh once, at which point the information about whether a dataset is new is destroyed, any number of tables can be built progressively during the reload process, in the same way that iati_transactions.py builds its table now.
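
One possible shape for such a general-purpose reload is a small registry of per-table builder functions, all run while the new/modified flag is still available. This is only a minimal sketch under assumptions: `table_builder`, `reload_tables`, the `write_rows` callback, the dataset dictionaries, and the `example_budgets` table are all hypothetical names for illustration and are not taken from the existing scripts.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical registry mapping a table name to the function that builds its rows.
TABLE_BUILDERS: Dict[str, Callable[[dict], List[dict]]] = {}


def table_builder(table_name: str):
    """Decorator that registers a builder for one output table."""
    def decorator(func):
        TABLE_BUILDERS[table_name] = func
        return func
    return decorator


@table_builder("iati_transactions")
def build_transactions(dataset: dict) -> List[dict]:
    # Keep only the records flagged as transactions (assumed record layout).
    return [
        {"dataset_id": dataset["id"], "value": record["value"]}
        for record in dataset["records"]
        if record.get("type") == "transaction"
    ]


@table_builder("example_budgets")  # hypothetical second table
def build_budgets(dataset: dict) -> List[dict]:
    return [
        {"dataset_id": dataset["id"], "value": record["value"]}
        for record in dataset["records"]
        if record.get("type") == "budget"
    ]


def reload_tables(datasets: Iterable[dict], write_rows: Callable) -> int:
    """Run every registered builder over each new or modified dataset.

    `write_rows(table_name, dataset_id, rows)` is a caller-supplied callback
    that is expected to replace that dataset's rows in the given table.
    Returns the number of datasets processed.
    """
    processed = 0
    for dataset in datasets:
        if dataset["status"] not in ("new", "modified"):
            continue  # skip datasets that do not need reloading
        for table_name, builder in TABLE_BUILDERS.items():
            write_rows(table_name, dataset["id"], builder(dataset))
        processed += 1
    return processed
```

With this layout, adding another IATI-based table would mean registering one more builder function, and the dataset's status is only cleared once every builder has run.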

wakibi commented 2 years ago

Will depend on #773

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.