Open charlie-costanzo opened 8 months ago
Larger Job overview: Break up jobs into buckets:
Harder
Breaking up the tasks is not a big deal -> create new transform models and modify the daily DAG so that the single whole-project run turns into multiple transform tasks -> these then run sequentially rather than simultaneously
After Littlepay's recent adjustment to their publishing cadence to better suit our analytics needs, we found that the new publishing time was too late for our `transform_warehouse` DAG start time and was making data stale. In #3290, we moved the `transform_warehouse` DAG start time 4 hours later (from 10:00 to 14:00 UTC) to improve data freshness, but this makes all data transformations happen later in the morning, which is not ideal.

We need to break apart the `transform_warehouse` DAG so that models that need to run later in the morning (payments) run at 14:00 UTC, and all other models run at the previous time (10:00 UTC).

A notes doc for an initial meeting about this effort is available here, but the project was deprioritized in favor of handoff tasks following that first meeting.
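
As a rough illustration of the split described above, here is a minimal, standalone Python sketch that buckets models into an early (10:00 UTC) group and a late (14:00 UTC) group. The model names, the `payments` tag, and the `transform_warehouse_payments` DAG id are hypothetical placeholders, not names from this repo, and the sketch deliberately avoids any scheduler API; it only shows the partitioning logic.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical model names and tags, for illustration only.
MODELS = {
    "fct_payments_rides": {"payments"},
    "fct_payments_settlements": {"payments"},
    "fct_daily_schedule_feeds": {"gtfs"},
    "dim_organizations": set(),
}


@dataclass
class TransformDag:
    dag_id: str
    cron: str  # cron expression, interpreted as UTC
    models: list[str] = field(default_factory=list)


def split_models(models: dict[str, set[str]], late_tag: str = "payments"):
    """Assign models carrying `late_tag` to the 14:00 UTC DAG, the rest to 10:00 UTC."""
    early = TransformDag("transform_warehouse", "0 10 * * *")
    # Assumed id for the new, later DAG; the real name is still to be decided.
    late = TransformDag("transform_warehouse_payments", "0 14 * * *")
    for name, tags in sorted(models.items()):
        target = late if late_tag in tags else early
        target.models.append(name)
    return early, late


early, late = split_models(MODELS)
for dag in (early, late):
    print(dag.dag_id, dag.cron, dag.models)
```

One open question this sketch leaves out is dependencies: if a payments model depends on a model in the early group, the late DAG would need to either wait on the early run or re-run those upstream models.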