USF-IMARS / imars-etl

:cloud: Tools for `extract` and `load` for IMaRS ETL (Extract, Transform, Load) operations

"drivers" as airflow hooks? #16

Closed (7yl4r closed this 5 years ago)

7yl4r commented 6 years ago

I have started splitting things into "drivers" here. These could go even further by being implemented as airflow "hooks". This might provide additional airflow features and should help guide and speed up development of more drivers.

7yl4r commented 6 years ago

Actually, airflow hooks don't break out of airflow easily since they are tied to Connections.

Still, I wonder if it would be better for imars_dags to connect to the imars-etl backend(s) through the hooks instead of through this little package... Probably not worth maintaining two interfaces to the backend(s), I guess, i.e. one in airflow and one here for manually loading/extracting data via bash/python.

7yl4r commented 6 years ago

Actually, this would be doable and potentially beneficial (a long time from now):

  1. Add airflow as a dependency here.
  2. Use airflow hooks to replace the "drivers". Login details are still hardcoded here.
  3. Allow the API to accept Connection object(s) in place of the hardcoded details.
  4. Modify imars_dags to use the connection param implemented in (3).
  5. Other hooks can now be leveraged to replace/replicate extract/load and metadata queries.

So... I'll open it back up I guess, but it is going waaaaaaay back in the queue. Seems like a lot of work for a hypothetical gain.
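To make steps (2) and (3) a bit more concrete, here is a rough sketch of the "hardcoded default, optional passed-in connection" idea. It does not import airflow: `ConnectionInfo` is a hypothetical stand-in that only mirrors a few attributes an airflow Connection carries (host, login, password, schema), and `resolve_connection`, `extract`, and all the values shown are made up for illustration, not the actual imars-etl API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionInfo:
    """Hypothetical stand-in for an airflow Connection object."""
    host: str
    login: str
    password: str
    schema: str


# Step (2): login details are still hardcoded in the package.
# These are placeholder values, not real credentials.
DEFAULT_METADATA_CONN = ConnectionInfo(
    host="localhost", login="etl_user", password="changeme", schema="metadata_db"
)


def resolve_connection(conn: Optional[ConnectionInfo] = None) -> ConnectionInfo:
    """Step (3): prefer a caller-supplied connection, else use the default."""
    return conn if conn is not None else DEFAULT_METADATA_CONN


def extract(sql: str, conn: Optional[ConnectionInfo] = None) -> dict:
    """Hypothetical entry point: reports what it would query instead of querying."""
    target = resolve_connection(conn)
    return {"host": target.host, "schema": target.schema, "sql": sql}


# Manual bash/python use: nothing passed in, so the hardcoded default applies.
manual = extract("SELECT 1")

# Step (4): a caller such as a DAG passes its own connection, e.g. for a test instance.
test_conn = ConnectionInfo(host="airflow-test", login="u", password="p", schema="test_db")
from_dag = extract("SELECT 1", conn=test_conn)
```

The point of the optional parameter is that manual use keeps working unchanged while airflow-side callers can point the same code at a different backend.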

7yl4r commented 5 years ago

I think implementation of this would solve (4) from USF-IMARS/imars_dags#85 (airflow-test instance extract/load separation).

7yl4r commented 5 years ago

This is done on master. Not yet merged into prod because I want to double-check airflow connections are all set up.