njfritter / poc-data-pipelines

Proof-of-Concept (POC) Data Pipelines for various use cases such as data streaming/ingestion, batch data processing, orchestration and storage. Includes technologies such as Apache Airflow, Apache Spark, Apache Kafka, AWS, Python and more

Choose Orchestration Engine + Add Batch ETL jobs via DBT #15

Open njfritter opened 7 months ago

njfritter commented 7 months ago

Choose an orchestration tool to implement batch jobs. Will most likely be Airflow, but could also consider other options like Prefect or Databricks.

Use dbt to implement Batch ETL jobs, including pre-aggregation of historical Coinbase data.
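
Rough sketch of what the pre-aggregation step could compute, written in plain Python only to pin down the logic (the real job would presumably be a dbt SQL model). Everything here is an assumption and not taken from the repo: the function name `daily_candles`, the trade fields `product_id`, `time`, `price`, `size`, and the choice of one OHLCV row per product per UTC day.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_candles(trades):
    """Roll raw trades up into one OHLCV row per (product, UTC day).

    Assumed (hypothetical) input: dicts with `product_id`, `time` as an
    ISO-8601 string carrying an explicit UTC offset, `price`, and `size`.
    """
    buckets = defaultdict(list)
    for trade in trades:
        # Normalize to UTC so the day boundary is the same for every row.
        ts = datetime.fromisoformat(trade["time"]).astimezone(timezone.utc)
        key = (trade["product_id"], ts.date())
        buckets[key].append((ts, trade["price"], trade["size"]))

    rows = []
    for (product_id, day), items in sorted(buckets.items()):
        # Order trades within the day so open/close are first/last by time.
        items.sort(key=lambda item: item[0])
        prices = [price for _, price, _ in items]
        rows.append(
            {
                "product_id": product_id,
                "day": day.isoformat(),
                "open": prices[0],
                "high": max(prices),
                "low": min(prices),
                "close": prices[-1],
                "volume": sum(size for _, _, size in items),
                "trade_count": len(items),
            }
        )
    return rows
```

Whichever orchestrator gets picked would only need to schedule this roll-up once per day, and keying on (product, day) means a rerun recomputes the same rows instead of appending duplicates.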

njfritter commented 7 months ago

Airflow resources: