greatexpectationslabs / ge_tutorials

Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow.

mixed concerns and dependency management issues #2

Open georgim0 opened 4 years ago

georgim0 commented 4 years ago

Hi there,

Excited to see this tutorial, as it's something we've been struggling with in the past.

I've tried mixing Airflow, dbt, and GE in the past. This approach has two issues:

Here's the approach we've taken:

Could you validate our approach please? Is GE used the way it was designed to?

eugmandel commented 4 years ago

The approach sounds good to me. You should create one GE DataContext (project), probably in the repo with the dbt models. Then the Airflow task that invokes validation loads the project's config from the models repo. You can parameterize the credentials of the database (your datasource) using environment variables.
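A minimal sketch of the credential parameterization suggested above: Great Expectations substitutes `${VAR}` placeholders in `great_expectations.yml` from environment variables when the DataContext loads its config. The datasource name, driver, and variable names below are assumptions for illustration, not part of the tutorial.

```yaml
# great_expectations.yml, kept in the dbt models repo alongside the models
datasources:
  warehouse:  # hypothetical datasource name
    class_name: SqlAlchemyDatasource
    credentials:
      drivername: postgresql
      host: ${DB_HOST}        # substituted from the environment at load time
      port: ${DB_PORT}
      username: ${DB_USER}
      password: ${DB_PASSWORD}
      database: ${DB_NAME}
```

With this in place, the Airflow task only needs the path to the project (e.g. constructing a `DataContext` with `context_root_dir` pointing at the `great_expectations/` directory in the models repo), and the actual credentials never land in the repository.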