greatexpectationslabs / ge_tutorials

Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow.

Dockerization for easy setup #3

Open nehiljain opened 4 years ago

nehiljain commented 4 years ago

I think it would be a good idea to dockerize the setup, which would make it easier to get started and run the repo. Right now it requires a number of different steps and configs to be set up, all of which could be handled inside a docker-compose file. If this is something that is of interest to you, I can work on it.

My idea is to run the repo in one Docker container and a Postgres database in another container, which Airflow, GE, and dbt can connect to at a static address.
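The two-container layout described above could be sketched roughly as follows (service names, image tags, and credentials here are placeholders for illustration, not taken from the repo):

```yaml
# docker-compose.yml (sketch)
version: "3"
services:
  postgres:
    image: postgres:12
    environment:
      POSTGRES_USER: airflow       # placeholder credentials
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: tutorial
    ports:
      - "5432:5432"
  pipeline:
    build: .                       # image with Airflow, dbt, and GE installed
    depends_on:
      - postgres
    environment:
      # containers on the same compose network can reach Postgres
      # at the stable hostname "postgres" (the service name)
      DB_HOST: postgres
```

With this, Airflow, dbt, and GE inside the `pipeline` container can all point at `postgres:5432` as the static address.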

eugmandel commented 4 years ago

@nehiljain I love the idea! If the goal is to give the users the absolutely easiest way to run the example, would it be better to package everything in one image - Airflow and Postgres?

Alternatively, a Docker Compose setup that would create two containers?

nehiljain commented 4 years ago

It's not best practice to do it all in one image. Also, I see this repo as a starting point for a lot of people to set up their first DAG with Airflow, dbt, and GE, so I want to make sure we give them all the required tools to build on top of it.

nehiljain commented 4 years ago

https://github.com/superconductive/ge_tutorials/pull/4 This is my first attempt at it. Please provide some feedback regarding the config variables required for GE to run, specifically https://github.com/superconductive/ge_tutorials/pull/4/files#diff-8a35445100c39c9076dc658c650943eeR16 this file. Is there a better way to do it?
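On the GE config question, one common pattern (a sketch, not the repo's actual file) is to keep the connection details in `config_variables.yml` and substitute environment variables that docker-compose sets on the container, using GE's `${VAR}` substitution syntax:

```yaml
# config_variables.yml (sketch) -- variable names here are
# placeholders that would be defined in docker-compose
postgres_credentials:
  drivername: postgresql
  host: ${DB_HOST}        # e.g. the "postgres" service name
  port: 5432
  username: ${DB_USER}
  password: ${DB_PASSWORD}
  database: ${DB_NAME}
```

This keeps credentials out of the image and lets the same GE config work unchanged inside or outside Docker.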