greatexpectationslabs / ge_tutorials

Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow.

Unable to launch Airflow DBT tutorial using Docker #27

Open cbuffett opened 2 years ago

Out of the box, there appear to be several issues with running the Airflow DBT examples using Docker.

  1. The 1.0.x releases of dbt should be installed using pip install dbt-core or pip install dbt-<connector>, e.g., pip install dbt-postgres. This applies to both the Dockerfile and requirements.txt.
  2. Dependency resolution complains about a few packages, which ultimately prevents Airflow from starting up. I've been able to resolve this by pinning dbt-postgres<1.0.0, wtforms==2.3.3, and werkzeug<1.0.0 in requirements.txt; these pins appear to be required by Airflow 1.10.9.
  3. The dbt_project.yml is missing the config-version: 2 setting, which prevents the DAGs from executing.
  4. There is a typo (a stray backtick after "owner":) on line 21 of airflow/ge_tutorials_dag_with_great_expectations.py, though it looks like https://github.com/superconductive/ge_tutorials/pull/16 addresses this:
    
    webserver_1     | Traceback (most recent call last):
    webserver_1     |   File "/usr/local/lib/python3.7/site-packages/airflow/models/dagbag.py", line 243, in process_file
    webserver_1     |     m = imp.load_source(mod_name, filepath)
    webserver_1     |   File "/usr/local/lib/python3.7/imp.py", line 171, in load_source
    webserver_1     |     module = _load(spec)
    webserver_1     |   File "<frozen importlib._bootstrap>", line 696, in _load
    webserver_1     |   File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
    webserver_1     |   File "<frozen importlib._bootstrap_external>", line 724, in exec_module
    webserver_1     |   File "<frozen importlib._bootstrap_external>", line 860, in get_code
    webserver_1     |   File "<frozen importlib._bootstrap_external>", line 791, in source_to_code
    webserver_1     |   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
    webserver_1     |   File "/usr/local/airflow/dags/ge_tutorials_dag_with_great_expectations.py", line 21
    webserver_1     |     "owner":` "Airflow",
    webserver_1     |             ^
    webserver_1     | SyntaxError: invalid syntax
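
For reference, the traceback points at a stray backtick after "owner": on line 21 of the DAG file. Below is a minimal sketch of the broken and corrected line, along with a small helper that checks whether a piece of DAG source parses before Airflow tries to load it. The helper name syntax_error_line and the surrounding default_args dictionary are assumptions made for illustration; only the "owner" line itself comes from the traceback.

```python
import ast

# The "owner" line as reported in the traceback, with the stray backtick.
# The enclosing default_args dict is assumed for illustration.
BROKEN = 'default_args = {\n    "owner":` "Airflow",\n}\n'

# The same line with the backtick removed.
FIXED = 'default_args = {\n    "owner": "Airflow",\n}\n'


def syntax_error_line(source):
    """Return the line number of the first syntax error, or None if the source parses."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return exc.lineno
    return None


if __name__ == "__main__":
    print("broken source error line:", syntax_error_line(BROKEN))
    print("fixed source error line:", syntax_error_line(FIXED))
```

Running a check like this against the DAG file's contents before starting the webserver container would surface the same error without having to dig through the Docker logs.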