greatexpectationslabs / ge_tutorials

Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow.

Docker dag run ge error - invalid config version (1.0) #7

Open torsjonas opened 4 years ago

torsjonas commented 4 years ago

Thanks for the Docker tutorial, it's very convenient! However, the Dockerfile seems to be out of date: the first task of the DAG with GE fails with this error:

[2020-06-27 11:21:35,173] {{logging_mixin.py:112}} INFO - Running %s on host %s <TaskInstance: ge_tutorials_dag_with_ge.task_validate_source_data 2020-06-27T11:21:27.511382+00:00 [running]> 7617b740a5b3
[2020-06-27 11:21:35,247] {{taskinstance.py:1128}} ERROR - You appear to have an invalid config version (1.0).
    The version number must be at least 2. Please see the migration guide at https://docs.greatexpectations.io/en/latest/how_to_guides/migrating_versions.html

Shinnnyshinshin commented 4 years ago

Yes, you're right @torsjonas, it looks like the Dockerfile needs to be updated to the latest version of GE. Could you manually upgrade to the latest version (0.11.6 as of this message) and see if the issue is resolved?

We will be upgrading the Dockerfile as well.

torsjonas commented 4 years ago

Sorry for the late reply, I've been away for a while. I aim to do this after the weekend.

nehiljain commented 4 years ago

I am having the same issue. I think GE changed the required config version, which caused this bug. @Shinnnyshinshin I think the GE installed in the Dockerfile is already the latest version (https://github.com/superconductive/ge_tutorials/blob/master/Dockerfile#L85), so that shouldn't be the main issue. I think the problem comes from https://github.com/superconductive/ge_tutorials/blob/master/great_expectations_projects/final/great_expectations/great_expectations.yml#L9
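
For anyone who wants to check a project file before running the DAG, here is a minimal sketch of the comparison the error message describes. It assumes the linked line holds a top-level `config_version` key (an assumption based on the error text, not verified against the file), and the helper names `read_config_version` and `needs_migration` are hypothetical, not part of GE:

```python
import re
from typing import Optional

# Taken from the error message: "The version number must be at least 2."
MIN_CONFIG_VERSION = 2.0


def read_config_version(yaml_text: str) -> Optional[float]:
    """Return the top-level config_version value, or None if the key is absent.

    Uses a regex instead of a YAML parser so the sketch has no dependencies;
    it only handles a plain numeric value, optionally followed by a comment.
    """
    match = re.search(
        r"^config_version:\s*([0-9]+(?:\.[0-9]+)?)\s*(?:#.*)?$",
        yaml_text,
        flags=re.MULTILINE,
    )
    return float(match.group(1)) if match else None


def needs_migration(yaml_text: str) -> bool:
    """True if the config version is missing or below the minimum."""
    version = read_config_version(yaml_text)
    return version is None or version < MIN_CONFIG_VERSION


if __name__ == "__main__":
    # Hypothetical excerpt; in practice, read great_expectations.yml from disk.
    sample = "# hypothetical excerpt\nconfig_version: 1.0\n"
    print("needs migration:", needs_migration(sample))
```

This only flags the mismatch; the actual migration steps are in the guide linked from the error message.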

torsjonas commented 4 years ago

Indeed, there was a release of great_expectations that requires a newer config version. As a temporary fix until the config file in this repo has been migrated to version 2.0, I managed to get it working with Docker by:

  1. Restricting the great_expectations version to 0.10.12 in the Dockerfile: && pip install great_expectations==0.10.12 \
  2. Removing this line https://github.com/superconductive/ge_tutorials/blob/master/docker-compose.yml#L39 which mounts the requirements.txt file into the webserver service. Otherwise it looks like entrypoint.sh uses that file, if it exists, to do additional installs that override the ones already done explicitly in the Dockerfile here https://github.com/superconductive/ge_tutorials/blob/master/Dockerfile#L85.

To me it seems clearer to install everything the Docker container needs with install statements directly in the Dockerfile, and not to involve requirements.txt via the container entrypoint script.
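
One way to confirm that the pin from step 1 survived the entrypoint installs is to check the installed version from inside the running container. The following is a rough sketch, assuming Python 3.8+ in the container (for `importlib.metadata`); the helper name `installed_matches_pin` is hypothetical:

```python
from importlib import metadata


def installed_matches_pin(package: str, pinned: str, lookup=metadata.version) -> bool:
    """True if the installed version of `package` equals `pinned`.

    `lookup` defaults to importlib.metadata.version and can be swapped out
    for testing. A package that is not installed counts as a mismatch.
    """
    try:
        return lookup(package) == pinned
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    # 0.10.12 is the version pinned in the workaround above.
    ok = installed_matches_pin("great_expectations", "0.10.12")
    print("pin respected" if ok else "pin overridden or package missing")
```

If this reports a mismatch after step 1 alone, that would be consistent with the entrypoint script reinstalling from the mounted requirements.txt.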