kreuzwerker / kreuzlaker


Add great expectations as a data validation framework #10

Open fabdy opened 1 year ago

fabdy commented 1 year ago

Great Expectations (https://greatexpectations.io/) is a framework for testing expectations against data ("Column should never be NULL", "Values should be between x and y", ...). It is well suited to checking that both input data and output data keep the quality you expect.
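To make the two quoted expectations concrete, here is a minimal sketch in plain Python. It does not use the Great Expectations API; the helper names `never_null` and `values_between`, the sample rows, and the column names are all hypothetical and only illustrate what such checks verify.

```python
from typing import Any, List, Mapping, Sequence

Row = Mapping[str, Any]


def never_null(rows: Sequence[Row], column: str) -> List[Row]:
    """Return the rows that violate "column should never be NULL"."""
    return [row for row in rows if row.get(column) is None]


def values_between(
    rows: Sequence[Row], column: str, low: float, high: float
) -> List[Row]:
    """Return the rows whose value is missing or outside [low, high]."""
    return [
        row
        for row in rows
        if row.get(column) is None or not (low <= row[column] <= high)
    ]


# Hypothetical sample data standing in for a table in the data lake.
rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": 250.0},
    {"order_id": None, "amount": 5.0},
]

null_failures = never_null(rows, "order_id")
range_failures = values_between(rows, "amount", 0, 100)
```

An empty result means the expectation holds. A validation framework adds the parts this sketch leaves out, such as storing expectation suites, connecting to data sources like Athena, and reporting results.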

https://docs.greatexpectations.io/docs/deployment_patterns/how_to_use_great_expectations_in_aws_glue/

https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/database/athena/

Tasks

- Add it to the dev stack for business transformations
- Run it during the dbt (and glue?) runs

DoD:

GE is run automatically during dbt runs
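
One possible way to meet this DoD is a small wrapper that runs dbt and then the validation, failing the whole run if either step fails. The sketch below is an assumption, not the project's setup: the checkpoint name `business_transformations` is hypothetical, and the `great_expectations checkpoint run` command is the legacy CLI form from older 0.x releases, so it should be checked against the installed version.

```python
import subprocess
import sys
from typing import Sequence


def run_steps(steps: Sequence[Sequence[str]]) -> int:
    """Run each command in order; stop at the first failure.

    Returns 0 if every step succeeded, otherwise the exit code
    of the first failing step.
    """
    for command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"step failed: {' '.join(command)}", file=sys.stderr)
            return result.returncode
    return 0


# Hypothetical steps: the checkpoint name is made up, and the GE command
# is the legacy CLI form, which may not exist in newer releases.
STEPS = [
    ["dbt", "run"],
    ["great_expectations", "checkpoint", "run", "business_transformations"],
]
```

Calling `sys.exit(run_steps(STEPS))` from a CI job or scheduled task would propagate a failed validation as a failed run.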