Initializing self.data_context inside execute() makes it impossible to interact with the data context before or after the execution.
If self.data_context were initialized in the __init__() method instead, the user could interact with this object in the pre_execute() or post_execute() methods of Airflow's BaseOperator.
A possible use case is adding ExpectationSuites at runtime using an InMemoryStoreBackend expectation store, for example:
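    # Method of an operator subclass; assumes self.data_context already exists.
    from typing import Any

    from great_expectations.core.expectation_configuration import ExpectationConfiguration

    def pre_execute(self, context: Any):
        """
        Create and add an expectation suite to the in-memory DataContext.
        """
        suite_name = "runtime_suite"  # placeholder name, chosen for illustration
        suite = self.data_context.create_expectation_suite(
            expectation_suite_name=suite_name, overwrite_existing=True
        )
        # Add expectations; a simple row-count check serves as an example
        suite.add_expectation(
            ExpectationConfiguration(
                expectation_type="expect_table_row_count_to_be_between",
                kwargs={"min_value": 1, "max_value": 1000000},
            )
        )
        # Save the suite to the DataContext's in-memory expectations store
        self.data_context.save_expectation_suite(suite)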
Right now, the self.data_context object is initialized within the execute method of the Airflow BaseOperator.
This is done in: https://github.com/astronomer/airflow-provider-great-expectations/blob/0863df8edc0d4fbafc8614d28af3a1317ba255c7/great_expectations_provider/operators/great_expectations.py#L586
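The difference between creating the data context in execute() and in __init__() can be sketched with stand-in classes. FakeDataContext, FakeBaseOperator, and run_task are hypothetical placeholders, not Airflow or Great Expectations APIs; run_task only mimics the order in which Airflow calls pre_execute(), execute(), and post_execute() on a task.

```python
from typing import Any, Dict, List


class FakeDataContext:
    """Hypothetical stand-in for a Great Expectations data context."""

    def __init__(self) -> None:
        self.suites: List[str] = []


class FakeBaseOperator:
    """Hypothetical stand-in for Airflow's BaseOperator hooks."""

    def pre_execute(self, context: Dict[str, Any]) -> None:
        pass

    def execute(self, context: Dict[str, Any]) -> Any:
        raise NotImplementedError

    def post_execute(self, context: Dict[str, Any], result: Any = None) -> None:
        pass

    def run_task(self, context: Dict[str, Any]) -> Any:
        # Mimics the hook order: pre_execute, then execute, then post_execute.
        self.pre_execute(context)
        result = self.execute(context)
        self.post_execute(context, result)
        return result


class LateInitOperator(FakeBaseOperator):
    """Current behavior: the data context is created inside execute()."""

    def execute(self, context: Dict[str, Any]) -> int:
        self.data_context = FakeDataContext()
        return len(self.data_context.suites)


class EarlyInitOperator(FakeBaseOperator):
    """Proposed behavior: the data context is created in __init__()."""

    def __init__(self) -> None:
        self.data_context = FakeDataContext()

    def pre_execute(self, context: Dict[str, Any]) -> None:
        # The context already exists, so a suite can be registered here.
        self.data_context.suites.append("runtime_suite")

    def execute(self, context: Dict[str, Any]) -> int:
        return len(self.data_context.suites)


class HookedLateInitOperator(LateInitOperator):
    """Tries to use the data context in pre_execute() before it exists."""

    def pre_execute(self, context: Dict[str, Any]) -> None:
        self.data_context.suites.append("runtime_suite")
```

With these stand-ins, EarlyInitOperator().run_task({}) sees the suite added in pre_execute(), while HookedLateInitOperator().run_task({}) fails with an AttributeError because self.data_context has not been assigned yet. One trade-off to weigh: Airflow instantiates operators when the DAG file is parsed, so any work moved into __init__() also runs at parse time.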