great-expectations / great_expectations

Always know what to expect from your data.
https://docs.greatexpectations.io/
Apache License 2.0

PostgreSQL Table asset throwing “InvalidExpectationConfigurationError” while validating in azure databricks #10086

Closed DineshBaratam-5 closed 2 months ago

DineshBaratam-5 commented 4 months ago

Describe the bug
I am currently using Azure Databricks and Great Expectations to perform data testing against a PostgreSQL server. Recently I started replacing the in-memory data asset with a SQL data asset for our validations. While using the table asset I get a new error: InvalidExpectationConfigurationError: Metric ('column_values.nonnull.unexpected_count', '8cbdcf4d8ebf45f01f38a392cc8da758', ()) is not available for validation.

To Reproduce
PFB the code snippet I executed in the Azure Databricks cloud environment:

```python
import great_expectations as gx
import sqlalchemy as sa

contextDirectory = "/dbfs/great_expectations/"
context = gx.get_context(context_root_dir=contextDirectory)

connection_url = sa.URL.create(
    "postgresql+psycopg2",
    username=targetDbUserName,
    password=targetDbPassword,
    host=targetSQLServerName,
    database=targetDataBaseName,
    port=targetSQLPort,
)
connectionString = connection_url.render_as_string(hide_password=False)

datasource = context.sources.add_or_update_postgres(
    name="postgresqlDataSource", connection_string=connectionString
)
asset_name = "ProductAsset"
dataasset = datasource.add_table_asset(name=asset_name, table_name=targetTableName)

# Validator creation (implied by the description below but omitted from the
# original snippet; the suite name here is a placeholder)
context.add_or_update_expectation_suite("product_suite")
validator = context.get_validator(
    batch_request=dataasset.build_batch_request(),
    expectation_suite_name="product_suite",
)

validator.expect_column_values_to_not_be_null(column="ProductID")
validator.expect_column_values_to_be_between(column="StandardCost", min_value=0, max_value=100000)
validator.save_expectation_suite(discard_failed_expectations=False)
```
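One thing worth noting about the snippet above: building the URL with `sa.URL.create` and rendering it with `render_as_string` escapes special characters in the credentials for you. A minimal stdlib-only sketch of the equivalent manual escaping, using placeholder credentials (not the reporter's real values), shows what that escaping does:

```python
from urllib.parse import quote_plus

# Placeholder credentials for illustration only.
user = "gx_user"
password = "p@ss/word"  # contains characters that are illegal in a raw URL
host = "myserver.postgres.database.azure.com"
database = "AdventureWorks"
port = 5432

# Percent-encode each credential component before assembling the DSN,
# mirroring what a rendered SQLAlchemy URL looks like.
connection_string = (
    f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
    f"@{host}:{port}/{database}"
)

print(connection_string)
# → postgresql+psycopg2://gx_user:p%40ss%2Fword@myserver.postgres.database.azure.com:5432/AdventureWorks
```

An unescaped `@` or `/` in a password silently corrupts the DSN, which is a separate failure mode from the metric error reported here but easy to hit when hand-building connection strings.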

After executing the above code, everything works fine up to the creation of the validator, and the error is thrown at the line `validator.expect_column_values_to_not_be_null(column="ProductID")`. PFB the full error trace.

```
InvalidExpectationConfigurationError: Metric ('column_values.nonnull.unexpected_count', 'fda39409dabeab873ee95301ea648db7', ()) is not available for validation of { "expectation_type": "expect_column_values_to_not_be_null", "kwargs": { "column": "ProductID", "batch_id": "postgresqlDataSource-ProductAsset" }, "meta": {} }. Please check your configuration.

InvalidExpectationConfigurationError      Traceback (most recent call last)
File , line 1
----> 1 validator.expect_column_values_to_not_be_null(column="ProductID")
      2 validator.expect_column_values_to_be_between(column="StandardCost", min_value=0, max_value=100000)
      3 validator.save_expectation_suite(discard_failed_expectations=False)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/validator/validator.py:590, in Validator.validate_expectation.<locals>.inst_expectation(*args, **kwargs)
    584 validation_result = ExpectationValidationResult(
    585     success=False,
    586     exception_info=exception_info,
    587     expectation_config=configuration,
    588 )
    589 else:
--> 590     raise err
    592 if self._include_rendered_content:
    593     validation_result.render()

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/validator/validator.py:553, in Validator.validate_expectation.<locals>.inst_expectation(*args, **kwargs)
    549 validation_result = ExpectationValidationResult(
    550     expectation_config=copy.deepcopy(expectation.configuration)
    551 )
    552 else:
--> 553     validation_result = expectation.validate(
    554         validator=self,
    555         evaluation_parameters=self._expectation_suite.evaluation_parameters,
    556         data_context=self._data_context,
    557         runtime_configuration=basic_runtime_configuration,
    558     )
    560 # If validate has set active_validation to true, then we do not save the config to avoid
    561 # saving updating expectation configs to the same suite during validation runs
    562 if self._active_validation is True:

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/expectations/expectation.py:1314, in Expectation.validate(self, validator, configuration, evaluation_parameters, interactive_evaluation, data_context, runtime_configuration)
   1305 self._warn_if_result_format_config_in_expectation_configuration(
   1306     configuration=configuration
   1307 )
   1309 configuration.process_evaluation_parameters(
   1310     evaluation_parameters, interactive_evaluation, data_context
   1311 )
   1312 expectation_validation_result_list: list[
   1313     ExpectationValidationResult
-> 1314 ] = validator.graph_validate(
   1315     configurations=[configuration],
   1316     runtime_configuration=runtime_configuration,
   1317 )
   1318 return expectation_validation_result_list[0]

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/validator/validator.py:1089, in Validator.graph_validate(self, configurations, runtime_configuration)
   1082 evrs = self._catch_exceptions_in_failing_expectation_validations(
   1083     exception_traceback=exception_traceback,
   1084     exception=err,
   1085     failing_expectation_configurations=[configuration],
   1086     evrs=evrs,
   1087 )
   1088 else:
-> 1089     raise err
   1091 return evrs

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/validator/validator.py:1073, in Validator.graph_validate(self, configurations, runtime_configuration)
   1070 try:
   1071     runtime_configuration_default = copy.deepcopy(runtime_configuration)
-> 1073     result = configuration.metrics_validate(
   1074         metrics=resolved_metrics,
   1075         execution_engine=self._execution_engine,
   1076         runtime_configuration=runtime_configuration_default,
   1077     )
   1078     evrs.append(result)
   1079 except Exception as err:

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/core/expectation_configuration.py:1494, in ExpectationConfiguration.metrics_validate(self, metrics, runtime_configuration, execution_engine, **kwargs)
   1492 expectation_impl: Type[Expectation] = self._get_expectation_impl()
   1493 # noinspection PyCallingNonCallable
-> 1494 return expectation_impl(self).metrics_validate(
   1495     metrics=metrics,
   1496     runtime_configuration=runtime_configuration,
   1497     execution_engine=execution_engine,
   1498 )

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/expectations/expectation.py:1088, in Expectation.metrics_validate(self, metrics, configuration, runtime_configuration, execution_engine, **kwargs)
   1082 runtime_configuration["result_format"] = validation_dependencies.result_format
   1084 validation_dependencies_metric_configurations: List[
   1085     MetricConfiguration
   1086 ] = validation_dependencies.get_metric_configurations()
-> 1088 _validate_dependencies_against_available_metrics(
   1089     validation_dependencies=validation_dependencies_metric_configurations,
   1090     metrics=metrics,
   1091     configuration=configuration,
   1092 )
   1094 metric_name: str
   1095 metric_configuration: MetricConfiguration

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/expectations/expectation.py:3766, in _validate_dependencies_against_available_metrics(validation_dependencies, metrics, configuration)
   3764 for metric_config in validation_dependencies:
   3765     if metric_config.id not in metrics:
-> 3766         raise InvalidExpectationConfigurationError(
   3767             f"Metric {metric_config.id} is not available for validation of {configuration}. Please check your configuration."
   3768         )

InvalidExpectationConfigurationError: Metric ('column_values.nonnull.unexpected_count', 'fda39409dabeab873ee95301ea648db7', ()) is not available for validation of { "expectation_type": "expect_column_values_to_not_be_null", "kwargs": { "column": "ProductID", "batch_id": "postgresqlDataSource-ProductAsset" }, "meta": {} }. Please check your configuration.
```
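For context on the error message: the metric ID in it is a triple of (metric name, a hash of the domain kwargs, the value kwargs), and the error means the dictionary of resolved metrics has no entry under that key. A rough stdlib illustration of how such a cache key behaves (GX's actual `IDDict.to_id()` may serialize and hash differently; this is not its real implementation):

```python
import hashlib
import json

def metric_id(metric_name: str, domain_kwargs: dict, value_kwargs: tuple = ()) -> tuple:
    """Illustrative only: build a cache key shaped like GX's
    ('column_values.nonnull.unexpected_count', '<hash>', ()) triples."""
    domain_hash = hashlib.md5(
        json.dumps(domain_kwargs, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return (metric_name, domain_hash, value_kwargs)

# The same kwargs always yield the same key...
a = metric_id("column_values.nonnull.unexpected_count",
              {"column": "ProductID", "batch_id": "b1"})
b = metric_id("column_values.nonnull.unexpected_count",
              {"column": "ProductID", "batch_id": "b1"})
assert a == b

# ...while a different batch_id yields a different key, so metrics resolved
# for one batch cannot be found by a lookup built against another.
c = metric_id("column_values.nonnull.unexpected_count",
              {"column": "ProductID", "batch_id": "b2"})
assert a != c
```

This is one plausible way the failure arises: if the validator's batch identity and the batch the metrics were resolved against disagree (for example, after re-registering the table asset), the lookup key no longer matches anything in the resolved-metrics dictionary.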

Expected behavior
The GX workflow should execute, and the validator should save the expectation suite after validating the given expectations.

Environment:

molliemarie commented 2 months ago

Hello @DineshBaratam-5. With the launch of Great Expectations Core (GX 1.0), we are closing old issues posted regarding previous versions. Moving forward, we will focus our resources on supporting and improving GX Core (version 1.0 and beyond). If you find that an issue you previously reported still exists in GX Core, we encourage you to resubmit it against the new version. With more resources dedicated to community support, we aim to tackle new issues swiftly. For specific details on what is GX-supported vs community-supported, you can reference our integration and support policy.

To get started on your transition to GX Core, check out the GX Core quickstart (click “Full example code” tab to see a code example).
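For readers hitting the same error, the equivalent workflow in GX Core looks roughly like the sketch below. This is based on the quickstart, requires a reachable PostgreSQL instance, and uses placeholder names (`my_pg`, `full_table`, the table name); treat the exact method names as subject to the 1.0 documentation rather than definitive:

```python
import great_expectations as gx

context = gx.get_context()

# 1.0 moved datasource registration from context.sources to context.data_sources
data_source = context.data_sources.add_postgres(
    name="my_pg", connection_string=connection_string
)
asset = data_source.add_table_asset(name="ProductAsset", table_name="Product")
batch_definition = asset.add_batch_definition_whole_table("full_table")

# Expectations are now plain classes under gx.expectations
batch = batch_definition.get_batch()
result = batch.validate(
    gx.expectations.ExpectColumnValuesToNotBeNull(column="ProductID")
)
print(result.success)
```

The validator-centric API from the original snippet (`context.get_validator`, `validator.expect_*`) does not carry over to 1.0, which is why issues against the old API were closed rather than migrated.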

You can also join our upcoming community meeting on August 28th at 9am PT (noon ET / 4pm UTC) for a comprehensive rundown of everything GX Core, plus Q&A as time permits. Go to https://greatexpectations.io/meetup and click “follow calendar” to follow the GX community calendar.

Thank you for being part of the GX community and thank you for submitting this issue. We're excited about this new chapter and look forward to your feedback on GX Core. 🤗