great-expectations / great_expectations

Always know what to expect from your data.
https://docs.greatexpectations.io/
Apache License 2.0
9.97k stars 1.54k forks

[BUG] Exception during validation of ExpectColumnValuesToBeInSet #10371

Closed Terma1 closed 1 month ago

Terma1 commented 1 month ago

Describe the bug: I created a test expectation and tried to validate it, but I got the exception shown below. My data: data = { "Letters": ["AB", "BB"] }

Then I created the expectation: expectation = gx.expectations.ExpectColumnValuesToBeInSet(column="letters", value_set=["AB", "BB"])

ExpectColumnValuesToBeInSet(id=None, meta=None, notes=None, result_format=<ResultFormat.BASIC: 'BASIC'>, description=None, catch_exceptions=True, rendered_content=None, batch_id=None, row_condition=None, condition_parser=None, column='letters', mostly=1.0, value_set=['AB', 'BB'])

Then I tried to run the Expectation on the Batch of data:

batch.validate(expectation)
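One detail in the snippets above may be worth ruling out: the data uses the key "Letters", while the expectation is configured with column="letters". The thread does not confirm that this is the cause of the exception, so treat the following as a cheap sanity check rather than a diagnosis. It is a minimal sketch in plain Python; the helper name find_column_mismatch is hypothetical and not part of GX.

```python
def find_column_mismatch(data_columns, expected_column):
    """Return a hint string if expected_column is missing, else None.

    Hypothetical helper for illustration only; not a GX API.
    """
    columns = list(data_columns)
    if expected_column in columns:
        return None
    # Look for a column whose name differs only by letter case.
    by_lowercase = {name.lower(): name for name in columns}
    candidate = by_lowercase.get(expected_column.lower())
    if candidate is not None:
        return f"column {expected_column!r} not found; did you mean {candidate!r}?"
    return f"column {expected_column!r} not found in {sorted(columns)}"


# Data and column name as given in the report.
data = {"Letters": ["AB", "BB"]}
hint = find_column_mismatch(data.keys(), "letters")
print(hint)
```

If the check returns a hint, aligning the column name in the expectation with the name in the data is the first thing to try before looking further into the configuration.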

To Reproduce (traceback):

"exception_info": {
  "exception_traceback": "Traceback (most recent call last):\n File \"/local_disk0/.ephemeral_nfs/envs/pythonEnv-ee0ab0a9-1d18-41ef-af90-fd5a799d88b4/lib/python3.10/site-packages/great_expectations/validator/validator.py\", line 648, in graph_validate\n result = expectation.metrics_validate(\n File \"/local_disk0/.ephemeral_nfs/envs/pythonEnv-ee0ab0a9-1d18-41ef-af90-fd5a799d88b4/lib/python3.10/site-packages/great_expectations/expectations/expectation.py\", line 1064, in metrics_validate\n _validate_dependencies_against_available_metrics(\n File \"/local_disk0/.ephemeral_nfs/envs/pythonEnv-ee0ab0a9-1d18-41ef-af90-fd5a799d88b4/lib/python3.10/site-packages/great_expectations/expectations/expectation.py\", line 2754, in _validate_dependencies_against_available_metrics\n raise InvalidExpectationConfigurationError( # noqa: TRY003\ngreat_expectations.exceptions.exceptions.InvalidExpectationConfigurationError: Metric ('column_values.nonnull.unexpected_count', '578fe538ff4b7c5a0e8c361ada4ba88d', ()) is not available for validation of configuration. Please check your configuration.\n",
  "exception_message": "Metric ('column_values.nonnull.unexpected_count', '578fe538ff4b7c5a0e8c361ada4ba88d', ()) is not available for validation of configuration. Please check your configuration.",
  "raised_exception": true
}
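The exception message names the metric that could not be resolved. When comparing this report with similar ones, it can help to pull that identifier out of the message programmatically. The sketch below uses only the Python standard library and the message text exactly as reported. The variable names domain_id and value_id are labels chosen here for the second and third tuple elements; what they mean inside GX is an assumption, not something the thread states.

```python
import ast
import re

# exception_message copied verbatim from the report.
message = (
    "Metric ('column_values.nonnull.unexpected_count', "
    "'578fe538ff4b7c5a0e8c361ada4ba88d', ()) is not available for "
    "validation of configuration. Please check your configuration."
)

match = re.search(r"Metric (\(.*\)) is not available", message)
if match is None:
    raise ValueError("message does not match the expected pattern")

# The tuple is printed in Python literal syntax, so literal_eval can parse it.
# domain_id and value_id are assumed labels for the remaining elements.
metric_name, domain_id, value_id = ast.literal_eval(match.group(1))
print(metric_name)
```

Here the unresolved metric is a non-null count on the column rather than the in-set check itself, which suggests the failure happens while GX resolves metrics for the column, before any values are compared against value_set.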

Environment (please complete the following information): Databricks, GX version: 1.0.2

adeola-ak commented 1 month ago

Please begin by confirming that your Batch Definition was able to read in data and return a populated Batch: batch.head()

Do you get data back when you run that command? Also, what kind of data are you connecting to?

Also, do you mind sharing why you included this:

ExpectColumnValuesToBeInSet(id=None, meta=None, notes=None, result_format=<ResultFormat.BASIC: 'BASIC'>, description=None, catch_exceptions=True, rendered_content=None, batch_id=None, row_condition=None, condition_parser=None, column='letters', mostly=1.0, value_set=['AB', 'BB'])

I'm thinking we may just need to work on your configuration.

Utkarsh-Krishna commented 1 month ago

I am facing a similar exception. I raised a separate issue for it: "Exception during validation of ExpectColumnValuesToNotBeNull #10410".

adeola-ak commented 1 month ago

Since I haven't heard back and I'm unable to replicate the issue, I'll go ahead and close it. Feel free to reach out again, sharing your complete configuration if you're still experiencing the problem, thanks!