great-expectations / great_expectations

Always know what to expect from your data.
https://docs.greatexpectations.io/
Apache License 2.0

URN Parameter defined in the checkpoint is passed as a string to the expectation during runtime #8276

Closed Ranji-1712 closed 1 month ago

Ranji-1712 commented 1 year ago

Describe the bug: In the checkpoint configuration below, an evaluation parameter refers by URN to a query configured in the sqlalchemy_query_store, and the expectation suite (also below) references that evaluation parameter. When the checkpoint runs, the URN parameter is passed to the validation as-is instead of being replaced by the query result.

Checkpoint Config:

name: validation_checkpoint
config_version: 1
class_name: Checkpoint
module_name: great_expectations.checkpoint
template_name:
run_name_template: "%Y-%m-%d %H:%M:%S_validation_result"
action_list:
  - name: store_validation_result
    action:
      class_name: StoreValidationResultAction
  - name: store_evaluation_params
    action:
      class_name: StoreEvaluationParametersAction
  - name: update_data_docs
    action:
      class_name: UpdateDataDocsAction
validations:
  - name: vaidation
    batch_request:
      data_asset_name: data_asset_name
      data_connector_name: data_connector_name
      datasource_name: datasource_name
    expectation_suite_name: expect_table_row_count_to_equal
    evaluation_parameters:
      source_count: { "$PARAMETER": "urn:great_expectations:stores:sqlalchemy_query_store:source_count" }

Expectation Suite with Parameter defined in checkpoint

{
  "expectation_suite_name": "expect_table_row_count_to_equal",
  "expectations": [
    {
      "expectation_type": "expect_table_row_count_to_equal",
      "kwargs": {
        "value": {"$PARAMETER": "source_count"},
        "result_format": "COMPLETE",
        "catch_exceptions": true
      },
      "meta":{
        "notes": {
            "format": "markdown",
            "content": ["**Validate the count**"]
              }
      }
    }
  ]
} 

Stores Config from great_expectations.yml file :

sqlalchemy_query_store:
    class_name: SqlAlchemyQueryStore
    credentials:
      connection_string: ${connection_string}
    queries:
      source_count:
        query: select count(*) as source_count from schema.table;
        return_type: scalar

Error: Must have exactly {'$PARAMETER': 'urn:great_expectations:stores:sqlalchemy_query_store:source_count'} rows

great_expectations.yml config

# Welcome to Great Expectations! Always know what to expect from your data.
#
# Here you can define datasources, batch kwargs generators, integrations and
# more. This file is intended to be committed to your repo. For help with
# configuration please:
#   - Read our docs: https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/connect_to_data_overview/#2-configure-your-datasource
#   - Join our slack channel: http://greatexpectations.io/slack

# config_version refers to the syntactic version of this config file, and is used in maintaining backwards compatibility
# It is auto-generated and usually does not need to be changed.
config_version: 3.0

# Datasources tell Great Expectations where your data lives and how to get it.
# You can use the CLI command `great_expectations datasource new` to help you
# add a new datasource. Read more at https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/connect_to_data_overview
datasources:
  lake_prod_data_source:
    name: lake_data_source
    class_name: Datasource
    module_name: great_expectations.datasource
    data_connectors:
      lake_prod_data_connector:
        class_name: ConfiguredAssetSqlDataConnector
        assets:
          luna_lazypay_transactions_data_asset:
            schema_name: luna_lazypay
            table_name: transactions
        name: lake_prod_data_connector
        module_name: great_expectations.datasource.data_connector
    execution_engine:
      module_name: great_expectations.execution_engine
      create_temp_table: false
      class_name: SqlAlchemyExecutionEngine
      connection_string: ${lake_prod_connection_string}
config_variables_file_path: uncommitted/config_variables.yml

# The plugins_directory will be added to your python path for custom modules
# used to override and extend Great Expectations.
plugins_directory: plugins/

stores:
  # Stores are configurable places to store things like Expectations, Validations
# Data Docs, and more. These are for advanced users only - most users can simply
# leave this section alone.
#
# Three stores are required: expectations, validations, and
# evaluation_parameters, and must exist with a valid store entry. Additional
# stores can be configured for uses such as data_docs, etc.

#luna_lazypay.transactions
  sqlalchemy_query_store_1:
    class_name: SqlAlchemyQueryStore
    credentials:
      connection_string: ${connection_string}
    queries:
      source_total_count:
        query: select count(*) as source_count from <schema name>.<table_name>;
        return_type: scalar

  expectations_store:
    class_name: ExpectationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: expectations/

  validations_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: uncommitted/validations/

  evaluation_parameter_store:
    class_name: EvaluationParameterStore
  checkpoint_store:
    class_name: CheckpointStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      base_directory: checkpoints/

  profiler_store:
    class_name: ProfilerStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      base_directory: profilers/

expectations_store_name: expectations_store
validations_store_name: validations_store
evaluation_parameter_store_name: evaluation_parameter_store
checkpoint_store_name: checkpoint_store

data_docs_sites:
  # Data Docs make it simple to visualize data quality in your project. These
  # include Expectations, Validations & Profiles. They are built for all
  # Datasources from JSON artifacts in the local repo including validations &
  # profiles from the uncommitted directory. Read more at https://docs.greatexpectations.io/docs/terms/data_docs
  local_site:
    class_name: SiteBuilder
    show_how_to_buttons: true
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: uncommitted/data_docs/local_site/
    store_backend_parameters:
      class_name: TupleStoreBackend
      base_directory: uncommitted/config_variables/
      prefix: "parameters"
      sep: "__"
    site_index_builder:
      class_name: DefaultSiteIndexBuilder

anonymous_usage_statistics:
  data_context_id: 20556d2a-71ce-4407-b376-2085a8454936
  enabled: true
include_rendered_content:
  expectation_suite: false
  expectation_validation_result: false
  globally: false
notebooks:

Expected behavior: During the checkpoint run / validation, the URN defined in the evaluation parameter should be replaced by the result of the corresponding query, rather than being passed through unchanged.
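To make the expected behavior concrete, here is a minimal, library-free sketch of the two-step resolution described in this report. It is not Great Expectations' implementation: the `resolve` function, the `query_stores` mapping, and the stand-in query result `42` are all hypothetical and exist only for illustration.

```python
# Hypothetical sketch; none of these names come from Great Expectations.
URN_PREFIX = "urn:great_expectations:stores:"


def resolve(value, evaluation_parameters, query_stores, _depth=0):
    """Follow {"$PARAMETER": ...} references until a plain value remains."""
    if _depth > 10:
        raise RecursionError("parameter references nest too deeply")
    if not (isinstance(value, dict) and "$PARAMETER" in value):
        return value  # already a plain value
    ref = value["$PARAMETER"]
    if ref.startswith(URN_PREFIX):
        # urn:great_expectations:stores:<store_name>:<query_name>
        store_name, query_name = ref[len(URN_PREFIX):].split(":", 1)
        return query_stores[store_name][query_name]()
    # A named parameter may itself hold another reference, so recurse.
    return resolve(evaluation_parameters[ref], evaluation_parameters,
                   query_stores, _depth + 1)


# Mirrors the checkpoint's evaluation_parameters entry.
evaluation_parameters = {
    "source_count": {
        "$PARAMETER": "urn:great_expectations:stores:sqlalchemy_query_store:source_count"
    }
}
# Stand-in for the query store; the lambda replaces the real SQL query.
query_stores = {"sqlalchemy_query_store": {"source_count": lambda: 42}}

# A single lookup stops at the URN dict, which is what the error message shows.
observed = evaluation_parameters["source_count"]
# Resolving one level further yields the scalar the expectation needs.
expected = resolve({"$PARAMETER": "source_count"}, evaluation_parameters, query_stores)
```

In this sketch `observed` is still the `{"$PARAMETER": "urn:..."}` dictionary, while `expected` is the scalar returned by the stand-in query.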

Environment (please complete the following information):

HaebichanGX commented 1 year ago

Hi @Ranji-1712 can you please attach your code so that we can replicate your situation?

Ranji-1712 commented 1 year ago

@HaebichanGX There is not much code. I'm just defining great_expectations.yml as shown above, plus the expectation suite and the checkpoint, and then running the checkpoint.

Ranji-1712 commented 1 year ago

@HaebichanGX This is how I execute the checkpoint defined above:

import great_expectations as gx
from great_expectations.core.batch import Batch, BatchRequest, RuntimeBatchRequest

# context = gx.get_context()
context = gx.get_context(
    context_root_dir='/Users/ranjith.kannusamy/Automation/github/dp-qa-great-expectations/great_expectations/'
)
# context
checkpoint_name = 'validation_checkpoint'
checkpoint = context.get_checkpoint(name=checkpoint_name)
checkpoint
checkpoint_result = context.run_checkpoint(checkpoint_name=checkpoint_name)
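One possible workaround, sketched here under stated assumptions and not confirmed anywhere in this thread, is to run the count query outside the store and hand the checkpoint a plain scalar. The example uses an in-memory SQLite database as a stand-in for the real warehouse; the `scalar_query` helper and the `transactions` table contents are hypothetical.

```python
import sqlite3


def scalar_query(conn, sql):
    """Run a query expected to return one row with one column; return that value."""
    row = conn.execute(sql).fetchone()
    return row[0]


# In-memory stand-in for the real database connection (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER)")
conn.executemany("INSERT INTO transactions VALUES (?)", [(1,), (2,), (3,)])

# Build a plain-scalar evaluation parameter instead of a URN reference.
evaluation_parameters = {
    "source_count": scalar_query(conn, "SELECT COUNT(*) FROM transactions")
}
conn.close()
```

The resulting dictionary could then be supplied as `evaluation_parameters` when running the checkpoint. Whether that keyword overrides the value set under `validations` in the checkpoint config is an assumption to verify against the installed version.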

HaebichanGX commented 1 year ago

Great, thank you for sharing. It's good to have all the information. I have put this in our backlog to get it reviewed. Thank you!

nnrane commented 11 months ago

Any update on the above problem? I am facing the same issue.

molliemarie commented 1 month ago

Hello @Ranji-1712. With the upcoming launch of Great Expectations Core (GX 1.0), we are closing old issues posted regarding previous versions. Moving forward, we will focus our resources on supporting and improving GX Core (version 1.0 and beyond). If you find that an issue you previously reported still exists in GX Core, we encourage you to resubmit it against the new version. With more resources dedicated to community support, we aim to tackle new issues swiftly. For specific details on what is GX-supported vs community-supported, you can reference our integration and support policy.

To get started on your transition to GX Core, check out the GX Core quickstart (click “Full example code” tab to see a code example).

You can also join our upcoming community meeting on August 28th at 9am PT (noon ET / 4pm UTC) for a comprehensive rundown of everything GX Core, plus Q&A as time permits. Go to https://greatexpectations.io/meetup and click “follow calendar” to follow the GX community calendar.

Thank you for being part of the GX community and thank you for submitting this issue. We're excited about this new chapter and look forward to your feedback on GX Core. 🤗