great-expectations / great_expectations

Always know what to expect from your data.
https://docs.greatexpectations.io/
Apache License 2.0
9.8k stars 1.51k forks source link

Assets of type query not working for Oracle datasource #8618

Closed hjannasch closed 2 weeks ago

hjannasch commented 1 year ago

Describe the bug As our database stack if mainly Oracle based. I tried to setup an expectation suite using a fluent oracledb sql datasource, but using assets of type 'query' it not working. Well I know you are not supporting Oracle officially, but do you have any idea on this?

To Reproduce

fluent_datasources:
  dm:
    type: sql
    assets:
      anrede:
        type: table
        order_by: []
        batch_metadata: {}
        table_name: anrede
        schema_name:
      person_query:
        type: query
        order_by: []
        batch_metadata: {}
        query: select * from PERSON
    connection_string: oracle+oracledb://<user>:<pwd>@localhost:1521/xe`

The table asset works fine, but the query asset generates a none valid SQL. When the person_query asset is used to create a new validator and suite:

import great_expectations as gx

context = gx.get_context()
data_asset = context.get_datasource("dm").get_asset("person_query")
batch_request = data_asset.build_batch_request()

validator = context.get_validator(
    batch_request=batch_request,
    expectation_suite_name="nat_query_expectation_suite",
)
validator.head()

This leads to the error: DatabaseError: (oracledb.exceptions.DatabaseError) ORA-00907: missing right parenthesis [SQL: CREATE GLOBAL TEMPORARY TABLE gx_temp_5da3476a ON COMMIT PRESERVE ROWS AS SELECT FROM (SELECT from PERSON FROM DUAL) anon_1 WHERE 1 = 1]

Expected behavior Generate valid SQL, it looks like ... FROM DUAL is too much here.

Environment (please complete the following information):

great-expectations==0.17.12 sqlalchemy==2.0.20 oracledb==1.4.0

hjannasch commented 1 year ago

Here is the trace. traceback.log

Kilo59 commented 1 year ago

@hjannasch You can try changing the value of create_temp_table = False when defining the Datasource but I suspect you'll still end up with some version of a bad identifier issue when you run a checkpoint.

Please report back if this does or doesn't work.

fluent_datasources:
  dm:
    type: sql
    create_temp_table: false
    assets:
      anrede:
        type: table
        order_by: []
        batch_metadata: {}
        table_name: anrede
        schema_name:
      person_query:
        type: query
        order_by: []
        batch_metadata: {}
        query: select * from PERSON
    connection_string: oracle+oracledb://<user>:<pwd>@localhost:1521/xe`
hjannasch commented 1 year ago

@Kilo59 Thanks for your reply.

With create_temp_table = False the SQL is reduced to: SELECT * \nFROM (SELECT * \nFROM (SELECT * from PERSON FROM DUAL) anon_2 \nWHERE 1 = 1

Which is fine, but still ... from PERSON FROM DUAL which is invalid SQL

Kilo59 commented 4 months ago

@hjannasch Sorry for the very late response. It's unlikely that we'll get around to fixing this issue for Oracle specifically.

Something you could try is creating a View for query and then adding it as a TableAsset.

import great_expectations as gx

context = gx.get_context()
oracle_ds = context.sources.add_sql("my_oracle_datasource", connection_string=...)
my_view_asset = oracle_ds.add_table("my_view", table="NAME_OF_YOUR_VIEW") 
molliemarie commented 2 weeks ago

Hello @hjannasch. With the upcoming launch of Great Expectations Core (GX 1.0), we are closing old issues posted regarding previous versions. Moving forward, we will focus our resources on supporting and improving GX Core (version 1.0 and beyond). If you find that an issue you previously reported still exists in GX Core, we encourage you to resubmit it against the new version. With more resources dedicated to community support, we aim to tackle new issues swiftly. For specific details on what is GX-supported vs community-supported, you can reference our integration and support policy.

To get started on your transition to GX Core, check out the GX Core quickstart (click “Full example code” tab to see a code example).

You can also join our upcoming community meeting on August 28th at 9am PT (noon ET / 4pm UTC) for a comprehensive rundown of everything GX Core, plus Q&A as time permits. Go to https://greatexpectations.io/meetup and click “follow calendar” to follow the GX community calendar.

Thank you for being part of the GX community and thank you for submitting this issue. We're excited about this new chapter and look forward to your feedback on GX Core. 🤗