great-expectations / great_expectations

Always know what to expect from your data.
https://docs.greatexpectations.io/
Apache License 2.0
9.97k stars 1.54k forks source link

Failing `test_connection` in `TableAsset.test_connection` for uppercase schema name defined in SQL Server / mssql #10499

Open amirulmenjeni opened 3 weeks ago

amirulmenjeni commented 3 weeks ago

Using GX Core version: 1.1.1

Currently I'm not able to create a table data asset from the SQLDatasource's add_table_asset method when the schema of the table I'm trying to connect to is defined in the database in the uppercase form (e.g., MY_SCHEMA):

datasource.add_table_asset(
    name="asset-name", 
    schema_name="S_EC",
    table_name="my_table",
)

Instead I got the following error:

great_expectations.datasource.fluent.interfaces.TestConnectionError: Attempt to connect to table: "my_schema.my_table" failed because the schema "my_schema" does not exist

Looking at the lines below it seems that the self.schema_name not in inspector.get_schema_names() check doesn't do case insensitive comparison.

https://github.com/great-expectations/great_expectations/blob/f9aa879a3d3ef0750bd72a24a003ea5631a6f29e/great_expectations/datasource/fluent/sql_datasource.py#L908-L912

Attempting to bracket the schema with quote characters (e.g., 'MY_SCHEMA') does persist the uppercase, but the test fails as well, complaining:

great_expectations.datasource.fluent.interfaces.TestConnectionError: Attempt to connect to table: ""MY_SCHEMA".my_table" failed because the schema ""MY_SCHEMA"" does not exist

I can still create query asset with add_query_asset method, though.

Kilo59 commented 3 weeks ago

I believe the SQLAlchemy behavior is to always treat lowercase identifiers as cases insensitive. Try using a lowercase schema_name (even though your actual schema is uppercase).

adeola-ak commented 3 weeks ago

hi @amirulmenjeni can you confirm if using a lowercase schema_name no longer resulted in an error?

amirulmenjeni commented 2 weeks ago

@adeola-ak, I've tried that as well. Sorry I failed to mention that in the original post. But that also didn't work.

adeola-ak commented 2 weeks ago

@amirulmenjeni just to confirm, your schema was created with quotations to preserve the uppercase form? therefore the schema name is "S_EC", including the quotes?

amirulmenjeni commented 1 week ago

@adeola-ak, yes I've tried several variations, i.e., S_EC, s_ec, "S_EC"

adeola-ak commented 1 week ago

Hi there. It turns out that MSSQL is actually a community-supported integration, meaning that while we welcome contributions, we typically do not fully test or support it.

Since you were able to pinpoint where the issue might originate in our codebase, I encourage you to submit a PR contribution if you're interested, which we will promptly review.

I understand this may not be the outcome you were hoping for, but I want to thank you for your patience as I investigated the matter.

I will be keeping this issue open.