Open mantasmy opened 10 months ago
hey @mantasmy thanks for reaching out! We'll put this in our backlog for review.
Hey @HaebichanGX, is there any ETA for this please? Thanks
Bump
Hi @mantasmy. I ran into the same issue with QueryAsset validation through Trino. Hope this helps. GE itself cannot resolve the name of the SQLAlchemy Selectable in Trino's metadata (information_schema.*). Here is my approach (it worked~)
SqlAlchemyBatchData
Describe the bug
TL;DR: Using a query instead of a table isn't fully working for Trino. The table name, including catalog and schema, is used as the filter in the information_schema.tables and information_schema.columns queries, whereas table_name inside these tables has no catalog or schema. After removing the catalog and schema locally it works, but not for all expectations: some fail to format the query.
Long Version: There is special handling for Trino in utils.py that sets table_name from the query. This regex sets table_name with the schema and catalog included, and the problem occurs when this table_name is used to build the WHERE condition for a query.
This query is executed against information_schema tables, but table_name in information_schema.tables has no catalog and schema name.
I locally removed catalog and schema and it worked, but not for all expectations...
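To make the local workaround concrete, here is a minimal sketch of the idea. It is not the code in GE's utils.py. The helper names (split_qualified_name, build_columns_lookup) are hypothetical, and it assumes unquoted identifiers that contain no dots:

```python
from typing import Dict, Optional, Tuple


def split_qualified_name(name: str) -> Tuple[Optional[str], Optional[str], str]:
    """Split 'catalog.schema.table' into (catalog, schema, table).

    Hypothetical helper: assumes unquoted identifiers without embedded dots.
    Missing leading parts are returned as None.
    """
    parts = name.split(".")
    if len(parts) > 3:
        raise ValueError(f"unexpected qualified name: {name!r}")
    # Left-pad with None so the result always has three elements.
    padded = [None] * (3 - len(parts)) + parts
    return padded[0], padded[1], padded[2]


def build_columns_lookup(name: str) -> Tuple[str, Dict[str, str]]:
    """Build an information_schema.columns lookup that filters on the bare table name."""
    _catalog, schema, table = split_qualified_name(name)
    sql = (
        "SELECT column_name FROM information_schema.columns "
        "WHERE table_name = :table"
    )
    params = {"table": table}
    if schema is not None:
        # The schema goes into its own column filter, not into table_name.
        sql += " AND table_schema = :schema"
        params["schema"] = schema
    return sql, params


sql, params = build_columns_lookup("my_catalog.my_schema.my_table")
print(sql)
print(params)
```

The point is only that table_name is compared against the bare name, while the schema (if present) is matched through table_schema instead of being glued onto the table name.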
There are expectations like expect_compound_columns_to_be_unique where column_list has to be set as a tuple or list, and the query is not formatted properly. One of the subquery levels selects only the columns in column_list, but the parent level of the query then tries to fetch all columns that exist in the table, which throws an error about invalid columns.
I will try to visualise it better here:
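A rough sketch of this failure mode, using an in-memory SQLite database as a stand-in for Trino. This is not the query GE actually generates. The table t, its columns and the run_query helper are all made up for illustration:

```python
import sqlite3

COLUMN_LIST = ["a", "b"]        # columns passed to the expectation (assumed)
ALL_COLUMNS = ["a", "b", "c"]   # every column that exists in the table (assumed)


def run_query(outer_columns):
    """Run an outer SELECT over a subquery that exposes only COLUMN_LIST.

    Returns the fetched rows, or an 'error: ...' string if the database
    rejects the statement.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
        conn.execute("INSERT INTO t VALUES (1, 2, 3)")
        # Inner level: limited to the columns in COLUMN_LIST.
        inner = f"SELECT {', '.join(COLUMN_LIST)} FROM t"
        # Outer level: selects whatever the caller asks for.
        outer = f"SELECT {', '.join(outer_columns)} FROM ({inner}) AS sub"
        return conn.execute(outer).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"
    finally:
        conn.close()


# Outer level asks for every table column: "c" is not visible through the subquery.
print(run_query(ALL_COLUMNS))
# Outer level restricted to the same column_list: the query succeeds.
print(run_query(COLUMN_LIST))
```

The first call fails because the parent level references a column that the subquery never selected. The second call works because both levels agree on the column list.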
This is the connector I am using:
Environment:
Great Expectations Version: 0.17.7
sqlalchemy: 1.4.48
trino: 0.326.0