GlareDB / glaredb

GlareDB: An analytics DBMS for distributed data
https://glaredb.com
GNU Affero General Public License v3.0
529 stars 33 forks source link

bug: InternalError: (psycopg2.errors.InternalError_) Casting expressions to regclass unsupported #2746

Open talagluck opened 2 months ago

talagluck commented 2 months ago

Description

See also #1577

When attempting to use Great Expectations with GlareDB (using a Postgres connection string), I get

InternalError: (psycopg2.errors.InternalError_) Casting expressions to regclass unsupported

Full error:

E               [SQL: SELECT pg_catalog.pg_attribute.attname AS name, pg_catalog.format_type(pg_catalog.pg_attribute.atttypid, pg_catalog.pg_attribute.atttypmod) AS format_type, (SELECT pg_catalog.pg_get_expr(pg_catalog.pg_attrdef.adbin, pg_catalog.pg_attrdef.adrelid) AS pg_get_expr_1 
E               FROM pg_catalog.pg_attrdef 
E               WHERE pg_catalog.pg_attrdef.adrelid = pg_catalog.pg_attribute.attrelid AND pg_catalog.pg_attrdef.adnum = pg_catalog.pg_attribute.attnum AND pg_catalog.pg_attribute.atthasdef) AS "default", pg_catalog.pg_attribute.attnotnull AS not_null, pg_catalog.pg_class.relname AS table_name, pg_catalog.pg_description.description AS comment, pg_catalog.pg_attribute.attgenerated AS generated, (SELECT json_build_object(%(json_build_object_2)s, pg_catalog.pg_attribute.attidentity = %(attidentity_1)s, %(json_build_object_3)s, pg_catalog.pg_sequence.seqstart, %(json_build_object_4)s, pg_catalog.pg_sequence.seqincrement, %(json_build_object_5)s, pg_catalog.pg_sequence.seqmin, %(json_build_object_6)s, pg_catalog.pg_sequence.seqmax, %(json_build_object_7)s, pg_catalog.pg_sequence.seqcache, %(json_build_object_8)s, pg_catalog.pg_sequence.seqcycle) AS json_build_object_1 
E               FROM pg_catalog.pg_sequence 
E               WHERE pg_catalog.pg_attribute.attidentity != %(attidentity_2)s AND pg_catalog.pg_sequence.seqrelid = CAST(CAST(pg_catalog.pg_get_serial_sequence(CAST(CAST(pg_catalog.pg_attribute.attrelid AS REGCLASS) AS TEXT), pg_catalog.pg_attribute.attname) AS REGCLASS) AS OID)) AS identity_options 
E               FROM pg_catalog.pg_class LEFT OUTER JOIN pg_catalog.pg_attribute ON pg_catalog.pg_class.oid = pg_catalog.pg_attribute.attrelid AND pg_catalog.pg_attribute.attnum > %(attnum_1)s AND NOT pg_catalog.pg_attribute.attisdropped LEFT OUTER JOIN pg_catalog.pg_description ON pg_catalog.pg_description.objoid = pg_catalog.pg_attribute.attrelid AND pg_catalog.pg_description.objsubid = pg_catalog.pg_attribute.attnum JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace 
E               WHERE pg_catalog.pg_class.relkind = ANY (ARRAY[%(param_1)s, %(param_2)s, %(param_3)s, %(param_4)s, %(param_5)s]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != %(nspname_1)s AND pg_catalog.pg_class.relname IN (%(filter_names_1)s) ORDER BY pg_catalog.pg_class.relname, pg_catalog.pg_attribute.attnum]
E               [parameters: {'json_build_object_2': 'always', 'attidentity_1': 'a', 'json_build_object_3': 'start', 'json_build_object_4': 'increment', 'json_build_object_5': 'minvalue', 'json_build_object_6': 'maxvalue', 'json_build_object_7': 'cache', 'json_build_object_8': 'cycle', 'attidentity_2': '', 'attnum_1': 0, 'param_1': 'r', 'param_2': 'p', 'param_3': 'f', 'param_4': 'v', 'param_5': 'm', 'nspname_1': 'pg_catalog', 'filter_names_1': 'my_data'}]
E               (Background on this error at: https://sqlalche.me/e/20/2j85)

Steps to reproduce:

    # Set up a GlareDB Postgres server (we do this in our pytest suite)
    curr = glaredb_connection.cursor()

    curr.execute("create table my_data (amount int)")
    curr.execute(
        "INSERT INTO my_data (amount) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)"
    )
    curr.execute("select count(*) from my_data;")
    res = curr.fetchone()

    context = gx.get_context()  # gets a great expectations project context
    gx_data_source = context.sources.add_postgres(
        name="glaredb",
        connection_string="postgresql://test:test@localhost:5433/db",
    )

    gx_data_asset = gx_data_source.add_table_asset("my_data")

    batch_request = gx_data_asset.build_batch_request()

    batch_list = gx_data_asset.get_batch_list_from_batch_request(batch_request=batch_request)

    validator = context.get_validator(batch_list=batch_list)

    print(validator.expect_column_values_to_not_be_null("amount")) # This triggers the error
universalmind303 commented 1 month ago

So this query dives really deep into the pg catalog, and has a ton of tables and exprs that we currently don't support

Tables

Exprs

universalmind303 commented 1 month ago

marking this as on hold until we have a strategy for how to handle json_build_object.