ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
5.3k stars 596 forks source link

bug: postgres ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near for tables with XML Column Type #8610

Closed sarath-mec closed 3 months ago

sarath-mec commented 8 months ago

What happened?

I had issues with instantiating couple of tables in postgres Framework

SQL Script to Create Table
CREATE TABLE sales.store1 (
    businessentityid int4 NOT NULL,
    "name" public."Name" NOT NULL,
    salespersonid int4 NULL,
    demographics xml NULL,
    rowguid uuid DEFAULT uuid_generate_v1() NOT NULL,
    modifieddate timestamp DEFAULT now() NOT NULL,
    CONSTRAINT "PK_Store1_BusinessEntityID" PRIMARY KEY (businessentityid)
);

INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(292, 'Next-Door Bike Store', 279, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>800000</AnnualSales><AnnualRevenue>80000</AnnualRevenue><BankName>United Security</BankName><BusinessType>BM</BusinessType><YearOpened>1996</YearOpened><Specialty>Mountain</Specialty><SquareFeet>21000</SquareFeet><Brands>2</Brands><Internet>ISDN</Internet><NumberEmployees>13</NumberEmployees></StoreSurvey>', 'a22517e3-848d-4ebe-b9d9-7437f3432304'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(294, 'Professional Sales and Service', 276, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>800000</AnnualSales><AnnualRevenue>80000</AnnualRevenue><BankName>International Bank</BankName><BusinessType>BM</BusinessType><YearOpened>1991</YearOpened><Specialty>Touring</Specialty><SquareFeet>18000</SquareFeet><Brands>4+</Brands><Internet>T1</Internet><NumberEmployees>14</NumberEmployees></StoreSurvey>', 'b50ca50b-c601-4a13-b07e-2c63862d71b4'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(296, 'Riders Company', 277, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>800000</AnnualSales><AnnualRevenue>80000</AnnualRevenue><BankName>Primary Bank &amp; Reserve</BankName><BusinessType>BM</BusinessType><YearOpened>1999</YearOpened><Specialty>Road</Specialty><SquareFeet>21000</SquareFeet><Brands>2</Brands><Internet>DSL</Internet><NumberEmployees>15</NumberEmployees></StoreSurvey>', '337c3688-1339-4e1a-a08a-b54b23566e49'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(298, 'The Bike Mechanics', 275, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>800000</AnnualSales><AnnualRevenue>80000</AnnualRevenue><BankName>International Security</BankName><BusinessType>BM</BusinessType><YearOpened>1994</YearOpened><Specialty>Mountain</Specialty><SquareFeet>18000</SquareFeet><Brands>2</Brands><Internet>DSL</Internet><NumberEmployees>16</NumberEmployees></StoreSurvey>', '7894f278-f0c8-4d16-bd75-213fdbf13023'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(300, 'Nationwide Supply', 286, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>800000</AnnualSales><AnnualRevenue>80000</AnnualRevenue><BankName>Guardian Bank</BankName><BusinessType>BM</BusinessType><YearOpened>1987</YearOpened><Specialty>Touring</Specialty><SquareFeet>21000</SquareFeet><Brands>4+</Brands><Internet>DSL</Internet><NumberEmployees>17</NumberEmployees></StoreSurvey>', 'c3fc9705-a8c4-4f3a-9550-eb2fa4b7b64d'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(302, 'Area Bike Accessories', 281, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>300000</AnnualSales><AnnualRevenue>30000</AnnualRevenue><BankName>International Bank</BankName><BusinessType>BM</BusinessType><YearOpened>1982</YearOpened><Specialty>Road</Specialty><SquareFeet>9000</SquareFeet><Brands>AW</Brands><Internet>T2</Internet><NumberEmployees>8</NumberEmployees></StoreSurvey>', '368be6dd-30e5-49bb-9a86-71fd49c58f4e'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(304, 'Bicycle Accessories and Kits', 283, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>300000</AnnualSales><AnnualRevenue>30000</AnnualRevenue><BankName>Primary Bank &amp; Reserve</BankName><BusinessType>BM</BusinessType><YearOpened>1990</YearOpened><Specialty>Mountain</Specialty><SquareFeet>7000</SquareFeet><Brands>AW</Brands><Internet>T1</Internet><NumberEmployees>9</NumberEmployees></StoreSurvey>', '35f40636-5105-49d5-869e-27e231189150'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(306, 'Clamps & Brackets Co.', 275, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>800000</AnnualSales><AnnualRevenue>80000</AnnualRevenue><BankName>International Security</BankName><BusinessType>BM</BusinessType><YearOpened>1985</YearOpened><Specialty>Mountain</Specialty><SquareFeet>17000</SquareFeet><Brands>4+</Brands><Internet>DSL</Internet><NumberEmployees>10</NumberEmployees></StoreSurvey>', '64d06bfc-d060-405c-8c60-c067fe7c67df'::uuid, '2014-09-12 11:15:07.497');
INSERT INTO sales.store1
(businessentityid, "name", salespersonid, demographics, rowguid, modifieddate)
VALUES(308, 'Valley Bicycle Specialists', 277, '<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"><AnnualSales>3000000</AnnualSales><AnnualRevenue>300000</AnnualRevenue><BankName>Primary Bank &amp; Reserve</BankName><BusinessType>OS</BusinessType><YearOpened>1979</YearOpened><Specialty>Mountain</Specialty><SquareFeet>72000</SquareFeet><Brands>4+</Brands><Internet>DSL</Internet><NumberEmployees>66</NumberEmployees></StoreSurvey>', '59386b0c-652e-4668-b44b-4e1711793330'::uuid, '2014-09-12 11:15:07.497');
Code to Produce Error
src_con = ibis.postgres.connect(
    host=src_host,
    database=src_database,
    user=src_user,
    password=src_password
)
table = src_con.table(name='store1', schema='sales', database='<database>')
Some Other Table DDLs which had issue
CREATE TABLE production.productmodel (
    productmodelid serial4 NOT NULL,
    "name" public."Name" NOT NULL,
    catalogdescription xml NULL,
    instructions xml NULL,
    rowguid uuid DEFAULT uuid_generate_v1() NOT NULL,
    modifieddate timestamp DEFAULT now() NOT NULL,
    CONSTRAINT "PK_ProductModel_ProductModelID" PRIMARY KEY (productmodelid)
);

CREATE TABLE production.illustration (
    illustrationid serial4 NOT NULL,
    diagram xml NULL,
    modifieddate timestamp DEFAULT now() NOT NULL,
    CONSTRAINT "PK_Illustration_IllustrationID" PRIMARY KEY (illustrationid)
);

CREATE TABLE person.person (
    businessentityid int4 NOT NULL,
    persontype bpchar(2) NOT NULL,
    "namestyle" public."NameStyle" DEFAULT false NOT NULL,
    title varchar(8) NULL,
    firstname public."Name" NOT NULL,
    middlename public."Name" NULL,
    lastname public."Name" NOT NULL,
    suffix varchar(10) NULL,
    emailpromotion int4 DEFAULT 0 NOT NULL,
    additionalcontactinfo xml NULL,
    demographics xml NULL,
    rowguid uuid DEFAULT uuid_generate_v1() NOT NULL,
    modifieddate timestamp DEFAULT now() NOT NULL,
    CONSTRAINT "CK_Person_EmailPromotion" CHECK (((emailpromotion >= 0) AND (emailpromotion <= 2))),
    CONSTRAINT "CK_Person_PersonType" CHECK (((persontype IS NULL) OR (upper((persontype)::text) = ANY (ARRAY['SC'::text, 'VC'::text, 'IN'::text, 'EM'::text, 'SP'::text, 'GC'::text])))),
    CONSTRAINT "PK_Person_BusinessEntityID" PRIMARY KEY (businessentityid)
);
Possible Reason is Having Columns with XML Type

Also Noticed Issues with JSONB Datatype

CREATE TABLE bookings.airports_data1 (
    airport_code bpchar(3) NOT NULL,
    airport_name jsonb NOT NULL,
    city jsonb NOT NULL,
    coordinates point NOT NULL,
    timezone text NOT NULL,
    CONSTRAINT airports_data1_pkey PRIMARY KEY (airport_code)
);
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('YKS', '{"en": "Yakutsk Airport", "ru": "Якутск"}'::jsonb, '{"en": "Yakutsk", "ru": "Якутск"}'::jsonb, '(129.77099609375,62.093299865722656)'::point, 'Asia/Yakutsk');
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('MJZ', '{"en": "Mirny Airport", "ru": "Мирный"}'::jsonb, '{"en": "Mirnyj", "ru": "Мирный"}'::jsonb, '(114.03900146484375,62.534698486328125)'::point, 'Asia/Yakutsk');
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('KHV', '{"en": "Khabarovsk-Novy Airport", "ru": "Хабаровск-Новый"}'::jsonb, '{"en": "Khabarovsk", "ru": "Хабаровск"}'::jsonb, '(135.18800354004,48.52799987793)'::point, 'Asia/Vladivostok');
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('PKC', '{"en": "Yelizovo Airport", "ru": "Елизово"}'::jsonb, '{"en": "Petropavlovsk", "ru": "Петропавловск-Камчатский"}'::jsonb, '(158.45399475097656,53.16790008544922)'::point, 'Asia/Kamchatka');
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('UUS', '{"en": "Yuzhno-Sakhalinsk Airport", "ru": "Хомутово"}'::jsonb, '{"en": "Yuzhno-Sakhalinsk", "ru": "Южно-Сахалинск"}'::jsonb, '(142.71800231933594,46.88869857788086)'::point, 'Asia/Sakhalin');
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('VVO', '{"en": "Vladivostok International Airport", "ru": "Владивосток"}'::jsonb, '{"en": "Vladivostok", "ru": "Владивосток"}'::jsonb, '(132.1479949951172,43.39899826049805)'::point, 'Asia/Vladivostok');
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('LED', '{"en": "Pulkovo Airport", "ru": "Пулково"}'::jsonb, '{"en": "St. Petersburg", "ru": "Санкт-Петербург"}'::jsonb, '(30.262500762939453,59.80030059814453)'::point, 'Europe/Moscow');
INSERT INTO bookings.airports_data1
(airport_code, airport_name, city, coordinates, timezone)
VALUES('KGD', '{"en": "Khrabrovo Airport", "ru": "Храброво"}'::jsonb, '{"en": "Kaliningrad", "ru": "Калининград"}'::jsonb, '(20.592599868774414,54.88999938964844)'::point, 'Europe/Kaliningrad');

What version of ibis are you using?

8.0.0

What backend(s) are you using, if any?

postgres

Relevant log output

---------------------------------------------------------------------------
SyntaxError                               Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py in _exec_single_context(self, dialect, context, statement, parameters)
   1969                 if not evt_handled:
-> 1970                     self.dialect.do_execute(
   1971                         cursor, str_statement, effective_parameters, context

11 frames
/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/default.py in do_execute(self, cursor, statement, parameters, context)
    923     def do_execute(self, cursor, statement, parameters, context=None):
--> 924         cursor.execute(statement, parameters)
    925 

SyntaxError: syntax error at or near "sales"
LINE 1: ...s_postgres_metadata_23zhaspn5rdnpkatutxdtqleni AS sales.stor...
                                                             ^

The above exception was the direct cause of the following exception:

ProgrammingError                          Traceback (most recent call last)
<ipython-input-6-e99f48676c57> in <cell line: 1>()
----> 1 table = src_con.table(name='store', schema='sales', database='Adventureworks')

/usr/local/lib/python3.10/dist-packages/ibis/backends/base/sql/alchemy/__init__.py in table(self, name, database, schema)
    686         namespace = ops.Namespace(schema=schema, database=database)
    687 
--> 688         sqla_table = self._get_sqla_table(name, namespace=namespace)
    689 
    690         schema = self._schema_from_sqla_table(

/usr/local/lib/python3.10/dist-packages/ibis/backends/base/sql/alchemy/__init__.py in _get_sqla_table(self, name, namespace, autoload, **_)
    531             if not nulltype_cols:
    532                 return table
--> 533             return self._handle_failed_column_type_inference(table, nulltype_cols)
    534 
    535     # TODO(kszucs): remove the schema parameter

/usr/local/lib/python3.10/dist-packages/ibis/backends/base/sql/alchemy/__init__.py in _handle_failed_column_type_inference(self, table, nulltype_cols)
    584         )
    585 
--> 586         for colname, dtype in self._metadata(quoted_name):
    587             if colname in nulltype_cols:
    588                 # replace null types discovered by sqlalchemy with non null

/usr/local/lib/python3.10/dist-packages/ibis/backends/postgres/__init__.py in _metadata(self, query)
    275         with self.begin() as con:
    276             print(query)
--> 277             con.exec_driver_sql(f"CREATE TEMPORARY VIEW {name} AS {query}")
    278             try:
    279                 yield from (

/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py in exec_driver_sql(self, statement, parameters, execution_options)
   1780 
   1781         dialect = self.dialect
-> 1782         ret = self._execute_context(
   1783             dialect,
   1784             dialect.execution_ctx_cls._init_statement,

/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw)
   1847             return self._exec_insertmany_context(dialect, context)
   1848         else:
-> 1849             return self._exec_single_context(
   1850                 dialect, context, statement, parameters
   1851             )

/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py in _exec_single_context(self, dialect, context, statement, parameters)
   1987 
   1988         except BaseException as e:
-> 1989             self._handle_dbapi_exception(
   1990                 e, str_statement, effective_parameters, cursor, context
   1991             )

/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py in _handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec)
   2354             elif should_wrap:
   2355                 assert sqlalchemy_exception is not None
-> 2356                 raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
   2357             else:
   2358                 assert exc_info[1] is not None

/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py in _exec_single_context(self, dialect, context, statement, parameters)
   1968                             break
   1969                 if not evt_handled:
-> 1970                     self.dialect.do_execute(
   1971                         cursor, str_statement, effective_parameters, context
   1972                     )

/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/default.py in do_execute(self, cursor, statement, parameters, context)
    922 
    923     def do_execute(self, cursor, statement, parameters, context=None):
--> 924         cursor.execute(statement, parameters)
    925 
    926     def do_execute_no_params(self, cursor, statement, context=None):

ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "sales"
LINE 1: ...s_postgres_metadata_23zhaspn5rdnpkatutxdtqleni AS sales.stor...
                                                             ^

[SQL: CREATE TEMPORARY VIEW ibis_postgres_metadata_23zhaspn5rdnpkatutxdtqleni AS sales.store]
(Background on this error at: https://sqlalche.me/e/20/f405)


### Code of Conduct

- [X] I agree to follow this project's Code of Conduct
cpcloud commented 8 months ago

Thanks for the issue. It looks like there are a number of things to look at here:

  1. XML types
  2. The custom public.Name type
  3. The bogus SQL that's generated in collecting postgres metadata
sarath-mec commented 8 months ago

Thanks @cpcloud for the message. I have reported an issue with JSON & JSONB datatype in Postgres as well (most possible) and can be fixed along the same

jcrist commented 3 months ago

As of #9567 (included in ibis 9.2) we now map all non-natively-supported dtypes to our unknown type. This should let you load a table with any postgres dtype into ibis (but you won't be able to do much on unknown-typed columns without casting them to a supported type). Closing this for now, please feel free to reopen/open a new issue if you find additional issues.