ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
4.3k stars 537 forks source link

bug: examples broken on mssql backend #9095

Open chloeh13q opened 2 weeks ago

chloeh13q commented 2 weeks ago

What happened?

Examples are broken on mssql backend.

Python 3.11.8 | packaged by conda-forge | (main, Feb 16 2024, 20:49:36) [Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ibis
>>> con = ibis.mssql.connect(user="sa", host="localhost", password="1bis_Testing!", driver="ODBC Driver 18 for SQL Server", TrustServerCertificate="yes", database="ibis_testing")
>>> ibis.set_backend(con)
>>> ibis.examples.CO2.fetch()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/chloe/Desktop/ibis/ibis/examples/__init__.py", line 85, in fetch
    return backend.create_table(table_name, obj, temp=True, overwrite=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chloe/Desktop/ibis/ibis/backends/mssql/__init__.py", line 503, in create_table
    with self._safe_raw_sql(create_stmt) as cur:
  File "/opt/anaconda3/envs/ibis-dev-arm64/lib/python3.11/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/Users/chloe/Desktop/ibis/ibis/backends/mssql/__init__.py", line 269, in _safe_raw_sql
    cur.execute(query, *args, **kwargs)
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Incorrect syntax near 'ibis_mssql_table_4ksb7p5z7vfqffu77cve3a3kbq'. (102) (SQLExecDirectW)")

Seems to be broken on all the examples.

What version of ibis are you using?

main

What backend(s) are you using, if any?

mssql

Relevant log output

No response

Code of Conduct

gforsyth commented 2 weeks ago

Hmm, yeah, something funky is definitely happening here. The create table SQL we're generating (this is for the starwars example): is

'CREATE TABLE #[ibis_mssql_table_sf22odm3xvcjxhkiuk5pk6o4py] ([name] VARCHAR(max), [height] BIGINT, [mass] FLOAT, [hair_color] VARCHAR(max), [skin_color] VARCHAR(max), [eye_color] VARCHAR(max), [birth_year] FLOAT, [sex] VARCHAR(max), [gender] VARCHAR(max), [homeworld] VARCHAR(max), [species] VARCHAR(max), [films] VARCHAR(max), [vehicles] VARCHAR(max), [starships] VARCHAR(max))'

Definitely don't need or want the #[] around the table name

gforsyth commented 2 weeks ago

hmm, wait, I guess that's the mssql syntax for temporary tables? digging in a bit more

gforsyth commented 2 weeks ago

I think this is a sqlglot bug, although we may be constructing the expression incorrectly.

[nav] In [14]: query = """CREATE TABLE #crunch_no_brackets ([name] VARCHAR(max))"""

[ins] In [15]: with con.raw_sql(query) as c:
          ...:     pass

[ins] In [16]: query = """CREATE TABLE [brackets_no_crunch] ([name] VARCHAR(max))"""

[ins] In [17]: with con.raw_sql(query) as c:
          ...:     pass

[ins] In [18]: query = """CREATE TABLE #[brackets_with_crunch] ([name] VARCHAR(max))"""

[ins] In [19]: with con.raw_sql(query) as c:
          ...:     pass
---------------------------------------------------------------------------
ProgrammingError                          Traceback (most recent call last)
Cell In[19], line 1
----> 1 with con.raw_sql(query) as c:
      2     pass

File ~/github.com/ibis-project/ibis/ibis/backends/mssql/__init__.py:279, in Backend.raw_sql(self, query, **kwargs)
    276 cursor = con.cursor()
    278 try:
--> 279     cursor.execute(query, **kwargs)
    280 except Exception:
    281     con.rollback()

ProgrammingError: ('42000', "[42000] [FreeTDS][SQL Server]Incorrect syntax near 'brackets_with_crunch'. (102) (SQLExecDirectW)")
gforsyth commented 2 weeks ago

I think we're doing something wrong:

[ins] In [25]: exp = sg.parse_one("CREATE TEMPORARY TABLE temptest (name VARCHAR);",
          ...: dialect="duckdb")

[ins] In [26]: exp
Out[26]: 
Create(
  this=Schema(
    this=Table(
      this=Identifier(this=temptest, quoted=False)),
    expressions=[
      ColumnDef(
        this=Identifier(this=name, quoted=False),
        kind=DataType(this=Type.VARCHAR, nested=False))]),
  kind=TABLE,
  properties=Properties(
    expressions=[
      TemporaryProperty()]))

[ins] In [27]: exp.sql(dialect="mssql")
Out[27]: 'CREATE TABLE #temptest (name VARCHAR)'
gforsyth commented 2 weeks ago

Ok, I think that there are some edge-cases that we are definitely not handling correctly.

That said, there is also a sqlglot bug:

[ins] In [1]: import sqlglot as sg

[ins] In [2]: exp = sg.parse_one(
         ...:     "CREATE TEMPORARY TABLE 'temptest' (name VARCHAR);", dialect="duckd
         ...: b"
         ...: )

[ins] In [3]: exp
Out[3]: 
Create(
  this=Schema(
    this=Table(
      this=Identifier(this=temptest, quoted=True)),
    expressions=[
      ColumnDef(
        this=Identifier(this=name, quoted=False),
        kind=DataType(this=Type.VARCHAR, nested=False))]),
  kind=TABLE,
  properties=Properties(
    expressions=[
      TemporaryProperty()]))

[ins] In [4]: exp.sql(dialect="tsql")
Out[4]: 'CREATE TABLE #[temptest] (name VARCHAR)'

That last should be CREATE TABLE [#temptest] (name VARCHAR)

georgesittas commented 2 weeks ago

FYI the temp table fix has been deployed with 23.13.0.