snowflakedb / snowflake-sqlalchemy

Snowflake SQLAlchemy
https://pypi.python.org/pypi/snowflake-sqlalchemy/
Apache License 2.0
231 stars 151 forks source link

SNOW-956272: SnowflakeDialect doesn't properly process schema_translate_map for tables that include quotes in their schemas #459

Open peterfoley opened 11 months ago

peterfoley commented 11 months ago

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

Python 3.9.17 (main, Jul 5 2023, 16:17:03) [Clang 14.0.6 ]

  1. What operating system and processor architecture are you using?

macOS-10.16-x86_64-i386-64bit

  1. What are the component versions in the environment (pip freeze)?

snowflake-connector-python==3.2.1 snowflake-sqlalchemy==1.4.6 SQLAlchemy==1.4.46

  1. What did you do?

Made this example that shows schema_translate_map failing to translate quoted schemas when dialect = SnowflakeDialect():

import sqlalchemy as sa
from snowflake.sqlalchemy.snowdialect import SnowflakeDialect
simple_schema = 'simple_schema'
schema_with_quotes = '"schema_with_quotes"'
tables = [
    sa.Table(name, sa.MetaData(), sa.Column("column1", sa.String()), schema=schema, quote_schema=False, quote=False)
    for name, schema in [("simple_table", simple_schema), ("quoted_table", schema_with_quotes)]
]
dialects = [None, SnowflakeDialect()]

schema_translate_map = {
    simple_schema: "simple_translated",
    schema_with_quotes: "quoted_translated",
    }

def compile(table, dialect):
    query = sa.select(sa.literal_column("1")).select_from(table)
    compiler = query.compile(dialect=dialect, schema_translate_map=schema_translate_map, render_schema_translate=True)
    return str(compiler)

for table in tables:
    for dialect in dialects:
        dialect_name = "SnowflakeDialect" if dialect else "None"
        print(f"---table={table.name} with dialect={dialect_name}---")
        print(compile(table,dialect))
  1. What did you expect to see?

I would have expected all the outputs to be the_translated version, but I got this instead:

---table=simple_table with dialect=None---
SELECT 1 
FROM simple_translated.simple_table
---table=simple_table with dialect=SnowflakeDialect---
SELECT 1 
FROM simple_translated.simple_table
---table=quoted_table with dialect=None---
SELECT 1 
FROM quoted_translated.quoted_table
---table=quoted_table with dialect=SnowflakeDialect---
SELECT 1 
FROM schema_with_quotes.].quoted_table

Note that the failing case does modify the schema, but doesn't fully translate it and tacks on an extra .] which looks suspiciously like the regexes in SnowflakeIdentifierPreparer are interacting poorly with sqlalchemy.sql.compiler.IdentifierPreparer's _with_schema_translate and _render_schema_translate.

  1. Can you set logging to DEBUG and collect the logs?

yes, but there are only two additional lines of output with DEBUG:

2023-10-31 18:19:58,521 - MainThread ssl_wrap_socket.py:44 - inject_into_urllib3() - DEBUG - Injecting ssl_wrap_socket_with_ocsp
2023-10-31 18:19:58,521 - MainThread _auth.py:91 - <module>() - DEBUG - cache directory:  ###HIDDEN###