snowflakedb / snowflake-sqlalchemy

Snowflake SQLAlchemy
https://pypi.python.org/pypi/snowflake-sqlalchemy/
Apache License 2.0
235 stars 152 forks source link

SNOW-1478655: `snowflake_clusterby` produces incorrect quoting when using expressions #505

Closed ravirahman closed 3 days ago

ravirahman commented 5 months ago

snowflake_clusterby produces incorrect quoting when using expressions

  1. What version of Python are you using?

Python 3.11.9 (main, Apr 2 2024, 08:25:04) [Clang 15.0.0 (clang-1500.3.9.4)]

  1. What operating system and processor architecture are you using?

macOS-14.5-arm64-arm-64bit

  1. What are the component versions in the environment (pip freeze)?

(trimming to just the snowflake packages -- the others are not relevant for this issue)

snowflake-connector-python==3.7.0
snowflake-sqlalchemy==1.5.0
  1. What did you do?
from sqlalchemy import Table, Column, MetaData
from sqlalchemy.schema import CreateTable
from snowflake.sqlalchemy import (
    STRING,
    TIMESTAMP_NTZ,
    dialect as snowflake_dialect
)

metadata = MetaData(bind=None, schema='database.schema', quote_schema=False)

test_table = Table('table', metadata,
                   Column('id', STRING, primary_key=True),
                   Column('created', TIMESTAMP_NTZ),
                   quote=False,
                   snowflake_clusterby=['id', 'to_date(created)']
                   )

sql_stmt = CreateTable(test_table).compile(dialect=snowflake_dialect())
print(sql_stmt)

Produced


CREATE TABLE database.schema.table (
    id VARCHAR NOT NULL, 
    created TIMESTAMP_NTZ, 
    PRIMARY KEY (id)
) CLUSTER BY (id, "to_date(created)")
  1. What did you expect to see?
CREATE TABLE database.schema.table (
    id VARCHAR NOT NULL,
    created TIMESTAMP WITHOUT TIME ZONE,
    PRIMARY KEY (id)
) CLUSTER BY (id, to_date("CREATED"))

(note where the quotes are and the capitalization transformation)

  1. Can you set logging to DEBUG and collect the logs?

There were no logs emitted

sfc-gh-dszmolka commented 5 months ago

hi - thanks for raising this. let me check and get back to you.

sfc-gh-dszmolka commented 5 months ago

confirmed with the dev team - right now, all input to snowflake_clusterby is indeed treated as strings, as you observed. we'll consider this ticket for future enhancement plans to ultimately generate a SQL which would be (rightfully) expected in the first place, even when using expression as cluster key.

sfc-gh-dszmolka commented 3 days ago

per release notes, I think this is resolved by version 1.7.0 in November 2024 release cycle