snowflakedb / spark-snowflake

Snowflake Data Source for Apache Spark.
http://www.snowflake.net
Apache License 2.0

default value for option keep_column_case is "on" #570

Open flo-osimard opened 3 months ago

flo-osimard commented 3 months ago

Using Databricks Runtime "15.3 ML (includes Apache Spark 3.5.0, Scala 2.12)" which packages

/databricks/jars/----ws_3_5--third_party--snowflake-jdbc--net.snowflakesnowflake-jdbcshaded---414110472--net.snowflakesnowflake-jdbc3.16.1.jar

We read tables with format("delta") and write them with the Snowflake connector using format("snowflake"). With previous versions of the connector (before Databricks Runtime 15.3 ML), the default behavior was that identifiers were created in upper case in Snowflake, the same as when setting the option:

keep_column_case = "off"

With 15.3 ML, the columns of the tables saved in Snowflake keep their original mixed case and are double-quoted, as if the default were:

keep_column_case = "on"

Example: "AccountId" VARCHAR(16777216),
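To illustrate why this difference matters downstream, here is a rough sketch of how Snowflake resolves identifiers under its default settings: unquoted names are folded to upper case, while double-quoted names are stored exactly as written. The helper name `stored_name` is hypothetical and is not part of the connector or of Snowflake; session parameters that change identifier handling are not modeled.

```python
def stored_name(identifier: str) -> str:
    """Return the name Snowflake would store for an identifier as written in SQL.

    Hypothetical helper for illustration only; assumes default identifier
    resolution (no session parameter overriding the handling of quoted names).
    """
    is_quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if is_quoted:
        # Quoted identifiers keep their case; a doubled quote is an escaped quote.
        return identifier[1:-1].replace('""', '"')
    # Unquoted identifiers are folded to upper case.
    return identifier.upper()


# Column created as "AccountId" (quoted) versus ACCOUNTID (unquoted):
quoted_column = stored_name('"AccountId"')
unquoted_column = stored_name("AccountId")
```

Under these assumptions, a query that refers to the column as plain accountid resolves to ACCOUNTID, so it would match the unquoted column but not the one created as "AccountId".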

By setting the option explicitly to keep_column_case = "off", we recover the previous default behavior. Example: ACCOUNTID VARCHAR(16777216),
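For anyone hitting the same thing, a minimal sketch of the workaround in PySpark could look like the following. The helper names (`with_keep_column_case`, `write_to_snowflake`), the overwrite mode, and the contents of the base options are assumptions for illustration; only the `keep_column_case` option and the format("snowflake") writer come from the report above.

```python
def with_keep_column_case(base_options: dict, value: str = "off") -> dict:
    """Return a copy of the connector options with keep_column_case set explicitly.

    Hypothetical helper: pinning the value avoids relying on whatever default
    the packaged connector version happens to use.
    """
    if value not in ("on", "off"):
        raise ValueError('keep_column_case must be "on" or "off"')
    options = dict(base_options)  # copy, so the caller's dict is not modified
    options["keep_column_case"] = value
    return options


def write_to_snowflake(df, base_options: dict, table: str) -> None:
    """Write a Spark DataFrame to Snowflake with keep_column_case forced to "off".

    Assumes base_options already holds the connection settings (for example
    sfURL, sfUser, sfDatabase, sfSchema, sfWarehouse) and that overwriting
    the target table is acceptable.
    """
    (
        df.write.format("snowflake")
        .options(**with_keep_column_case(base_options, "off"))
        .option("dbtable", table)
        .mode("overwrite")
        .save()
    )
```

Calling `write_to_snowflake(spark.read.format("delta").load(path), sf_options, "ACCOUNTS")` would then mirror the delta-to-snowflake flow described above, where `path`, `sf_options`, and the table name are placeholders.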

From our point of view, the default value of the option is not the one stated in the documentation: it is "on", not "off". So either there is a bug, or the documentation needs to be updated.

Thanks for investigating.

jwilkinson-bread commented 3 months ago

We are also seeing this