Open jeremyyeo opened 3 months ago
^ Ah ok, so here is why Case (B) works: we try both the uppercased and the lowercased column name against the column dict, so a key that is fully lowercased or UPPERCASED in the yaml is sure to match.
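To make that concrete, here is a rough Python model of the matching as described in this comment. It is a sketch, not the adapter's actual code: `match_column` is a made-up name, and the order of the attempts (upper, lower, then exact) is an assumption.

```python
def match_column(column_name, column_dict):
    """Return the key in column_dict that matches column_name, or None.

    Simplified model (assumed order): try the uppercased name, then the
    lowercased name, then the name exactly as the describe returned it.
    """
    for candidate in (column_name.upper(), column_name.lower(), column_name):
        if candidate in column_dict:
            return candidate
    return None


# Snowflake's describe returns MIXED_CASE for an unquoted mixed_Case column.
described = "MIXED_CASE"
doc = {"description": "Col description"}

print(match_column(described, {"MIXED_CASE": doc}))  # MIXED_CASE
print(match_column(described, {"mixed_case": doc}))  # mixed_case
print(match_column(described, {"mixed_Case": doc}))  # None
```

Only the mixed-case yaml key falls through all three attempts, which is the failing scenario from the issue.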
In the exact scenario above (mixed_Case in both model and yaml, and not quoted), we can override the built-in macro:
-- macros/get_column_comment_sql.sql
{% macro get_column_comment_sql(column_name, column_dict) -%}
{#/* Create new uppercased key column dict. */#}
{% set fixed_case_column_dict = {} %}
{% for k, v in column_dict.items() %}
{% do fixed_case_column_dict.update({k.upper(): v}) %}
{% endfor %}
{% if (column_name in fixed_case_column_dict) %}
{% set matched_column = column_name -%}
{% else %}
{% set matched_column = None -%}
{% endif %}
{% if matched_column -%}
{{ adapter.quote(column_name) }} COMMENT $${{ fixed_case_column_dict[matched_column]['description'] | replace('$', '[$]') }}$$
{%- else -%}
{{ adapter.quote(column_name) }} COMMENT $$$$
{%- endif -%}
{% endmacro %}
Then:
$ dbt --debug run
01:31:36 On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
create or replace transient table development_jyeo.dbt_jyeo.foo
as
(
select 1 as mixed_Case
);
01:31:36 Opening a new connection, currently in state closed
01:31:37 SQL status: SUCCESS 1 in 2.0 seconds
01:31:38 Using snowflake connection "model.my_dbt_project.foo"
01:31:38 On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
describe table development_jyeo.dbt_jyeo.foo
01:31:38 SQL status: SUCCESS 1 in 0.0 seconds
01:31:38 Using snowflake connection "model.my_dbt_project.foo"
01:31:38 On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.foo"} */
alter table development_jyeo.dbt_jyeo.foo alter
"MIXED_CASE" COMMENT $$Col description$$;
01:31:38 SQL status: SUCCESS 1 in 0.0 seconds
01:31:38 On model.my_dbt_project.foo: Close
01:31:38 Sending event: {'category': 'dbt', 'action': 'run_model', 'label': 'e30c8d1b-5b26-4c3e-bc87-82e19958136a', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x13938f0d0>]}
01:31:38 1 of 1 OK created sql table model dbt_jyeo.foo ................................. [SUCCESS 1 in 2.78s]
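For anyone weighing this workaround, here is a rough Python model of what the override's lookup does. It only mirrors the logic of the macro above; `override_comment` is a hypothetical name, and the quoted-column case is included to show where the workaround stops helping.

```python
def override_comment(column_name, column_dict):
    """Model of the override: uppercase every yaml key, then look up the
    described column name as-is. Returns the comment text, or "" on no match.
    """
    fixed_case = {key.upper(): value for key, value in column_dict.items()}
    if column_name in fixed_case:
        return fixed_case[column_name]["description"]
    return ""


yaml_columns = {"mixed_Case": {"description": "Col description"}}

# Unquoted column: the describe returns MIXED_CASE, which now matches.
print(override_comment("MIXED_CASE", yaml_columns))  # Col description

# Quoted column: the describe returns mixed_Case, which no longer matches,
# because every key in the rebuilt dict is uppercased.
print(repr(override_comment("mixed_Case", yaml_columns)))  # ''
```

So the override fits the unquoted scenario in this issue, but a project that also has quoted mixed-case columns would lose their comments with it.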
Is this a new bug in dbt-snowflake?
Current Behavior
If users have the columns.name key in mixed case, dbt does not correctly match the uppercased column name that comes from a describe table ... with what is in the schema.yml file. Therefore, column descriptions aren't added to the column.
Expected Behavior
During the "matching process", we should always uppercase the columns.name key, no matter its case, if the user doesn't specify the quote config on the column (https://docs.getdbt.com/reference/resource-properties/quote).
Steps To Reproduce
Project setup:
Run
^ What happens here:
dbt creates the table with select 1 as mixed_Case, and Snowflake stores this column as an uppercased MIXED_CASE (this is the default behaviour of Snowflake for unquoted identifiers).
dbt runs a describe table and sees that the column MIXED_CASE exists.
MIXED_CASE that comes from the describe does not match what is in the schema.yml, which has it as mixed_Case.
"MIXED_CASE" != "mixed_Case", so we don't find a matching column description.
What can users do about this
(A) They can quote the column name in the model
Here Snowflake has created a literal mixed-case column name, mixed_Case. The describe returns this exactly and we get a match in the schema.yml file ("mixed_Case" = "mixed_Case"), so we can apply the column description appropriately.
Cons
Their table will have a literal mixed case column name which they may not want.
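The difference between the failing case and option (A) comes down to how Snowflake stores the identifier. A minimal Python sketch of that rule, assuming a simplified model that ignores escaped double quotes inside quoted identifiers (`stored_name` is a made-up helper, not a Snowflake or dbt function):

```python
def stored_name(identifier):
    """Model how Snowflake stores a column identifier (simplified)."""
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        return identifier[1:-1]  # quoted: kept exactly as written
    return identifier.upper()    # unquoted: folded to uppercase


yaml_key = "mixed_Case"

# Unquoted in the model: stored as MIXED_CASE, so the yaml key does not match.
print(stored_name("mixed_Case"), stored_name("mixed_Case") == yaml_key)
# MIXED_CASE False

# Quoted in the model (option A): stored literally, so the yaml key matches.
print(stored_name('"mixed_Case"'), stored_name('"mixed_Case"') == yaml_key)
# mixed_Case True
```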
(B) They can uppercase / lowercase the column name in the schema yaml
For some reason, if the columns.name key is either UPPERCASED or lowercased, dbt has some internal methods to match this correctly.
Cons
Users have to use a different column casing style between their model code and their schema yml files.
Relevant log output
No response
Environment
Additional Context
If users don't specify the quote column config (https://docs.getdbt.com/reference/resource-properties/quote), I propose we always uppercase the columns.name key like (B) above, even if the user specifies a mixed-case column name in the schema yml file.
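As a sketch of the proposal, the rule could look roughly like this in Python. This is not dbt's implementation: `normalize_key` and `build_lookup` are hypothetical names, and the column dicts only borrow the name, description and quote keys from the schema yml.

```python
def normalize_key(name, quote=False):
    """Proposed rule: keep quoted names literal, uppercase everything else."""
    return name if quote else name.upper()


def build_lookup(columns):
    """Map normalized column names to their descriptions."""
    return {
        normalize_key(col["name"], col.get("quote", False)): col.get("description", "")
        for col in columns
    }


columns = [
    {"name": "mixed_Case", "description": "Col description"},
    {"name": "Literal_Case", "quote": True, "description": "Quoted col"},
]
lookup = build_lookup(columns)

# The uppercased name returned by the describe now finds its description.
print(lookup.get("MIXED_CASE"))    # Col description
# A column marked quote: true keeps its literal casing.
print(lookup.get("Literal_Case"))  # Quoted col
```

With this rule the yaml can keep the same casing as the model code, and only columns that set quote are matched case-sensitively.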