Closed igor-lobanov-maersk closed 6 months ago
Ok, see that this didn't work for Snowflake. Will take another look at syntax options.
@igor-lobanov-maersk I suspect that what you're seeing in the CI failures for dbt-snowflake is a result of the default quoting rules for dbt-snowflake:
You're changing it from being an unquoted identifier into a quoted one, but dbt-snowflake is configured to treat identifiers as unquoted.
I don't immediately know how to make all the adapters happy here.
The crux seems to be that (by default) dbt-snowflake doesn't want quoted identifiers, but dbt-dremio has count
as a reserved keyword for identifiers.
Some resources to help research and troubleshoot:
Maybe something like this (using quote config):
models:
- name: compare_queries
columns:
- name: count
quote: true
tests:
- dbt_utils.equality:
compare_model: ref('expected_results__compare_relations_without_exclude')
Thank you for getting back to me @dbeatty10, this is a good pointer. I will explore this further. It's a bit difficult for me to follow as I don't have access to a Snowflake environment, but I can try a few more things against the CI build.
Another way this could be addressed is if count
was actually renamed to something less contentious like row_count
. This should work everywhere. However, this would change 'public' interface of this module. Not sure to what extent it's a problem. What do you think?
Another way this could be addressed is if
count
was actually renamed to something less contentious likerow_count
. This should work everywhere. However, this would change 'public' interface of this module. Not sure to what extent it's a problem. What do you think?
This is an insightful idea and you're spot-on on the affect it would have on the public interface.
Since this has been a stable public interface for 4+ years, we'd be hesitant to change it.
We'd probably want to check if even merely changing this from unquoted to quoted would affect the public interface that folks may be using.
First, I'm going to just list out a bunch of ideas (and temporarily withhold judgement if they are "good" or not):
count
) as column namescount
as a column name (similarly to how it is apparently allowed by Snowflake, etc).dremio
snowflake
audit_helper.compare_queries
macro directly as-is
I'd prefer not to change the public interface since it's been stable. Although count
is a reserved keyword, it seems many databases allow it to be unquoted in practice (all the databases tested within this package as well as all the databases listed here):
I'm not guessing the Dremio maintainers would want to change they way they are handling count
, but that would be easiest for everyone else, and it would put them more in line with how other databases are handling it. It would also be a non-breaking change for users of Dremio. But we can't assume or control if that will happen.
I'd also strongly prefer not to add adapter-specific logic within audit_helper related specifically to Dremio. It would create maintenance issues of untested logic, and dbt packages should really be agnostic of details related to adapters.
Creating an adapter-level interface is actually my favorite idea, but it would be a more involved and long-term project than the scope of this PR.
Using dispatch is the most "dbt-onic" approach. It's takes a little bit more explanation for end users to use it, and it is necessary within each dbt project.
After reviewing all of the options and their trade-offs, dispatch is the way to go here. There's two options from the perspective of dbt-dremio maintainers:
Long-term, I'd still be quite interested in an adapter-level interface. i.e. an interface that can identify words that empirically determined to be invalid without quoting and/or an interface that surfaces how the database stores the casing of unquoted columns within its metadata.
This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.
Although we are closing this PR as stale, it can still be reopened to continue development. Just add a comment to notify the maintainers.
Escaping "count" field using double quotes, which I believe is SQL standard.
Description & motivation
I believe this is all that's required to fix #84
Checklist