dbt-labs / dbt-snowflake

dbt-snowflake contains all of the code enabling dbt to work with Snowflake
https://getdbt.com
Apache License 2.0
297 stars 177 forks source link

Fix #1102: Use `show ... starts with` instead of `show ... like` in `_show_object_metadata` #1107

Closed aranke closed 4 months ago

aranke commented 4 months ago

Resolves #1102

Problem

We want to use show...starts with instead of show...like in _show_object_metadata for faster query times, as per Snowflake guidance.

Solution

  1. Modify the snowflake__show_object_metadata macro
  2. Since starts with is case-sensitive, convert cases within the relation before sending off to macro; this is done via the new as_case_sensitive method.
  3. Write a test for as_case_sensitive to ensure expected behavior.
  4. Profit! These metadata queries should now take 100ms instead of 3s (see screenshot).

snowflake_query_runtimes

Closed Questions

Let’s say I have 3 tables:

my_table
my_table_testing
my_table_backup

❓ Am I guaranteed that show objects in schema my_schema starts with 'my_table' limit 1 will always return metadata for my_table and never for the other two? Since technically all three satisfy the starts_with 'my_table' condition.

Answer from Snowflake: Yes, in this case, as you have specified limit 1, the output will be my_table.

Checklist

mikealfare commented 4 months ago

Answer from Snowflake: Yes, in this case, as you have specified limit 1, the output will be my_table.

To be extra clear, this was Snowflake's response because my_table is an exact match, and not because it was the first in the example list, correct?

aranke commented 4 months ago

To be extra clear, this was Snowflake's response because my_table is an exact match, and not because it was the first in the example list, correct?

I'm not sure, and I'm not sure if it matters since the objects are always returned in alphabetical order.

show objects in ANALYTICS_DEV.DBT_KARANKE starts with 'STG_HUBSPOT__CONTACT';
created_on name database_name schema_name kind comment cluster_by rows bytes owner retention_time owner_role_type budget is_dynamic
2024-07-08 08:07:40.690 -0700 STG_HUBSPOT__CONTACT ANALYTICS_DEV DBT_KARANKE TABLE 639962 43180544 TRANSFORMER 1 ROLE N
2024-07-09 15:17:12.486 -0700 STG_HUBSPOT__CONTACT_FORM_SUBMISSIONS ANALYTICS_DEV DBT_KARANKE VIEW 0 0 TRANSFORMER 1 ROLE N
2024-07-08 07:52:06.624 -0700 STG_HUBSPOT__CONTACT_TMP ANALYTICS_DEV DBT_KARANKE VIEW 0 0 TRANSFORMER 1 ROLE N
show objects in ANALYTICS_DEV.DBT_KARANKE starts with 'STG_HUBSPOT__CONTACT' LIMIT 1;
created_on name database_name schema_name kind comment cluster_by rows bytes owner retention_time owner_role_type budget is_dynamic
2024-07-08 08:07:40.690 -0700 STG_HUBSPOT__CONTACT ANALYTICS_DEV DBT_KARANKE TABLE 639962 43180544 TRANSFORMER 1 ROLE N
mikealfare commented 4 months ago

I'm not sure if it matters since the objects are always returned in alphabetical order.

Agreed, and this is the confirmation I was looking for.