dbt-labs / dbt-snowflake

dbt-snowflake contains all of the code enabling dbt to work with Snowflake
https://getdbt.com
Apache License 2.0
296 stars 176 forks source link

[Bug] Setting the enable_iceberg_materializations flag to True on a dbt project that is connected to a Snowflake account with QUOTED_IDENTIFIERS_IGNORE_CASE = true results in error tuple.index(x): x not in tuple #1227

Closed Tonayya closed 1 week ago

Tonayya commented 2 weeks ago

Is this a new bug in dbt-snowflake?

Current Behavior

Currently in dbt Cloud in order to create Iceberg tables, you must implement a behavior flag due to performance impact related to using Iceberg tables. Snowflake does not support is_iceberg on the Show Objects query, which dbt depends on for metadata. To use Iceberg, you need to set the enable_iceberg_materializations flag to True in your dbt_project.yml:

flags:
  enable_iceberg_materializations: True

However, if the Snowflake account that is connected to your dbt Cloud project has the flag QUOTED_IDENTIFIERS_IGNORE_CASE set to True, this results in the following error:

tuple.index(x): x not in tuple

This appears to occur when we run the following query:

select all_objects.*, is_iceberg as "is_iceberg"
      from table(result_scan(last_query_id(-1))) all_objects
      left join INFORMATION_SCHEMA.tables as all_tables
        on all_tables.table_name = all_objects."name"
        and all_tables.table_schema = all_objects."schema_name"
        and all_tables.table_catalog = all_objects."database_name"

Expected Behavior

dbt should be able to handle quoting/casing while this flag is set to True in Snowflake.

Steps To Reproduce

  1. Set QUOTED_IDENTIFIERS_IGNORE_CASE to True in Snowflake:
ALTER ACCOUNT SET QUOTED_IDENTIFIERS_IGNORE_CASE = TRUE;
  1. Set flag enable_iceberg_materializations to True in dbt_project.yml file:
    flags:
    enable_iceberg_materializations: True
  2. Try to run anything in your dbt Cloud project (IDE):
    dbt build --select some_model

From the debug logs:

02:50:27 Running dbt...
02:50:27 running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'fail_fast': 'False', 'version_check': 'True', 'log_path': '/usr/src/dbt-server-shared/working_dir/b580a02e-eff8-4c74-bca7-04bc720476ad', 'profiles_dir': '/usr/src/dbt-server-shared/.dbt', 'debug': 'False', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'empty': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'static_parser': 'True', 'log_format': 'json', 'introspect': 'True', 'target_path': 'None', 'invocation_command': 'dbt -A dbt_worker.app worker --loglevel=DEBUG --concurrency=2 --max-memory-per-child=500000', 'send_anonymous_usage_stats': 'True'}
02:50:27 Discovered Exposures:
[]
02:50:27 Registered adapter: snowflake=1.9.0-post10+78f86674bc726000686a484c4d2f0e2f9d350ed1
02:50:27 checksum: b27b306c093216d1116c3927e0401dd24e00d277a1f26f4c8f3a5342a1df6db2, vars: {}, profile: user, target: , version: 2024.10.28+996c6a8
02:50:27 Observability Metric: msgpack_manifest_bytes=481202.0
02:50:27 Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
02:50:27 Partial parsing enabled, no changes found, skipping parsing
02:50:27 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.my_new_project.example
02:50:27 Observability Metric: path_count=143.0
02:50:27 Observability Metric: parsed_path_count=0.0
02:50:27 Observability Metric: read_files_elapsed=0.012346978299319744
02:50:27 Observability Metric: load_macros_elapsed=0.0
02:50:27 Observability Metric: parse_project_elapsed=0.0
02:50:27 Observability Metric: patch_sources_elapsed=0.0
02:50:27 Observability Metric: process_manifest_elapsed=0.0
02:50:27 Observability Metric: load_all_elapsed=0.029497478157281876
02:50:27 Observability Metric: is_partial_parse_enabled=1.0
02:50:27 Found 2 models, 468 macros
02:50:27 
02:50:27 Concurrency: 6 threads (target='default')
02:50:27 
02:50:27 Acquiring new snowflake connection 'master'
02:50:27 Acquiring new snowflake connection 'list_ANALYTICS'
02:50:27 Using snowflake connection "list_ANALYTICS"
02:50:27 On list_ANALYTICS: /* {"app": "dbt", "dbt_version": "2024.10.28+996c6a8", "profile_name": "user", "target_name": "default", "connection_name": "list_ANALYTICS"} */
show terse schemas in database ANALYTICS
    limit 10000
02:50:27 Opening a new connection, currently in state init
02:50:27 SQL status: SUCCESS 2 in 0.659 seconds
02:50:27 Re-using an available connection from the pool (formerly list_ANALYTICS, now list_ANALYTICS_dev_tonayya)
02:50:27 Using snowflake connection "list_ANALYTICS_dev_tonayya"
02:50:27 On list_ANALYTICS_dev_tonayya: /* {"app": "dbt", "dbt_version": "2024.10.28+996c6a8", "profile_name": "user", "target_name": "default", "connection_name": "list_ANALYTICS_dev_tonayya"} */
show objects in ANALYTICS.dev_tonayya limit 10000;
02:50:27 SQL status: SUCCESS 3 in 0.163 seconds
02:50:27 Using snowflake connection "list_ANALYTICS_dev_tonayya"
02:50:27 On list_ANALYTICS_dev_tonayya: /* {"app": "dbt", "dbt_version": "2024.10.28+996c6a8", "profile_name": "user", "target_name": "default", "connection_name": "list_ANALYTICS_dev_tonayya"} */
select all_objects.*, is_iceberg as "is_iceberg"
      from table(result_scan(last_query_id(-1))) all_objects
      left join INFORMATION_SCHEMA.tables as all_tables
        on all_tables.table_name = all_objects."name"
        and all_tables.table_schema = all_objects."schema_name"
        and all_tables.table_catalog = all_objects."database_name"
02:50:29 SQL status: SUCCESS 3 in 1.433 seconds
02:50:29 Connection 'master' was properly closed.
02:50:29 Connection 'list_ANALYTICS_dev_tonayya' was left open.
02:50:29 On list_ANALYTICS_dev_tonayya: Close
02:50:29 
02:50:29 Finished running  in 0 hours 0 minutes and 2.46 seconds (2.46s).
02:50:29 Encountered an error:
tuple.index(x): x not in tuple
02:50:29 Traceback (most recent call last):
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/cli/requires.py", line 222, in wrapper
    result, success = func(*args, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/cli/requires.py", line 132, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/cli/requires.py", line 334, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/cli/requires.py", line 361, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/cli/requires.py", line 413, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/cli/requires.py", line 435, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/cli/main.py", line 206, in build
    results = task.run()
              ^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/task/runnable.py", line 608, in run
    result = self.execute_with_hooks(selected_uids)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/task/runnable.py", line 546, in execute_with_hooks
    before_run_status = self.before_run(adapter, selected_uids)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/task/run.py", line 756, in before_run
    self.populate_adapter_cache(adapter, required_schemas)
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/task/runnable.py", line 505, in populate_adapter_cache
    adapter.set_relations_cache(cachable_nodes)
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/adapters/base/impl.py", line 552, in set_relations_cache
    self._relations_cache_for_schemas(relation_configs, required_schemas)
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/adapters/base/impl.py", line 528, in _relations_cache_for_schemas
    for relation in future.result():
                    ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/ddtrace/contrib/futures/threading.py", line 44, in _wrap_execution
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt_common/utils/executor.py", line 16, in connected
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/dbt/adapters/snowflake/impl.py", line 266, in list_relations_without_caching
    return [self._parse_list_relations_result(obj) for obj in schema_objects.select(columns)]
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/agate/table/select.py", line 18, in select
    indexes = tuple(self._column_names.index(k) for k in key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/dbt-996c6a882dc72d24e24395693ebca65139cb1575/lib/python3.11/site-packages/agate/table/select.py", line 18, in <genexpr>
    indexes = tuple(self._column_names.index(k) for k in key)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: tuple.index(x): x not in tuple

02:50:29 Resource report: {"command_name": "build", "command_wall_clock_time": 2.520353, "process_user_time": 5.677027, "process_kernel_time": 1.493451, "process_mem_max_rss": "322364", "process_in_blocks": "264", "process_out_blocks": "32088", "command_success": false}
02:50:29 Observability Metric: command_success=0.0
02:50:29 Observability Metric: command_wall_clock_time=2.520353078842163
02:50:29 Observability Metric: process_user_time=5.677027225494385
02:50:29 Observability Metric: process_kernel_time=1.4934509992599487
02:50:29 Observability Metric: process_mem_max_rss=322364.0
02:50:29 Command `cli build` failed at 02:50:29.523857 after 2.52 seconds

Relevant log output

No response

Environment

- OS:
- Python:
- dbt-core:
- dbt-snowflake:

Additional Context

This is fairly similar to the bug reported here