dbt-labs / dbt-codegen

Macros that generate dbt code
https://hub.getdbt.com/dbt-labs/codegen/latest/
Apache License 2.0
480 stars 106 forks source link

`generate_source` not working on Snowflake due to column case #82

Closed boblannon-fp closed 1 year ago

boblannon-fp commented 2 years ago

Describe the bug

When attempting to generate a source, an opaque error is generated. I've confirmed that the SELECT ... FROM INFORMATION_SCHEMA query does in fact return results, but the macro still fails.

Steps to reproduce

dbt run-operation generate_source --args 'schema_name: STG_BOOST'

Expected results

Generated sources.yml output

Actual results

Python runtime error:

$ dbt run-operation generate_source --args 'schema_name: STG_BOOST'
11:52:37  Running with dbt=1.2.2
11:52:37  [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.insight_transformation.example

11:52:39  Encountered an error while running operation: Runtime Error
  Field "path" of type Path in SnowflakeRelation has invalid value {'database': 'INSIGHT_DEV2', 'schema': Undefined, 'identifier': Undefined}

Screenshots and log output

============================== 2022-10-17 11:52:37.691786 | 27ca7160-d499-4ada-920d-4e2036661d3a ==============================
11:52:37.691807 [info ] [MainThread]: Running with dbt=1.2.2
11:52:37.694314 [debug] [MainThread]: running dbt with arguments {'write_json': True, 'use_colors': True, 'printer_width': 80, 'version_check': True, 'partial_parse': True, 'static_parser': True, 'profiles_dir': '/dbt-runner/insight_transformation', 'send_anonymous_usage_stats': True, 'event_buffer_size': 100000, 'quiet': False, 'no_print': False, 'macro': 'generate_source', 'args': 'schema_name: STG_BOOST', 'which': 'run-operation', 'rpc_method': 'run-operation', 'indirect_selection': 'eager'}
11:52:37.697333 [debug] [MainThread]: Tracking: tracking
11:52:37.699636 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff77be4400>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff77be4b80>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff77be48e0>]}
11:52:37.781163 [debug] [MainThread]: Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
11:52:37.783317 [debug] [MainThread]: Partial parsing enabled, no changes found, skipping parsing
11:52:37.785223 [warn ] [MainThread]: [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.insight_transformation.example

11:52:37.792267 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '27ca7160-d499-4ada-920d-4e2036661d3a', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff7789e0d0>]}
11:52:37.822391 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '27ca7160-d499-4ada-920d-4e2036661d3a', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff779fe5e0>]}
11:52:37.824686 [debug] [MainThread]: Acquiring new snowflake connection "macro_generate_source"
11:52:37.849402 [debug] [MainThread]: Using snowflake connection "macro_generate_source"
11:52:37.851521 [debug] [MainThread]: On macro_generate_source: /* {"app": "dbt", "dbt_version": "1.2.2", "profile_name": "insight_transformation", "target_name": "default", "connection_name": "macro_generate_source"} */
select distinct
            table_schema as "table_schema",
            table_name as "table_name",

            case table_type
                when 'BASE TABLE' then 'table'
                when 'EXTERNAL TABLE' then 'external'
                when 'MATERIALIZED VIEW' then 'materializedview'
                else lower(table_type)
            end as "table_type"

        from INSIGHT_DEV2.information_schema.tables
        where table_schema ilike 'STG_BOOST'
        and table_name ilike '%'
        and table_name not ilike ''
11:52:37.853695 [debug] [MainThread]: Opening a new connection, currently in state init
11:52:39.075223 [debug] [MainThread]: SQL status: SUCCESS 28 in 1.22 seconds
11:52:39.097273 [debug] [MainThread]: Snowflake adapter: Error running SQL: macro generate_source
11:52:39.104811 [debug] [MainThread]: Snowflake adapter: Rolling back transaction.
11:52:39.108034 [debug] [MainThread]: On macro_generate_source: Close
11:52:39.197696 [error] [MainThread]: Encountered an error while running operation: Runtime Error
  Field "path" of type Path in SnowflakeRelation has invalid value {'database': 'INSIGHT_DEV2', 'schema': Undefined, 'identifier': Undefined}
11:52:39.205033 [debug] [MainThread]:
11:52:39.208489 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff778abfa0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff778abbe0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0xffff778aba00>]}
11:52:39.213634 [debug] [MainThread]: Flushing usage events
11:52:39.784175 [debug] [MainThread]: Connection 'macro_generate_source' was properly closed.

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.9.2
  - package: dbt-labs/codegen
    version: 0.8.1
  - package: Snowflake-Labs/dbt_constraints
    version: 0.5.1
  - package: calogica/dbt_expectations
    version: 0.7.0

Which database are you using dbt with?

The output of dbt --version:

$ dbt --version
Core:
  - installed: 1.2.2
  - latest:    1.3.0 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - snowflake: 1.2.0 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

The operating system you're using: Working in a docker container, using base image python:3.9-slim

$ uname -a
Linux 5d79fa681533 5.10.104-linuxkit #1 SMP PREEMPT Thu Mar 17 17:05:54 UTC 2022 aarch64 GNU/Linux

The output of python --version:

$ python --version
Python 3.9.15

Additional context

The problem seems to be here, in dbt_utils macro get_relations_by_pattern.

https://github.com/dbt-labs/dbt-utils/blob/064c2abee80816ea855de32e66e54e96d9916677/macros/sql/get_relations_by_pattern.sql#L18-L23

The columns in the query result are named correctly, but they're returned in all caps. If I change those lines to the following, then the command works as expected.

            {%- set tbl_relation = api.Relation.create(
                database=database,
                schema=row.TABLE_SCHEMA,
                identifier=row.TABLE_NAME,
                type=row.TABLE_TYPE
            ) -%}

I'm not 100% sure where a change needs to be made, it's possible that it's actually get_tables_by_pattern_sql.sql that needs to be changed. I'm not sure whether it's possible to force snowflake to return results with the correct case. I would have expected snowflake to use the quoted column aliases in the query, which are lower...

Are you interested in contributing the fix?

I probably won't have time any time soon.

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

vanducng commented 1 year ago

Can we reopen this, as Im facing the same issue.

roseborja-brg commented 4 months ago

Can we reopen this please?