dbt-labs / dbt-redshift

dbt-redshift contains all of the code enabling dbt to work with Amazon Redshift
https://getdbt.com
Apache License 2.0
95 stars 58 forks source link

[CT-1616] get_relation method returns None for Redshift external tables #17

Open tomekzbrozek opened 4 years ago

tomekzbrozek commented 4 years ago

Describe the bug

get_relation method (https://docs.getdbt.com/docs/writing-code-in-dbt/jinja-context/adapter/#get_relation) returns None when provided with database, schema and table name of a Redshift external table (AWS Spectrum).

Steps To Reproduce

  1. Create an external table with AWS Spectrum: https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html
  2. Attempt to get_relation on that external table.

Expected behavior

Should return a regular relation as it does for native Redshift tables, e.g. an output like analytics.spectrum_schema.mytable instead of None

Screenshots and log output

-

System information

Which database are you using dbt with?

The output of dbt --version:

0.16.0

The operating system you're using: MacOS Mojave

The output of python --version: 3.6.2

Additional context

-

drewbanin commented 4 years ago

Thanks for the report @tomekzbrozek!

Looks like we use the Postgres code to query the information schema on redshift: https://github.com/fishtown-analytics/dbt/blob/dev/octavius-catto/plugins/redshift/dbt/include/redshift/macros/adapters.sql#L154

but redshift provides a totally different set of tables for fetching this info: https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_EXTERNAL_TABLES.html

I think we'd need to copy the "base" pg implementation into the redshift plugin, then update the code to also fetch external tables too.

tomekzbrozek commented 4 years ago

thank you, yup I also used SVV_EXTERNAL_TABLES as a workaround to the issue reported here, pasting below to give a flavour.

({{ schema }} and {{ table }} vars are defined earlier in the body of my macro that serves a different purpose, I basically needed to know if the external table exists of not, this is why I came across this issue :) )

-- check_if_external_table_exists returns 1 if a given external table exists, 0 if doesn't
{% set check_if_external_table_exists %}
SELECT
  COUNT(*) AS if_exists_flag
FROM
  SVV_EXTERNAL_TABLES
WHERE
  schemaname || '.' || tablename = '{{ schema }}.{{ table }}'
{% endset %}
-- the loop below fetches results of check_if_external_table_exists:
-- 0 if table doesn't exist, 1 if table exists, 2 for other (erroneous?) cases
-- more context on the `execute` Jinja variable:
-- https://docs.getdbt.com/docs/writing-code-in-dbt/jinja-context/execute
{% if execute %}
  {% set if_exists_flag = run_query(check_if_external_table_exists).columns[0].values()[0] | int %}
{% else %}
  {% set if_exists_flag = 2 | int %}
{% endif %}
github-actions[bot] commented 2 years ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

jtcohen6 commented 2 years ago

Still a bug, still a good first issue for an interested community contributor :)

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

dbeatty10 commented 1 year ago

Resolved by https://github.com/dbt-labs/dbt-redshift/pull/428

dbeatty10 commented 1 year ago

This might not actually be resolved by dbt-labs/dbt-core#428 😅

Will either re-open this issue or keep closed depending on the outcome of that conversation.

If it is re-opened, the solution may look similar to https://github.com/dbt-labs/dbt-postgres/issues/53, and it may make sense to solve them both at the same time.

trymzet commented 10 months ago

FWIW I've adapted @dbeatty10's solution for Spectrum and it worked for me (using dbt-redshift==1.3.0 and dbt-core==1.3.4):

{% macro redshift__list_relations_without_caching(schema_relation) %}

  {% call statement('list_relations_without_caching', fetch_result=True) -%}
    select
        table_catalog as database,
        table_name as name,
        table_schema as schema,
        'table' as type
    from information_schema.tables
    where table_schema ilike '{{ schema_relation.schema }}'
    and table_type = 'BASE TABLE'
    union all
    select
      table_catalog as database,
      table_name as name,
      table_schema as schema,
      case
        when view_definition ilike '%create materialized view%'
          then 'materialized_view'
        else 'view'
      end as type
    from information_schema.views
    where table_schema ilike '{{ schema_relation.schema }}'
+    union all
+    select 
+        redshift_database_name as database,
+        tablename as name,
+        schemaname as schema,
+        'table' as type
+    from svv_external_tables
+    where schemaname ilike '{{ schema_relation.schema }}'
  {% endcall %}

  {{ return(load_result('list_relations_without_caching').table) }}

{% endmacro %}