dbt-labs / dbt-postgres

Apache License 2.0
31 stars 12 forks source link

[Bug] adapter.get_relation returns None for Foreign Table Type #53

Closed nurdin-i closed 11 months ago

nurdin-i commented 1 year ago

Is this a new bug in dbt-core?

Current Behavior

Running adapter.get_relation on table/schema that is a foreign wrapper / imported schema from a different database returns None even when schema/table exists in a DB.

Expected Behavior

get_relation to return an object from a database (if exists) no matter if it's an object of that database or an imported object such as imported schema / foreign wrapper.

Steps To Reproduce

For reference, I will use 2 DBs names in here:

  1. DB1 (main Postgres database that contains schema/table we want to import FROM)
  2. DB2 (second Postgres database we want to import TO)
 create schema if not exists schema_1;
create table schema_1.table_1 (column_1 INT);
insert into schema_1.table_1 (column_1) values (-1);
 --This query to be executed exclusively in DB2.
CREATE EXTENSION postgres_fdw;
  --Create SERVER to reference DB1 you want to import schema from (creating foreign wrapper)
  --Options are entered for the DB1 (ie. db_name in our scenario is 'DB1')
  CREATE SERVER IF NOT EXISTS db1_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (
      host 'your DB1 host',
  port '5432',
  dbname 'your DB1 name',
  fetch_size '100000',
  use_remote_estimate 'true'
    );

  -- CREATE user mappings for the server created in STEP 1
  CREATE USER MAPPING IF NOT EXISTS FOR CURRENT_USER
    SERVER  db1_server
    OPTIONS (
      user 'your username',
  password 'your password'
    );

 --Recreate schema in DB2
  DROP schema if exists schema_2 cascade;
  CREATE SCHEMA IF NOT EXISTS schema_2;
  IMPORT FOREIGN SCHEMA schema_1
    FROM SERVER db1_server
    INTO schema_2;

   select *
   from schema_2.table_1;

If we check information schema, we can see foreign table type is created in DB2.

select *
from information_schema.tables
where table_type = 'FOREIGN';

image

You can create a new macro just to test this:

{% macro test_foreign() %}
    {% set my_model= adapter.get_relation("new_db", "schema_2", "table_1") %}
    {{ print("Printing Model") }}
    {{ print(my_model) }}
{% endmacro %}

Once macro is created (new .sql file), you can run it here (make sure you are running it on your DB2 (in my case, it is named new_db))

dbt run-operation test_foreign --target new_db

Results -> None for foreign object type.

Relevant log output

No response

Environment

- OS:Using Windows 
- Python: 3.8.1
- dbt: 1.4.6

Which database adapter are you using with dbt?

postgres

Additional Context

For reference, the table_type is FOREIGN and this type doesn't get recognized when running adapter.get_relation.

image

Note: if this code is used instead of get_relation:

api.Relation.create(schema='elementary', identifier='elementary_test_results')

Object is returned even of foreign type.

dbeatty10 commented 1 year ago

Thanks for raising this @nurdin-i 👍

Could you provide more detail about this portion? If would be helpful if you can provide the exact SQL in a step-by-step manner so we can try to reproduce this with Posgres:

  1. Create a foreign wrapper and import schema from DB1 into DB2.

I'm guessing the solution might end up being bringing something like this into dbt-core.

nurdin-i commented 1 year ago

Thanks for raising this @nurdin-i 👍

Could you provide more detail about this portion? If would be helpful if you can provide the exact SQL in a step-by-step manner so we can try to reproduce this with Posgres:

  1. Create a foreign wrapper and import schema from DB1 into DB2.

I'm guessing the solution might end up being bringing something like this into dbt-core.

Hi @dbeatty10, I just updated the issue with detailed step-by-step process to reproduce this bug. I also reproduced it once again with new DB on my own local machine. Let me know if it all makes sense. Thank you!

dbeatty10 commented 1 year ago

@nurdin-i Thank you for these instructions and sorry for the long delay!

I was able to reproduce what you described, and the following macro override in my local dbt project seems to work:

macros/my_overrides.sql

{% macro postgres__list_relations_without_caching(schema_relation) %}
  {% call statement('list_relations_without_caching', fetch_result=True) -%}
    select
      '{{ schema_relation.database }}' as database,
      tablename as name,
      schemaname as schema,
      'table' as type
    from pg_tables
    where schemaname ilike '{{ schema_relation.schema }}'
    union all
    select
      '{{ schema_relation.database }}' as database,
      viewname as name,
      schemaname as schema,
      'view' as type
    from pg_views
    where schemaname ilike '{{ schema_relation.schema }}'
    union all
    select
      '{{ schema_relation.database }}' as database,
      matviewname as name,
      schemaname as schema,
      'materialized_view' as type
    from pg_matviews
    where schemaname ilike '{{ schema_relation.schema }}'
    union all
    select
      '{{ schema_relation.database }}' as database,
      table_name as name,
      table_schema as schema,
      'external' as type
    from information_schema.tables
    where table_schema ilike '{{ schema_relation.schema }}'
      and table_type = 'FOREIGN'
  {% endcall %}
  {{ return(load_result('list_relations_without_caching').table) }}
{% endmacro %}

Possibility for implementation in dbt-core

Ultimately, the solution might be to add the following right underneath here:

    union all
    select
      '{{ schema_relation.database }}' as database,
      table_name as name,
      table_schema as schema,
      'external' as type
    from information_schema.tables
    where table_schema ilike '{{ schema_relation.schema }}'
      and table_type = 'FOREIGN'

Side note

I don't know why postgres__list_relations_without_caching uses the pg_tables, pg_views, and pg_matviews views instead of just using information_schema.tables across the board :shrug:.

Doing that research would be something to spike out into a separate issue.

github-actions[bot] commented 11 months ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 11 months ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.