Tomme / dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)
Apache License 2.0

DBT hangs when awsdatacatalog contains many databases #110

Open a-agmon opened 2 years ago

a-agmon commented 2 years ago

Hi! Whenever a dbt command first runs, the query below is executed to check for existing views and tables across the entire catalog. When the default catalog is large, this hangs for a while. Is there a way to speed this up or avoid it? Is there a reason it has to run over the whole catalog rather than just the relevant schema?

Thanks

    WITH views AS (
      select
        table_catalog as database,
        table_name as name,
        table_schema as schema
      from "awsdatacatalog".INFORMATION_SCHEMA.views
      where table_schema = LOWER('*******')
    ), tables AS (
      select
        table_catalog as database,
        table_name as name,
        table_schema as schema
      from "awsdatacatalog".INFORMATION_SCHEMA.tables
      where table_schema = LOWER('********')

      -- Views appear in both `tables` and `views`, so excluding them from tables
      EXCEPT

      select * from views
    )
    select views.*, 'view' AS table_type FROM views
    UNION ALL
    select tables.*, 'table' AS table_type FROM tables
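
To make the shape of the query easier to follow, here is a minimal sketch that runs the same views/tables/EXCEPT logic against an in-memory SQLite database. Everything in it is an assumption for illustration: `info_tables` and `info_views` are hypothetical stand-ins for the two INFORMATION_SCHEMA relations, the schema name `analytics` and the sample rows are made up, and SQLite is not Athena, so this shows only what the query returns, not why it is slow on a large catalog.

```python
import sqlite3

# Hypothetical stand-ins for INFORMATION_SCHEMA.tables / INFORMATION_SCHEMA.views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE info_tables (table_catalog TEXT, table_schema TEXT, table_name TEXT);
    CREATE TABLE info_views  (table_catalog TEXT, table_schema TEXT, table_name TEXT);
    -- A view is listed in both relations, as the comment in the query notes.
    INSERT INTO info_tables VALUES
        ('awsdatacatalog', 'analytics', 'orders'),
        ('awsdatacatalog', 'analytics', 'orders_v'),
        ('awsdatacatalog', 'other',     'events');
    INSERT INTO info_views VALUES
        ('awsdatacatalog', 'analytics', 'orders_v');
""")

# Same structure as the query above, pointed at the stand-in tables.
QUERY = """
WITH views AS (
    SELECT table_catalog AS "database", table_name AS name, table_schema AS "schema"
    FROM info_views
    WHERE table_schema = LOWER(:schema)
), tables AS (
    SELECT table_catalog AS "database", table_name AS name, table_schema AS "schema"
    FROM info_tables
    WHERE table_schema = LOWER(:schema)
    EXCEPT
    SELECT * FROM views
)
SELECT views.*, 'view' AS table_type FROM views
UNION ALL
SELECT tables.*, 'table' AS table_type FROM tables
"""


def list_relations(connection, schema):
    """Return (database, name, schema, table_type) rows for one schema."""
    return connection.execute(QUERY, {"schema": schema}).fetchall()


for row in list_relations(conn, "ANALYTICS"):
    print(row)
```

Each relation in the requested schema comes back exactly once, tagged as either a view or a table, and rows from other schemas are filtered out by the `table_schema` predicate.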

owenprough-sift commented 2 years ago

Possibly related to #105