Closed haleemur closed 1 year ago
Hi @haleemur ,
I tried your updated catalog script with both dbt 0.17.2 and 0.18.1 and managed to successfully generate the docs. However, the docs are missing columns that are not specified in schema.yml files. And the columns that are defined in the schema files are missing column types if the model is a view. For materialized models the column types are shown correctly. There's probably something wrong with combining the catalog data with the data from schema.yml files. Do you happen to know anything about this?
EDIT: I checked the generated manifest.json
and it only contained my seed nodes. Further investigation revealed that
select
*
from information_schema.columns
does not contain any of my view models. The views and their columns are available in the Glue console though, so the metadata exists but not in information_schema.columns
...
@smomni yes, i noticed that as well & had made a note of it at the end of my pull request's message
limitation with athena execution engine version 1 information.columns does not seem to contain information about views. This seems to have been fixed in engine v2
this PR does not attempt to fix that issue as it is a bug with athena, which is resolved in v2. I had considered looking up the glue catalog directly, but that adds additional complexity, the need for which will go away as aws rolls out the newer engine across more regions.
thank you for testing that this branch works with dbt versions 0.17.2 & 0.18.1. that has been sitting on the back of my mind for a while.
execution engine 2 is being rolled out in us-east-1 (and a few other regions) currently. soon, it will become the default engine athena uses worldwide.
there are some sql-parsing incompatibilities between the two versions when tables are joined like this
in version 1, it was necessary to specify
col1
as eithertable1.col1
ortable2.col1
in the where clause or select clause. however in version 2, that generates an errorto ensure compatibility across both versions, the simplest thing to do is to avoid specifying the join as
using (col1, col2)
this pattern is currently used in fetching the database metadata when executing
dbt docs generate
, so I was not able to build docs against athena running on execution engine version 2.with the change,
dbt docs generate
works successfully across both engine versions.limitation with athena execution engine version 1
information.colunms
does not seem to contain information about views. This seems to have been fixed in engine v2