Closed JustasCe closed 2 years ago
@Tomme would you be up for having something like this in dbt-athena? The situation around large numbers of tables is really painful in practice.
Loving this implementation for the issue at hand! All tested in my environment(s) and happy for it to get merged in 👍
fixes https://github.com/Tomme/dbt-athena/issues/79
This PR addresses the same issue and makes the same changes as https://github.com/Tomme/dbt-athena/pull/80, with the review comments also addressed. It is opened from a new fork because we can maintain that fork.
Overview
Currently the `athena__get_catalog` macro query hangs indefinitely when generating docs for a database with more than 100 tables. These changes fix the issue by specifying which tables to fetch for each database and batching the queries to a maximum of 100 tables each.
What's changed
It overrides the default `BaseAdapter._get_catalog_schemas()` function with a custom `AthenaAdapter._get_catalog_schemas()`, which now uses a custom `AthenaSchemaSearchMap` instead of `SchemaSearchMap`. `SchemaSearchMap.add()` only produces a dictionary whose values are sets of database names, whereas the new `AthenaSchemaSearchMap.add()` produces a dictionary of dictionaries, where each inner key is a database name and the value is the set of tables in that database.
`_get_one_catalog` is also updated with the correct typing.
Using the new `AthenaSchemaSearchMap`, the `athena__get_catalog` macro now batches the queries so that each select in the union covers at most 100 tables per database.
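For reviewers who want a picture of the data shape and the batching, here is a minimal standalone sketch in plain Python. It is not the code in this PR: `build_search_map`, `batch_tables`, the `"awsdatacatalog"` outer key, and the sample relation names are all hypothetical, and the outer key is only an assumption about what the dictionary of dictionaries is keyed on.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple

MAX_TABLES_PER_BATCH = 100  # the batch limit described in this PR

# Assumed shape: {outer_key: {database: {table, ...}}}.
# The outer key is a hypothetical stand-in for whatever the real map is keyed on.
SearchMap = Dict[str, Dict[str, Set[str]]]


def build_search_map(relations: Iterable[Tuple[str, str, str]]) -> SearchMap:
    """Group (outer_key, database, table) triples into a nested dictionary."""
    search_map: SearchMap = defaultdict(lambda: defaultdict(set))
    for outer_key, database, table in relations:
        search_map[outer_key][database].add(table)
    # Convert to plain dicts so lookups of missing keys raise instead of inserting.
    return {key: dict(dbs) for key, dbs in search_map.items()}


def batch_tables(
    tables: Iterable[str], size: int = MAX_TABLES_PER_BATCH
) -> Iterator[List[str]]:
    """Yield sorted table names in lists of at most `size` entries."""
    ordered = sorted(tables)
    for start in range(0, len(ordered), size):
        yield ordered[start:start + size]


# Hypothetical input: one database with 250 tables, another with a single table.
relations = [("awsdatacatalog", "analytics", f"table_{i}") for i in range(250)]
relations.append(("awsdatacatalog", "raw", "events"))

search_map = build_search_map(relations)
batches = {
    database: list(batch_tables(tables))
    for database, tables in search_map["awsdatacatalog"].items()
}

print([len(batch) for batch in batches["analytics"]])  # [100, 100, 50]
print(batches["raw"])  # [['events']]
```

Each list in `batches` would correspond to one select in the union, so a database with 250 tables ends up as three selects rather than one unbounded query.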