Closed roy-ht closed 8 months ago
Is this PR obsoleted by recently-merged #86?
@owenprough-sift
> Is this PR obsoleted by recently-merged #86?

No. I tried the master branch with #86 merged, and it still throws a java.lang.RuntimeException.
Since users may want to choose whether to use the Glue API, I'll add a configuration option like use_glue_api: bool.
@VDFaller
88a868a adds the use_glue_api option.
I also had a slowness issue with list_relations_without_caching: it worked fine for schemas with up to 100 tables but was painfully slow for 101 or more.
Solved it by updating the Athena Engine to version 3.
I've tested schemas with up to 296 tables and they all perform quite well.
Guess it might help you as well @roy-ht
Problem
Our Glue database is huge, and retrieving schema information from it often fails.
Athena returns this type of error:
More specifically, two macros often cause an error or run too slowly:
Solution
Add override methods:
get_columns_in_relation
list_relations_without_caching
_get_one_catalog
In these methods, call glue_get_table and glue_get_tables and get the column information directly.
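A minimal sketch of the idea, not the code from this PR: it assumes glue_get_table and glue_get_tables wrap the boto3 Glue client's get_table and get_tables calls. The helper names columns_from_glue_table, get_columns, and list_tables are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Iterator, List


def columns_from_glue_table(table: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten one Glue table definition into name/type pairs.

    Regular columns live under StorageDescriptor.Columns; partition keys
    are stored separately and are appended last.
    """
    storage = table.get("StorageDescriptor", {})
    columns = list(storage.get("Columns", [])) + list(table.get("PartitionKeys", []))
    return [{"name": c["Name"], "type": c.get("Type", "")} for c in columns]


def get_columns(glue_client: Any, database: str, table_name: str) -> List[Dict[str, str]]:
    """Hypothetical helper: read column metadata for one table from Glue."""
    # glue_client is assumed to be boto3.client("glue")
    response = glue_client.get_table(DatabaseName=database, Name=table_name)
    return columns_from_glue_table(response["Table"])


def list_tables(glue_client: Any, database: str) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: yield every table definition in a Glue database."""
    # get_tables is paginated, so a large database needs several requests
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        yield from page["TableList"]
```

Because the helpers take the client as an argument, they can be exercised with a stub object in tests; in real use the client would come from boto3.client("glue").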
Effect
dbt docs shows more table metadata retrieved from the API, such as partition keys and locations.