Tomme / dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)
Apache License 2.0
142 stars 79 forks source link

Use union query for listing relations. #12

Closed friendofasquid closed 3 years ago

friendofasquid commented 3 years ago

For whatever reason, this is MUCH faster that then regex + join query

Dandandan commented 3 years ago

Cool! I think it's not weird - union all is way cheaper than a join and can be done in parallel/distributed fashion!

Using lower is a great change too.

Dandandan commented 3 years ago

@friendofasquid what about using lower in athena__get_columns_in_relation for consistency too? The macro ilike could also be removed after this from the code.

friendofasquid commented 3 years ago

Done. I must admit I have not tested either commit against the your spec, so I wouldn't necessarily merge just yet.

friendofasquid commented 3 years ago

I ran the tests and they aren't passing.

_______________________________________________________________________________ usecase: test_dbt_base _______________________________________________________________________________
Unknown error: Got an unexpected relation type of table for relation view_model, expected view
assert <RelationType.Table: 'table'> == 'view'
  - view
  + table in test index 4 (item_type=relation_types)

I can see the tests run the following SQL:

create or replace view
    dbt_athena_dev_210328102104666104722654.view_model
  as

select * from dbt_athena_dev_210328102104666104722654.base

It does suggest there is something wrong with my logic here, but can't see it. This should very come back as a view in the information_schema.views unless it also comes back in the tables table. I will need to check if views end up in the tables table in addition to the views table.

friendofasquid commented 3 years ago

OK, so the views do show up in views and tables, and I had to change the SQL to account for that.

All good now. All tests pass. No idea if this is any faster now, but probably :)

Tomme commented 3 years ago

Hey @friendofasquid, apologies for the slow reply but a big thanks for the contribution. I'm going to run a few benchmarks on your query but I do suspect it will scale better than my previous implementation 👍

friendofasquid commented 3 years ago

So actually I'm pretty sure this is overcooked and you don't need to query the views table at all. Still likely a lot faster than current implementation but could be made better.