dbt-labs / dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)
https://dbt-athena.github.io
Apache License 2.0
228 stars 100 forks source link

feat: skip_workgroup_check setting to reduce AWS throttling #713

Closed amacal closed 2 months ago

amacal commented 2 months ago

Description

The adapter performs a GetWorkGroup operation on each thread, which is later cached. When dbt build is started for 10 independent models, it issues 10 AWS requests to get the same information. If dbt is orchestrated via Airflow to run 32 dbt tasks concurrently, it sends 320 GetWorkGroup requests, causing throttling on the AWS side.

This PR introduces a skip_workgroup_check setting, which instructs dbt to skip checking if a WorkGroup contains an enforced output location.

Checklist

nicor88 commented 2 months ago

@amacal implementation looks good.

Please have a look at the ci, precommit checks are failing. Provide unit testing for is_work_group_output_location_enforced function cover the introduced feature.

Nice to have, and not mandatory, add some "integration tests" to fully test the feature in a real environment.

nicor88 commented 2 months ago

@amacal plese remember what I suggested before:

Without all these 3 conditions, we cannot merge. Thanks

amacal commented 2 months ago

I added a unit test and documented new parameter in the documentation. I cannot do functional test, because I cannot push it via org AWS account.

nicor88 commented 2 months ago

@svdimchenko @Jrmyy could you have a look when you have some time? I'm currently away from keyboard. Thanks

nicor88 commented 2 months ago

@amacal implementation looks fine, have a look at the CI, there is a failure

amacal commented 2 months ago

Is there anything I need to do to have it merged?

nicor88 commented 2 months ago

@amacal you changes looks good. As you can see repository ownership is being moved to dbtlabs, they are the official owners. @colin-rogers-dbt could you have a look here?