dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0

[feature] Support AWS Athena #274

Open dataders opened 3 months ago

dataders commented 3 months ago


### Tasks
- [ ] choose which PR should be merged
- [ ] migrate said PR to use GHA instead of CircleCI
- [ ] someone passes Anders some secrets
- [ ] tests pass and we merge?

dataders commented 3 months ago

@nicor88 @daniel-cortez-stevenson @brabster @aidan-o-boyle-kroo I'd love to work with y'all to get this support in. In the issue description above I listed what I understand to be the steps needed to make this happen.

Am I wrong to assume we should take #203 instead of #133? Can someone restructure the PR to work off of the new GHA method? Can someone offer an Athena instance and corresponding env vars?

brabster commented 3 months ago

Hey @dataders - the other PR is based on https://github.com/Tomme/dbt-athena, which I believe is defunct, and the "maintainer" is unresponsive. I've brought #203 up to date. I'll see what I can do about GHA (GitHub Actions?) and bringing things up to date, but I'm happy to bow out and close it in favour of something a dbt-athena-community maintainer prefers.

@nicor88 do you folks have a suitable AWS account to use for integration testing? I'm not in a position to provide one.

brabster commented 2 months ago

The need to keep adding adapter-specific code, tests and infra support to this central repo seems like a big maintenance problem.

I have proposed an independent Athena implementation at https://github.com/brabster/dbt-athena-external-tables. It uses dbt-external-tables as a generic interface but avoids the need to merge Athena-specific code into this repo. It seems pretty lean and straightforward to use in my testing (there is an example project in the repo), and it should allow the dbt-athena maintainers to maintain it independently.

I would ask for consideration and feedback on this approach as a way of supporting more platforms without bloating this central repo, to inform the dbt-athena-community maintainers' decision on https://github.com/dbt-athena/dbt-athena/issues/633. Thanks.

nicor88 commented 2 months ago

> The need to keep adding adapter-specific code, tests and infra support to this central repo seems like a big maintenance problem

I totally agree, @brabster. What you proposed seems like a good option. I would like to hear the opinion of @dataders. Adding Athena support to dbt-external-tables could make sense, for example, if dbt Labs plans to add support for Athena in dbt Cloud.

Regarding the infra for the CI, we can use the same AWS account that we use for dbt-athena-community (we are planning to apply for extra AWS credits). But I wouldn't like to use that infra in a repo that we (the maintainers) cannot control directly, or that is exposed to many other platforms that are not Athena-related.

That said, @brabster, if we decide to proceed with what you proposed, would you be up for transferring ownership of https://github.com/brabster/dbt-athena-external-tables to the dbt-athena GitHub org?

brabster commented 2 months ago

Yes, very happy to transfer ownership 👍 @nicor88, thanks for your input.