Closed sankeerthnagapuri closed 3 months ago
I removed the Flake8 config to be able to commit the code. Please let me know the best way to get around the Flake8 errors (I see some errors unrelated to this PR as well) so I can add the Flake8 config back. Thanks.
As the name of the repository says, this is the dbt-athena adapter, built mostly for Athena's Trino SQL capabilities, except for Python models, which run on Athena Spark.
I do believe that what you propose is really relevant, but I'm not in favor of incorporating those changes because:
In the future AWS might take over and might want to create an adapter that allows using the right engine for the right job. I can envision a dbt-aws adapter where the user can specify the AWS engine and the dialect to use in order to run the models against the right technology.
That said, I would like to ask the other maintainers @Jrmyy @jessedobbelaere @mattiamatrix @svdimchenko for their opinion before closing this PR.
@nicor88 That was also my first thought. Adding EMR would bloat the dbt-athena adapter beyond its primary responsibility. E.g. dbt-glue also runs Spark jobs, but on Glue (Interactive Sessions), ... dbt-athena runs with Spark workgroups on Athena. But Spark has so many ways to run on AWS... EMR (serverless) is a lot more expensive, so indeed the integration tests would burn our credits too.
I saw that Redshift has a connector for Spark on EMR, but there is no progress in the dbt-redshift adapter.
I'm not sure what the best path forward is: let @sankeerthnagapuri create a separate adapter exclusively for spark-emr?
Having a specific adapter for EMR (or EMR Serverless) might be the best option.
I would like @iconara to chime in on this issue and give his point of view.
Hi all, I can only echo @nicor88's and @jessedobbelaere's comments. This is a neat idea, but this is not the right place for it. Could you reach out to me on Slack (I'm @tolv on https://getdbt.slack.com) and tell me more about how you use EMR and dbt today? We are always working on new features and support for more tools. I would especially be interested in knowing how you use Python models with dbt, and why EMR-S over Glue.
Based on the comments above from @iconara and @jessedobbelaere, I'm closing this PR.
Description
Models used to test - Optional
Added functional tests and also a sample to README
Checklist