elementary-data / elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
https://www.elementary-data.com/
Apache License 2.0

[ELE-47] AWS Glue Integration #177

Open rajkstats opened 1 year ago

rajkstats commented 1 year ago

Requesting integration with Amazon S3 as a data lake

Want to set up data observability on top of input and output datasets.


Maayan-s commented 1 year ago

Hi @rajkstats! Thanks for opening the issue! I'm not familiar with the dbt-glue adapter, so it's hard to assess how many changes such an integration will require. We recently decided (due to demand from the community) to add a Databricks integration, and decided to approach it gradually:

Step 1 - add support for uploading dbt artifacts and run results (in the dbt package).
Step 2 - add support in the CLI for Slack alerts and UI generation.
Step 3 - add support for data anomaly detection tests (the most complex and platform-specific part of the code right now).

Here is my PR for step 1 for Databricks. As you can see, it actually required pretty minor changes. If you want to give it a shot with AWS Glue, I would be happy to support you!

rajkstats commented 1 year ago

Thanks @Maayan-s for sharing the approach. I will give it a shot and let you know if I need any support. Thanks.

bruno-ribeirodasilva commented 1 year ago

@rajkstats did you make any progress on this?

Maayan-s commented 1 year ago

Hi @bruno-ribeirodasilva, I assume that this issue can be re-assigned. Are you interested in giving it a shot?

rajkstats commented 1 year ago

@bruno-ribeirodasilva I wasn't able to pick this up, but I still plan to. Feel free to give it a shot, as @Maayan-s suggested.

nandubatchu commented 11 months ago

@Maayan-s has this progressed? Is there a way to use Elementary with the dbt-glue adapter?