astronomer / astronomer-cosmos

Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
https://astronomer.github.io/astronomer-cosmos/
Apache License 2.0
767 stars 170 forks

[Feature] Store dbt artifacts (namely, manifest.json, catalog.json and run_results.json) after every dbt run. #1292

Open victormacaubas opened 3 weeks ago

victormacaubas commented 3 weeks ago

Description

Hi all,

I’ve noticed some similar requests, but I wanted to ask if it’s possible to persist dbt artifacts (namely, manifest.json, catalog.json and run_results.json) permanently after every run, regardless of whether we use TestBehavior.AFTER_ALL or TestBehavior.AFTER_EACH.

Use case/motivation

I’d like to be able to send these run results to both Metaplane and Atlan for data observability analytics.

Related issues

https://github.com/astronomer/astronomer-cosmos/issues/1253 https://github.com/astronomer/astronomer-cosmos/issues/801

Are you willing to submit a PR?

pankajkoti commented 3 weeks ago

Hi @victormacaubas, thanks for requesting this feature. Would it help if we started uploading these to a remote_target_path? We introduced this config in PR #1224, but at the moment we only upload files from the target -> compiled directory to the remote_target_path, and only for ExecutionMode.AIRFLOW_ASYNC. We just logged issue #1293 and would appreciate any comments there or here on how these artifacts would help you and, if we start supporting them, which files matter most. It would also be nice if you had the time to help contribute support for one or two of these :)

victormacaubas commented 3 weeks ago

Hello @pankajkoti

It would be even better if these artifacts could be persisted in a remote_target_path, like S3! I’m currently running in ExecutionMode.LOCAL and parsing the project from manifest.json (my project is quite large, so using dbt_ls was problematic). I need to send these results to Atlan and Metaplane for alerting and data observability, so storing them externally would be ideal.
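To make the request concrete, here is a rough sketch of the behavior I have in mind, not an existing Cosmos API. The function name `persist_artifacts`, its parameters, and the per-run folder layout are all hypothetical, and a plain local copy stands in for the S3 upload. It assumes dbt's default layout, where the artifacts are written to the project's target directory.

```python
import shutil
from pathlib import Path

# Artifact names from this issue; dbt writes them under <project>/target by default.
ARTIFACTS = ("manifest.json", "catalog.json", "run_results.json")


def persist_artifacts(project_dir: str, destination: str, run_id: str) -> list[str]:
    """Copy whichever dbt artifacts exist into destination/run_id.

    Hypothetical helper: `destination` is a local directory standing in for a
    remote location such as an S3 prefix. Returns the names that were copied.
    """
    target_dir = Path(project_dir) / "target"
    out_dir = Path(destination) / run_id
    out_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in ARTIFACTS:
        source = target_dir / name
        # Skip missing files: catalog.json, for example, is only produced
        # by `dbt docs generate`, not by a plain run.
        if source.is_file():
            shutil.copy2(source, out_dir / name)
            copied.append(name)
    return copied
```

Keying the output folder on a run identifier keeps every run's artifacts instead of overwriting the previous ones, which is what the observability tools would need.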