elementary-data / elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
https://www.elementary-data.com/
Apache License 2.0

edr send-report ignores custom schema #1462

Open adamcunnington-mlg opened 6 months ago

adamcunnington-mlg commented 6 months ago

Describe the bug

I use a pretty typical custom schema setup (a pattern described in the dbt docs) whereby dbt invocations against the prod target use the schema set in my project's yml files (the custom schema), but in non-prod the target schema (the default schema) is used and the custom schema is ignored.

My target schema is TRANSFORMED_elementary. I never want this to be used in prod. My custom schema is DBT_ELEMENTARY.
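For readers unfamiliar with this pattern, the schema resolution behaves roughly like the sketch below. It mirrors the logic of dbt's built-in generate_schema_name_for_env macro, written in Python rather than Jinja so it can be run on its own. The function name resolve_schema and the target names "prod" and "dev" are illustrative assumptions, not part of dbt or elementary.

```python
from typing import Optional


def resolve_schema(target_name: str, target_schema: str,
                   custom_schema: Optional[str]) -> str:
    """Approximate the env-based custom schema pattern from the dbt docs.

    Hypothetical helper for illustration only: in a real project this
    logic lives in a Jinja macro, not in Python.
    """
    # In prod, a configured custom schema replaces the target schema.
    if target_name == "prod" and custom_schema is not None:
        return custom_schema.strip()
    # Everywhere else, the custom schema is ignored.
    return target_schema


if __name__ == "__main__":
    print(resolve_schema("prod", "TRANSFORMED_elementary", "DBT_ELEMENTARY"))
    print(resolve_schema("dev", "TRANSFORMED_elementary", "DBT_ELEMENTARY"))
```

The important point for this issue is that this resolution only happens inside a dbt project that contains the macro.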

In prod, the dbt invocations reflect this and the edr tables correctly build in the DBT_ELEMENTARY dataset, but edr send-report does not seem to. I don't know if this affects other edr commands too. When the edr send-report command runs, it tries to populate tables in TRANSFORMED_elementary that don't exist. I thought this might be because the target name wasn't respected (but I guess you use the default set for the elementary profile), so I adapted the custom schema macro logic to explicitly use the custom schema (DBT_ELEMENTARY) when target.profile_name = elementary, but it has no effect.

My conclusion is edr send-report only uses the values from profiles.yml and so it's bypassing/ignoring my override within the dbt macro. That doesn't sound quite right though because surely edr send-report is running a dbt run command under the hood?

Please confirm my understanding. Is there a workaround? Seems like a big omission so I assume I'm missing something!

IDoneShaveIt commented 5 months ago

Hey @adamcunnington-mlg, you are right: the edr commands use only the values from your profiles.yml. The dbt macro that you override is part of your dbt project only. When you run edr commands, you are running a Python package that uses an internal dbt project to access the artifacts that the elementary dbt package uploads to the schema you define in your dbt project. The internal dbt project that edr uses is not aware of your dbt macro that sets the schema.

My suggestion is to have 2 targets for the elementary profile you create:

  1. Points to the dev schema (TRANSFORMED_elementary)
  2. Points to prod schema (DBT_ELEMENTARY)

Then, when you run edr on prod, pass the target that points to the right schema 🙂
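As a rough sketch of this suggestion, the wrapper below picks the target explicitly when calling edr from Python (for example from an orchestrator task). It assumes the elementary profile in profiles.yml has two outputs named dev and prod whose schema values are TRANSFORMED_elementary and DBT_ELEMENTARY; those target names, the TARGET_SCHEMAS mapping, and the helper functions are hypothetical. The --profile-target option is the edr flag for choosing a target. Other options that send-report needs (such as the Slack or storage settings) are omitted here.

```python
import subprocess

# Hypothetical target names; they must match the keys under `outputs`
# of the `elementary` profile in profiles.yml. The schemas are listed
# only to document which target points where.
TARGET_SCHEMAS = {
    "dev": "TRANSFORMED_elementary",
    "prod": "DBT_ELEMENTARY",
}


def build_send_report_command(target: str) -> list:
    """Build the edr command line for the given profile target."""
    if target not in TARGET_SCHEMAS:
        raise ValueError(f"Unknown elementary target: {target!r}")
    # edr reads the schema from the selected target in profiles.yml,
    # not from any generate_schema_name override in the dbt project.
    return ["edr", "send-report", "--profile-target", target]


def send_report(target: str) -> None:
    """Run edr send-report against the chosen target; raises on failure."""
    subprocess.run(build_send_report_command(target), check=True)
```

In a prod job this would be called as send_report("prod"), so edr reads from DBT_ELEMENTARY regardless of the profile's default target.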

adamcunnington-mlg commented 5 months ago

Thanks for the response. As I said in Slack, for the benefit of those looking at this issue, please can I suggest:

oynek commented 4 months ago

@adamcunnington-mlg Yep, came across this problem today as well using Prefect. Should definitely have been explained in the docs.