jelstongreen opened 1 year ago
The only similar thing I have found is: https://github.com/dbt-labs/dbt-core/issues/7312
Thanks for reporting @jelstongreen! Let me look into this one. I've seen this before when invoking the dbt runner, but I can't remember what the fix was 🤔.
@jelstongreen this typically happens when you have some nested macro evaluation hidden somewhere, because that triggers the full dbt jinja environment rendering.
Do you happen to have any custom macros embedded in your profiles.yml, or non-standard path configuration overrides for your project or profiles paths? We should support env_var() but anything more than that might cause this behavior, and our path requirements are a bit stricter than dbt's. The API we're relying on for dbt adapter initialization does not use the full macro/project aware runtime config, so we can't run all of the flexible parsing stuff dbt normally provides.
If you run `dbt debug` from that project root, what output do you see?
We definitely have some non-standard path configurations, as we have a multi-project repo with some upper directories re-used across multiple projects via model-paths. Something like:
```yaml
model-paths:
  - models
  - ../../plugins/plugin_a/core/models
  - ../../plugins/plugin_a/oegb/models
  - ../../plugins/core/utils/models
```
We also have macros in the dbt_project.yml, like:
```yaml
# Elementary vars stopping artefact upload on dev schemas
disable_run_results: "{{ (target.name not in ['prod','circleci_prod']) | as_bool }}"
disable_tests_results: "{{ (target.name not in ['prod','circleci_prod']) | as_bool }}"
disable_dbt_artifacts_autoupload: "{{ (target.name not in ['prod','circleci_prod']) | as_bool }}"
disable_dbt_invocation_autoupload: "{{ (target.name not in ['prod','circleci_prod']) | as_bool }}"
```
And some use of `env_var` in profiles.yml:
```yaml
oegb:
  outputs:
    dev:
      host: my_host.cloud.databricks.com
      http_path: /sql/1.0/endpoints/xxx
      schema: "{{ env_var('DATABRICKS_TARGET') }}"
      threads: 20
      token: xxx
      catalog: "{{ env_var('DATABRICKS_CATALOG', 'hive_metastore') }}"
      type: databricks
      retry_all: true
      connect_timeout: 5
      connect_retries: 3
```
Would that cause the issue maybe?
@jelstongreen Let me see if I can repro.
@jelstongreen we haven't been able to repro with the macros - we use `env_var` in our CI, and @Jstein77 added some `target.name` macro references to our template project, and everything seemed to just work. Do you have any others in there? Or do you, by any chance, have a local override of env_var() for some reason?
I'm not familiar enough with dbt core to know if multi-project model path resolution would trigger this issue. It might. We'll dig into this a bit further and let you know what we come up with. If you feel ambitious, and if it's even possible, you might try removing the multi-path setup just for a local run and see if you can get the project info, but I suspect that'll break for other reasons.
One thing you might try - if you're able to modify the code in the metricflow package, update our dbtRunner invocation to use `parse` instead of `debug` and see if that solves your problem. Since the original error is thrown out of load_profile, it might just work, because parse should initialize the full macro suite while debug does not.
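For reference, here's a minimal sketch of what that swap might look like using dbt's programmatic `dbtRunner` API (available in dbt-core 1.5+). The exact call site inside the metricflow package isn't reproduced here, so treat this as an illustration of the idea rather than a ready-made patch:

```python
# Sketch only: swapping a "debug" invocation for "parse" via dbt's programmatic API.
# The real call inside metricflow may live elsewhere and carry extra arguments.
from dbt.cli.main import dbtRunner, dbtRunnerResult

runner = dbtRunner()

# Roughly what the current behavior does: a lightweight command that loads the profile.
# result: dbtRunnerResult = runner.invoke(["debug"])

# The experiment: "parse" builds the full manifest, which should initialize the
# project's macro context before the profile/project YAML gets rendered.
result: dbtRunnerResult = runner.invoke(["parse"])

if not result.success:
    raise RuntimeError(f"dbt parse failed: {result.exception}")
```

If the parse-based invocation gets past load_profile, that would support the theory that the failure comes from the reduced rendering context used by the lighter-weight path.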
If dbt parse works we might be able to provide an option to do a full parse on demand, and that'd at least give you a work-around (while also improving quality of life for addressing the "I forgot to run dbt parse after updating my YAML configs and now everything is broken" scenario).
Hi all, just to jump on to the thread: I have been able to reproduce the errors that @jelstongreen described, where dbt-metricflow is unable to reference any of the global flags when loading the profile. I believe it is directly related to our project structure, where we have multiple dbt projects within the `datalake-models/projects/` folder and `datalake-models` is the root of our project. I am able to run the metricflow `mf` commands from the root directory as expected if I move an existing project file and a pre-generated semantic manifest to the root directory, as follows:

```
datalake-models/dbt_project.yml
datalake-models/target/semantic_manifest.json
```
Result:
```
(datalake-models) mady.daby@OCTO-MAC-LWDQY6LWNP datalake-models % mf list metrics
✔ 🌱 We've found 1 metrics.
The list below shows metrics in the format of "metric_name: list of available dimensions"
• device_count_metric: core_device_status__date_day, core_device_status__device_type, core_device_status__lifecycle_status, core_device_status__provider, core_device_status__time_of_day and 1 more
(datalake-models) mady.daby@OCTO-MAC-LWDQY6LWNP datalake-models % mf list dimensions --metrics device_count_metric
✔ 🌱 We've found 6 common dimensions for metrics ['device_count_metric'].
• core_device_status__date_day
• core_device_status__device_type
• core_device_status__lifecycle_status
• core_device_status__provider
• core_device_status__time_of_day
• metric_time
```
Therefore I don't think it is to do with the macros and relative paths that we have in our dbt_project.yml, but rather with the structure of our project.
Ideally we would like to be able to run the `mf` commands from the root directory while also specifying a project dir as an argument, like we do with the `dbt` commands, e.g. `dbt clean --project-dir projects/${PROJECT}/`.
Also, modifying the metricflow package to run a parse instead of debug resulted in the same error messages.
@MadyDaby ah, thank you for the investigation and update with repro cases. @Jstein77 I think this might resolve to the same issue as #643 in which case we should merge them, provide more detail on what we need for #643 (flags, basically), and have that open for contribution.
If this is separate we can open this up - adding flags to the CLI for this stuff shouldn't be too hard today, it's more a question of where they belong and what the command structure itself should look like.
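For context, the sort of flag plumbing involved might look roughly like the sketch below, assuming a Click-based command group. The command and option names here are hypothetical and are not metricflow's actual CLI code; this is only meant to show the shape of a `--project-dir` style override:

```python
# Hypothetical sketch of --project-dir / --profiles-dir overrides on a Click CLI.
# None of these names come from metricflow itself.
from typing import Optional

import click


@click.group()
@click.option(
    "--project-dir",
    type=click.Path(exists=True, file_okay=False),
    default=".",
    help="Directory containing dbt_project.yml.",
)
@click.option(
    "--profiles-dir",
    type=click.Path(exists=True, file_okay=False),
    default=None,
    help="Directory containing profiles.yml (falls back to dbt's usual lookup).",
)
@click.pass_context
def cli(ctx: click.Context, project_dir: str, profiles_dir: Optional[str]) -> None:
    """Store the resolved paths so subcommands can hand them to the dbt runner."""
    ctx.ensure_object(dict)
    ctx.obj["project_dir"] = project_dir
    ctx.obj["profiles_dir"] = profiles_dir


@cli.command("list-metrics")
@click.pass_context
def list_metrics(ctx: click.Context) -> None:
    """Placeholder subcommand: would load the project from the chosen directory."""
    click.echo(f"Would load dbt project from {ctx.obj['project_dir']}")


if __name__ == "__main__":
    cli()
```

Usage would look something like `python cli.py --project-dir projects/my_project list-metrics`; the harder question, as noted above, is where such flags belong in the real command structure.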
Is this a new bug in metricflow?
Current Behavior
When running any `mf` command I receive an error. If I comment out this section of code in the package, I get a similar error for profiles.
Expected Behavior
`mf` to work.
Steps To Reproduce
```
11:00:25 Running with dbt=1.6.0
11:00:25 dbt version: 1.6.0
11:00:25 python version: 3.11.2
11:00:26 adapter type: databricks
11:00:26 adapter version: 1.6.1
```

```
mf list dimensions --metrics charged_consumption
```
Relevant log output
No response
Environment
Which database are you using?
other (mention it in "Additional Context")
Additional Context
Databricks