Open gbmarc1 opened 12 months ago
Thanks for reporting this @gbmarc1
It sounds like this didn't work for you:
def model(dbt, session):
    my_sql_model_df = dbt.source("safe_content_moderation", "safe_content_moderation")
    final_df = my_sql_model_df
    return final_df
But this did work:
def model(dbt, session):
    dbt.config(
        submission_method="cluster",
        dataproc_cluster_name="ml-adhoc-dataproc-us-central1",
    )
    my_sql_model_df = dbt.source("safe_content_moderation", "safe_content_moderation")
    final_df = my_sql_model_df
    return final_df
Did you happen to try either of these as well? This could help nail down where the missing piece(s) might be.
Configuring submission_method only:
def model(dbt, session):
    dbt.config(
        submission_method="cluster",
    )
    my_sql_model_df = dbt.source("safe_content_moderation", "safe_content_moderation")
    final_df = my_sql_model_df
    return final_df
Or configuring dataproc_cluster_name only:
def model(dbt, session):
    dbt.config(
        dataproc_cluster_name="ml-adhoc-dataproc-us-central1",
    )
    my_sql_model_df = dbt.source("safe_content_moderation", "safe_content_moderation")
    final_df = my_sql_model_df
    return final_df
Hello, thanks for looking at this! :)
It seems the profile's submission_method gets ignored.
Thanks @gbmarc1 -- that gives us the info we need 👍
As noted in the original issue, dbt should use the cluster submission method (rather than serverless) when using the following project files:
profiles.yml
ml:
  target: dev
  outputs:
    dev: &dev_config
      type: bigquery
      dataset: "{{ env_var('USER') }}"
      project: shopify-ml-adhoc
      priority: interactive
      method: oauth
      location: US
      job_execution_timeout_seconds: 600
      job_retries: 1
      threads: 2
      submission_method: cluster
      dataproc_region: us-central1
      gcs_bucket: ml-adhoc-dataproc-jobs
      dataproc_cluster_name: ml-adhoc-dataproc-us-central1
models/my_model
def model(dbt, session):
    dbt.config(
        dataproc_cluster_name="ml-adhoc-dataproc-us-central1",
    )
    my_sql_model_df = dbt.source("safe_content_moderation", "safe_content_moderation")
    final_df = my_sql_model_df
    return final_df
Is this a new bug in dbt-bigquery?
Current Behavior
I have the following profile. I want the job to be created on the cluster named in the profile, but it always ends up as a serverless batch.
This is the model. If I uncomment the dbt.config, it works properly, but I want this config in the profile, not in the model itself.
Expected Behavior
The profile config is respected and the job is executed in the cluster.
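The expected precedence can be sketched as below. This is only an illustration of the lookup order described here, not dbt-bigquery's actual implementation: `resolve_setting` is a hypothetical helper, the plain dictionaries stand in for the model config and the profile target, and the "serverless" fallback is an assumption based on the behavior observed in this issue.

```python
from typing import Any, Mapping, Optional


def resolve_setting(
    key: str,
    model_config: Mapping[str, Any],
    profile_target: Mapping[str, Any],
    default: Optional[Any] = None,
) -> Any:
    """Hypothetical helper: a value set in the model wins, otherwise fall
    back to the profile target, otherwise use the default."""
    if key in model_config:
        return model_config[key]
    return profile_target.get(key, default)


# Values copied from the profiles.yml target in this issue.
profile_target = {
    "submission_method": "cluster",
    "dataproc_region": "us-central1",
    "gcs_bucket": "ml-adhoc-dataproc-jobs",
    "dataproc_cluster_name": "ml-adhoc-dataproc-us-central1",
}

# The model calls no dbt.config(...), so nothing is overridden.
model_config: dict = {}

# Assumed fallback of "serverless" when neither place sets the method.
method = resolve_setting(
    "submission_method", model_config, profile_target, default="serverless"
)
cluster = resolve_setting("dataproc_cluster_name", model_config, profile_target)

print(method, cluster)
```

With this lookup order, a model without any `dbt.config(...)` call would pick up `cluster` and the cluster name from the profile, while an explicit value in the model would still take priority.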
Steps To Reproduce
dbt run
Relevant log output
Environment
Additional Context
No response