dbt-labs / dbt-bigquery

dbt-bigquery contains all of the code required to make dbt operate on a BigQuery database.
https://github.com/dbt-labs/dbt-bigquery
Apache License 2.0
217 stars, 153 forks

[ADAP-769] [Regression] bq jobs labeling doesn't work on 1.6.0 #863

Closed ddovbii closed 1 year ago

ddovbii commented 1 year ago

Is this a new bug in dbt-bigquery?

Current Behavior

Hello!

I have a macro that reads a list of environment variables and builds a dictionary, which is later picked up by the query-comment job-label feature.

macros/bq_labels.sql:

{% macro bq_labels() %}{
    "system": "{{ env_var('LABEL_SYSTEM', 'airflow') }}",
    "owner": "{{ env_var('LABEL_OWNER', 'unknown') }}",
    "envtype": "{{ env_var('LABEL_ENV', 'dev') }}",
    "dag_id": "{{ env_var('LABEL_DAG_ID', 'unknown') }}"
}{% endmacro %}

dbt_project.yml:

...
macro-paths:
  - macros
query-comment:
  comment: '{{ bq_labels() }}'
  job-label: true
  append: true

It works perfectly when using dbt-bigquery==1.5.3. When I run the bq show command for my job, I can see the following output in the "Labels" column:

owner:monideep
system:airflow
envtype:local
dbt_invocation_id:58a6c0f6-3001-4c2c-8a73-84847f95662d
dag_id:demo-dag_test_dbt_dag_mde

But when using 1.6.0, it looks like it can't "parse" the macro. The output is the following:

dbt_invocation_id:fbf2d221-dd52-4baf-851a-9de0933689e5
query_comment:___bq_labels_____
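
The two outputs above are consistent with a "parse as JSON, otherwise sanitize the raw string" fallback. The following is a minimal sketch of that behavior, not the adapter's actual code: the names `sanitize_label` and `labels_from_comment` and the exact regular expression are assumptions chosen to reproduce the values shown in this issue, and the 63-character cap reflects BigQuery's limit on label keys and values.

```python
import json
import re
from typing import Dict

# Assumption: BigQuery caps label keys and values at 63 characters.
LABEL_LENGTH_LIMIT = 63
# Assumption: anything outside lowercase letters, digits, "_" and "-" is replaced.
ILLEGAL_LABEL_CHARS = re.compile(r"[^a-z0-9_-]")


def sanitize_label(value: str) -> str:
    """Lowercase the value, replace illegal characters with "_", then truncate."""
    value = value.strip().lower()
    return ILLEGAL_LABEL_CHARS.sub("_", value)[:LABEL_LENGTH_LIMIT]


def labels_from_comment(comment: str) -> Dict[str, str]:
    """Turn a query comment into job labels.

    A JSON object becomes one label per key. Anything else falls back to a
    single "query_comment" label holding the sanitized raw string.
    """
    try:
        parsed = json.loads(comment)
    except (TypeError, ValueError):
        return {"query_comment": sanitize_label(comment)}
    if not isinstance(parsed, dict):
        return {"query_comment": sanitize_label(comment)}
    return {sanitize_label(k): sanitize_label(str(v)) for k, v in parsed.items()}


# Rendered macro output: valid JSON, so each key becomes its own label.
rendered = '{"system": "airflow", "owner": "unknown", "envtype": "dev", "dag_id": "unknown"}'
print(labels_from_comment(rendered))

# Unrendered Jinja: not JSON, so the raw template text is sanitized instead.
print(labels_from_comment("{{ bq_labels() }}"))
```

Under these assumptions, the unrendered string `{{ bq_labels() }}` maps to exactly the `query_comment:___bq_labels_____` label reported for 1.6.0, which suggests the comment reaches the labeling step before Jinja rendering.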

Expected Behavior

The labels are parsed correctly, like in 1.5.3

Steps To Reproduce

  1. Create a macro as described
  2. Update dbt_project.yml as described
  3. Run dbt run command

Relevant log output

No response

Environment

- OS: macos 13.4.1, ubuntu 20.04
- Python: 3.8
- dbt-core: 1.6.0
- dbt-bigquery: 1.6.0

Additional Context

No response

dbeatty10 commented 1 year ago

Thanks for reporting this @ddovbii !

I was able to reproduce the regression you reported locally using the following commands (assuming zsh on macOS and that the bq CLI is installed):

### Reprex

Activate an environment using dbt-bigquery 1.5 and run:

```shell
dbt run --full-refresh
last_bq_job_id=$(grep -oE '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}' logs/dbt.log | tail -n 1)
bq show --job=true "$last_bq_job_id"
```

Output:

```
  Job Type    State       Start Time          Duration             User Email          Bytes Processed   Bytes Billed   Billing Tier                           Labels
 ---------- --------- ----------------- ---------------- ------------------------- ----------------- -------------- -------------- --------------------------------------------------------
  query      SUCCESS   06 Aug 16:31:35   0:00:00.118000   doug.beatty@dbtlabs.com   0                 0              0              owner:unknown
                                                                                                                                    system:airflow
                                                                                                                                    envtype:dev
                                                                                                                                    dbt_invocation_id:8b68368d-51b7-4281-8a3a-d50d283f7bd4
                                                                                                                                    dag_id:unknown
```

Activate an environment using dbt-bigquery 1.6 and run:

```shell
dbt run --full-refresh
last_bq_job_id=$(grep -oE '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}' logs/dbt.log | tail -n 1)
bq show --job=true "$last_bq_job_id"
```

Output:

```
  Job Type    State       Start Time          Duration             User Email          Bytes Processed   Bytes Billed   Billing Tier                           Labels
 ---------- --------- ----------------- ---------------- ------------------------- ----------------- -------------- -------------- --------------------------------------------------------
  query      SUCCESS   06 Aug 16:53:34   0:00:00.136000   doug.beatty@dbtlabs.com   0                 0              0              dbt_invocation_id:4f25a098-8b11-4be5-8025-1dcc17177f09
                                                                                                                                    query_comment:___bq_labels_____
```

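For anyone without `grep`/`tail` at hand, the job-ID extraction step in the reprex can be sketched in Python. This is only an illustration: the function name `last_uuid` and the sample log lines are hypothetical, and, like the shell pipeline, it assumes the last UUID-shaped string in `logs/dbt.log` is the BigQuery job ID.

```python
import re
from typing import Optional

# Same pattern as the grep -oE expression in the reprex (lowercase hex UUID).
UUID_PATTERN = re.compile(r"[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}")


def last_uuid(log_text: str) -> Optional[str]:
    """Return the last UUID-shaped match in the text, or None if there is none."""
    matches = UUID_PATTERN.findall(log_text)
    return matches[-1] if matches else None


# Hypothetical log excerpt; real dbt.log lines look different.
sample_log = (
    "invocation 8b68368d-51b7-4281-8a3a-d50d283f7bd4 started\n"
    "submitted job 00000000-0000-4000-8000-000000000001\n"
)
print(last_uuid(sample_log))

# To run against a real log file:
# from pathlib import Path
# print(last_uuid(Path("logs/dbt.log").read_text()))
```

The returned value is what the reprex passes to `bq show --job=true`.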
dbeatty10 commented 1 year ago

I haven't confirmed it, but the changes in #707 related to query_comment might be involved.

aranke commented 1 year ago

I tend to agree with @dbeatty10. Specifically, it looks like the changes in these lines might be responsible:

https://github.com/dbt-labs/dbt-bigquery/pull/707/files#diff-e4d9ba3b4b6c6c5709431db14344ec0e23226f700e9250f819247ffbb6b112acL423-R424

mikealfare commented 1 year ago

@aranke Are you suggesting that reverting just those two lines would resolve this? Or is there more to it?

mikealfare commented 1 year ago

Reverting those two lines did not resolve the issue. It looks like we're getting the unrendered query comment from the profile. I'll keep digging.

mikealfare commented 1 year ago

I've spent a decent amount of time troubleshooting this and could not find the root cause. I was able to write a test that demonstrates this regression in the attached PR. However, when I run that test against 1.5.3, the functionality it calls doesn't work either; it actually throws an exception from attempting to run the underlying code. My guess is that something changed in one of the dependencies (google-cloud-* or dbt-core) and we're picking up a newer patch version that introduces a case that was never tested.

@dbeatty10 When you tested this on 1.5, did you use a fresh install, or did you reuse an environment that you already had? If it's the former, that would rule out my hypothesis. If it's the latter, then I'm really not sure how this was ever working on 1.5.3, despite knowing that both you and the OP got it to run successfully. In the meantime, could you run a pip list on your 1.5 and 1.6 environments and throw them up here? That would help us rule out packages.

@Fleid I think we need to slot this as sprint work as it would involve testing different combinations of dependencies to reproduce. Let me know what you think.

Fleid commented 1 year ago

@mikealfare fair, and thanks a lot for digging into it. Let's see what @dbeatty10 has to say about this ;)

dbeatty10 commented 1 year ago

@mikealfare It's no longer working in 1.4.x, 1.5.x, or 1.6.x for me. I don't recall if I updated my virtual environments related to 1.5.x since my reprex or not 🤷 See below for pip list and pip freeze and my most recent output across those three environments.

### Reprex

Activate an environment using dbt-bigquery 1.4

```shell
deactivate
source ~/projects/environments/bigquery_1.4.0/bin/activate
dbt --version
python -m pip list > pip_list_bigquery_1.4
python -m pip freeze > requirements_bigquery_1.4.txt
```

```shell
python -m pip list | pbcopy
```

```
Package                  Version
------------------------ ---------
agate                    1.6.3
attrs                    22.2.0
Babel                    2.11.0
betterproto              1.2.5
cachetools               5.3.0
certifi                  2022.12.7
cffi                     1.15.1
charset-normalizer       3.0.1
click                    8.1.3
colorama                 0.4.6
dbt-bigquery             1.4.0
dbt-core                 1.4.1
dbt-extractor            0.4.1
future                   0.18.3
google-api-core          2.11.0
google-auth              2.16.0
google-cloud-bigquery    3.4.2
google-cloud-core        2.3.2
google-cloud-dataproc    5.3.0
google-cloud-storage     2.7.0
google-crc32c            1.5.0
google-resumable-media   2.4.1
googleapis-common-protos 1.58.0
grpcio                   1.51.1
grpcio-status            1.48.2
grpclib                  0.4.3
h2                       4.1.0
hologram                 0.0.15
hpack                    4.0.0
hyperframe               6.0.1
idna                     3.4
isodate                  0.6.1
Jinja2                   3.1.2
jsonschema               3.2.0
leather                  0.3.4
Logbook                  1.5.3
MarkupSafe               2.1.2
mashumaro                3.3.1
minimal-snowplow-tracker 0.0.2
msgpack                  1.0.4
multidict                6.0.4
networkx                 2.8.8
packaging                21.3
parsedatetime            2.4
pathspec                 0.10.3
pip                      23.2.1
proto-plus               1.22.2
protobuf                 3.20.3
pyasn1                   0.4.8
pyasn1-modules           0.2.8
pycparser                2.21
pyparsing                3.0.9
pyrsistent               0.19.3
python-dateutil          2.8.2
python-slugify           7.0.0
pytimeparse              1.1.8
pytz                     2022.7.1
PyYAML                   6.0
requests                 2.28.2
rsa                      4.9
setuptools               58.1.0
six                      1.16.0
sqlparse                 0.4.3
stringcase               1.2.0
text-unidecode           1.3
typing_extensions        4.4.0
urllib3                  1.26.14
Werkzeug                 2.2.2
```

```shell
python -m pip freeze | pbcopy
```

```
agate==1.6.3
attrs==22.2.0
Babel==2.11.0
betterproto==1.2.5
cachetools==5.3.0
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.0.1
click==8.1.3
colorama==0.4.6
dbt-bigquery==1.4.0
dbt-core==1.4.1
dbt-extractor==0.4.1
future==0.18.3
google-api-core==2.11.0
google-auth==2.16.0
google-cloud-bigquery==3.4.2
google-cloud-core==2.3.2
google-cloud-dataproc==5.3.0
google-cloud-storage==2.7.0
google-crc32c==1.5.0
google-resumable-media==2.4.1
googleapis-common-protos==1.58.0
grpcio==1.51.1
grpcio-status==1.48.2
grpclib==0.4.3
h2==4.1.0
hologram==0.0.15
hpack==4.0.0
hyperframe==6.0.1
idna==3.4
isodate==0.6.1
Jinja2==3.1.2
jsonschema==3.2.0
leather==0.3.4
Logbook==1.5.3
MarkupSafe==2.1.2
mashumaro==3.3.1
minimal-snowplow-tracker==0.0.2
msgpack==1.0.4
multidict==6.0.4
networkx==2.8.8
packaging==21.3
parsedatetime==2.4
pathspec==0.10.3
proto-plus==1.22.2
protobuf==3.20.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.21
pyparsing==3.0.9
pyrsistent==0.19.3
python-dateutil==2.8.2
python-slugify==7.0.0
pytimeparse==1.1.8
pytz==2022.7.1
PyYAML==6.0
requests==2.28.2
rsa==4.9
six==1.16.0
sqlparse==0.4.3
stringcase==1.2.0
text-unidecode==1.3
typing_extensions==4.4.0
urllib3==1.26.14
Werkzeug==2.2.2
```

```shell
dbt run --full-refresh
last_bq_job_id=$(grep -oE '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}' logs/dbt.log | tail -n 1)
bq show --job=true "$last_bq_job_id"
```

Output:

```
  Job Type    State       Start Time          Duration             User Email          Bytes Processed   Bytes Billed   Billing Tier                                      Labels
 ---------- --------- ----------------- ---------------- ------------------------- ----------------- -------------- -------------- -------------------------------------------------------------------------------
  query      SUCCESS   11 Aug 07:22:57   0:00:00.157000   doug.beatty@dbtlabs.com   0                 0              0              dbt_invocation_id:f2879858-379b-40e2-95a8-888deb9d3f95
                                                                                                                                    query_comment:__system_______env_var__label_system____airflow_________owner__
```

Activate an environment using dbt-bigquery 1.5

```shell
deactivate
source ~/projects/environments/bigquery_1.5/bin/activate
dbt --version
python -m pip list > pip_list_bigquery_1.5
python -m pip freeze > requirements_bigquery_1.5.txt
```

```shell
python -m pip list | pbcopy
```

```
Package                   Version
------------------------- -----------
agate                     1.6.3
attrs                     23.1.0
Babel                     2.12.1
cachetools                5.3.1
certifi                   2023.7.22
cffi                      1.15.1
charset-normalizer        3.2.0
click                     8.1.3
colorama                  0.4.6
dbt-bigquery              1.5.5
dbt-core                  1.5.4
dbt-extractor             0.4.1
future                    0.18.3
google-api-core           2.12.0.dev0
google-auth               2.22.0
google-cloud-bigquery     3.11.4
google-cloud-core         2.3.3
google-cloud-dataproc     5.4.3
google-cloud-storage      2.10.0
google-crc32c             1.5.0
google-resumable-media    2.5.0
googleapis-common-protos  1.60.0
grpc-google-iam-v1        0.12.6
grpcio                    1.57.0rc1
grpcio-status             1.57.0rc1
hologram                  0.0.16
idna                      3.4
importlib-resources       6.0.0
isodate                   0.6.1
Jinja2                    3.1.2
jsonschema                4.18.6
jsonschema-specifications 2023.7.1
leather                   0.3.4
Logbook                   1.5.3
MarkupSafe                2.1.3
mashumaro                 3.6
minimal-snowplow-tracker  0.0.2
msgpack                   1.0.5
networkx                  2.8.8
packaging                 23.1
parsedatetime             2.4
pathspec                  0.11.2
pip                       23.2.1
pkgutil_resolve_name      1.3.10
proto-plus                1.22.3
protobuf                  4.24.0rc3
pyasn1                    0.5.0
pyasn1-modules            0.3.0
pycparser                 2.21
python-dateutil           2.8.2
python-slugify            8.0.1
pytimeparse               1.1.8
pytz                      2023.3
PyYAML                    6.0.1
referencing               0.30.1
requests                  2.31.0
rpds-py                   0.9.2
rsa                       4.9
setuptools                56.0.0
six                       1.16.0
sqlparse                  0.4.4
text-unidecode            1.3
typing_extensions         4.7.1
urllib3                   1.26.16
Werkzeug                  2.3.6
zipp                      3.16.2
```

```shell
python -m pip freeze | pbcopy
```

```
agate==1.6.3
attrs==23.1.0
Babel==2.12.1
cachetools==5.3.1
certifi==2023.7.22
cffi==1.15.1
charset-normalizer==3.2.0
click==8.1.3
colorama==0.4.6
dbt-bigquery==1.5.5
dbt-core==1.5.4
dbt-extractor==0.4.1
future==0.18.3
google-api-core==2.12.0.dev0
google-auth==2.22.0
google-cloud-bigquery==3.11.4
google-cloud-core==2.3.3
google-cloud-dataproc==5.4.3
google-cloud-storage==2.10.0
google-crc32c==1.5.0
google-resumable-media==2.5.0
googleapis-common-protos==1.60.0
grpc-google-iam-v1==0.12.6
grpcio==1.57.0rc1
grpcio-status==1.57.0rc1
hologram==0.0.16
idna==3.4
importlib-resources==6.0.0
isodate==0.6.1
Jinja2==3.1.2
jsonschema==4.18.6
jsonschema-specifications==2023.7.1
leather==0.3.4
Logbook==1.5.3
MarkupSafe==2.1.3
mashumaro==3.6
minimal-snowplow-tracker==0.0.2
msgpack==1.0.5
networkx==2.8.8
packaging==23.1
parsedatetime==2.4
pathspec==0.11.2
pkgutil_resolve_name==1.3.10
proto-plus==1.22.3
protobuf==4.24.0rc3
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
python-dateutil==2.8.2
python-slugify==8.0.1
pytimeparse==1.1.8
pytz==2023.3
PyYAML==6.0.1
referencing==0.30.1
requests==2.31.0
rpds-py==0.9.2
rsa==4.9
six==1.16.0
sqlparse==0.4.4
text-unidecode==1.3
typing_extensions==4.7.1
urllib3==1.26.16
Werkzeug==2.3.6
zipp==3.16.2
```

```shell
dbt run --full-refresh
last_bq_job_id=$(grep -oE '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}' logs/dbt.log | tail -n 1)
bq show --job=true "$last_bq_job_id"
```

```
  Job Type    State       Start Time          Duration             User Email          Bytes Processed   Bytes Billed   Billing Tier                           Labels
 ---------- --------- ----------------- ---------------- ------------------------- ----------------- -------------- -------------- --------------------------------------------------------
  query      SUCCESS   11 Aug 07:38:06   0:00:00.138000   doug.beatty@dbtlabs.com   0                 0              0              dbt_invocation_id:71ac120a-a308-43ca-9db5-5b08dfe5f3fe
                                                                                                                                    query_comment:___bq_label_macro_name_____
```

Activate an environment using dbt-bigquery 1.6

```shell
deactivate
source ~/projects/environments/bigquery_1.6/bin/activate
dbt --version
python -m pip list > pip_list_bigquery_1.6
python -m pip freeze > requirements_bigquery_1.6.txt
```

```shell
python -m pip list | pbcopy
```

```
Package                  Version
------------------------ ---------
agate                    1.7.1
attrs                    23.1.0
Babel                    2.12.1
cachetools               5.3.1
certifi                  2023.5.7
cffi                     1.15.1
charset-normalizer       3.1.0
click                    8.1.3
colorama                 0.4.6
dbt-bigquery             1.6.1
dbt-core                 1.6.0
dbt-extractor            0.4.1
dbt-semantic-interfaces  0.2.0
future                   0.18.3
google-api-core          2.11.1
google-auth              2.20.0
google-cloud-bigquery    3.11.1
google-cloud-core        2.3.2
google-cloud-dataproc    5.4.1
google-cloud-storage     2.9.0
google-crc32c            1.5.0
google-resumable-media   2.5.0
googleapis-common-protos 1.59.1
grpc-google-iam-v1       0.12.6
grpcio                   1.56.0rc2
grpcio-status            1.56.0rc2
hologram                 0.0.16
idna                     3.4
importlib-metadata       6.6.0
isodate                  0.6.1
Jinja2                   3.1.2
jsonschema               3.2.0
leather                  0.3.4
Logbook                  1.5.3
MarkupSafe               2.1.3
mashumaro                3.8.1
minimal-snowplow-tracker 0.0.2
more-itertools           8.10.0
msgpack                  1.0.5
networkx                 2.8.8
packaging                23.1
parsedatetime            2.4
pathspec                 0.11.1
pip                      23.2.1
proto-plus               1.22.2
protobuf                 4.23.3
pyasn1                   0.5.0
pyasn1-modules           0.3.0
pycparser                2.21
pydantic                 1.10.9
pyrsistent               0.19.3
python-dateutil          2.8.2
python-slugify           8.0.1
pytimeparse              1.1.8
pytz                     2023.3
PyYAML                   6.0
requests                 2.31.0
rsa                      4.9
setuptools               56.0.0
six                      1.16.0
sqlparse                 0.4.3
text-unidecode           1.3
typing_extensions        4.6.3
urllib3                  1.26.16
Werkzeug                 2.3.6
zipp                     3.15.0
```

```shell
python -m pip freeze | pbcopy
```

```
agate==1.7.1
attrs==23.1.0
Babel==2.12.1
cachetools==5.3.1
certifi==2023.5.7
cffi==1.15.1
charset-normalizer==3.1.0
click==8.1.3
colorama==0.4.6
dbt-bigquery==1.6.1
dbt-core==1.6.0
dbt-extractor==0.4.1
dbt-semantic-interfaces==0.2.0
future==0.18.3
google-api-core==2.11.1
google-auth==2.20.0
google-cloud-bigquery==3.11.1
google-cloud-core==2.3.2
google-cloud-dataproc==5.4.1
google-cloud-storage==2.9.0
google-crc32c==1.5.0
google-resumable-media==2.5.0
googleapis-common-protos==1.59.1
grpc-google-iam-v1==0.12.6
grpcio==1.56.0rc2
grpcio-status==1.56.0rc2
hologram==0.0.16
idna==3.4
importlib-metadata==6.6.0
isodate==0.6.1
Jinja2==3.1.2
jsonschema==3.2.0
leather==0.3.4
Logbook==1.5.3
MarkupSafe==2.1.3
mashumaro==3.8.1
minimal-snowplow-tracker==0.0.2
more-itertools==8.10.0
msgpack==1.0.5
networkx==2.8.8
packaging==23.1
parsedatetime==2.4
pathspec==0.11.1
proto-plus==1.22.2
protobuf==4.23.3
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pydantic==1.10.9
pyrsistent==0.19.3
python-dateutil==2.8.2
python-slugify==8.0.1
pytimeparse==1.1.8
pytz==2023.3
PyYAML==6.0
requests==2.31.0
rsa==4.9
six==1.16.0
sqlparse==0.4.3
text-unidecode==1.3
typing_extensions==4.6.3
urllib3==1.26.16
Werkzeug==2.3.6
zipp==3.15.0
```

```shell
dbt run --full-refresh
last_bq_job_id=$(grep -oE '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}' logs/dbt.log | tail -n 1)
bq show --job=true "$last_bq_job_id"
```

```
  Job Type    State       Start Time          Duration             User Email          Bytes Processed   Bytes Billed   Billing Tier                           Labels
 ---------- --------- ----------------- ---------------- ------------------------- ----------------- -------------- -------------- --------------------------------------------------------
  query      SUCCESS   11 Aug 07:37:17   0:00:00.186000   doug.beatty@dbtlabs.com   0                 0              0              dbt_invocation_id:a2723e58-86f8-421d-9992-bd0e7496bc68
                                                                                                                                    query_comment:___bq_label_macro_name_____
```

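Comparing `pip freeze` outputs by eye is error-prone, so a small helper can list only the packages whose pins differ between two environments. This is a rough sketch, not part of dbt tooling: `parse_freeze` and `diff_freezes` are hypothetical names, only plain `name==version` lines are handled, and the sample strings are short excerpts of the 1.5 and 1.6 lists above.

```python
from typing import Dict, Optional, Tuple


def parse_freeze(text: str) -> Dict[str, str]:
    """Map lowercase package name -> pinned version for each 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:  # skip blank lines and anything that is not a plain pin
            pins[name.lower()] = version
    return pins


def diff_freezes(old: str, new: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (old_version, new_version)} for pins that differ.

    None means the package is absent from that environment.
    """
    a, b = parse_freeze(old), parse_freeze(new)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# Excerpts from the 1.5 and 1.6 freeze outputs in this comment.
env_15 = "dbt-core==1.5.4\nJinja2==3.1.2\nprotobuf==4.24.0rc3\nrpds-py==0.9.2"
env_16 = "dbt-core==1.6.0\nJinja2==3.1.2\nprotobuf==4.23.3\npydantic==1.10.9"

for name, (old, new) in diff_freezes(env_15, env_16).items():
    print(f"{name}: {old} -> {new}")
```

Running it on the full `requirements_bigquery_*.txt` files would narrow the search to the dependencies that actually changed between environments.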
mikealfare commented 1 year ago

That's interesting, I wonder why we're getting pre-releases in our final releases.

I'll look at the 1.5.3 bundle and compare, since that presumably worked at the time. The fact that it worked earlier on your venv and now it doesn't suggests to me that it's dependency versions shifting underneath us.

mikealfare commented 1 year ago

Here's what we have for the prerelease bundle: https://github.com/dbt-labs/dbt-core-bundles/blob/main/release_creation/bundle/requirements/v1.5.pre.requirements.txt

And for the initial final release bundle: https://github.com/dbt-labs/dbt-core-bundles/blob/29c6e2ee14ddafe846d0c3f4eec2122da39ee266/release_creation/snapshot/requirements/v1.5.latest.requirements.txt

And for the homebrew formula for 1.5.1 (we released later minors yesterday, so they wouldn't be valid snapshots): https://github.com/dbt-labs/homebrew-dbt/blob/main/Formula/dbt-bigquery%401.5.1.rb

dbeatty10 commented 1 year ago

I'm not currently creating my local virtual environments using bundles, but maybe I should switch over?

Instead, I've been using commands like this so that I can install and try out betas for dbt-bigquery 1.7 without needing to change my instructions once it is a final release. But I'm guessing that the --pre is allowing prereleases of dependencies to be installed as well?

python3 -m venv bigquery_1.5
source bigquery_1.5/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade --pre dbt-bigquery~=1.5.0.dev0 dbt-core~=1.5.0.dev0
source bigquery_1.5/bin/activate
dbt --version
deactivate
mikealfare commented 1 year ago

Oh, yes, the --pre is why we're getting pre-releases. Ok, I'm less concerned about that now. I know we're moving towards bundles in general for end use because it's a repeatable installation. Having another person try it out early would be useful for feedback. Also, we do generate bundles for prereleases, but I think they need to be at the RC stage. So if you're looking to try out 1.7.0b1 or 1.7.0a1, you probably still need to do what you're doing.
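
To see which pins `--pre` pulled in as pre-releases, the freeze output can be scanned for pre-release markers. The sketch below is an assumption-laden shortcut: `prerelease_pins` is a hypothetical helper, and the regular expression is a simplified check for the normalized `aN`, `bN`, `rcN`, and `.devN` segments rather than a full PEP 440 parser. The sample lines are taken from the 1.5 environment listed earlier in the thread.

```python
import re
from typing import List, Tuple

# Simplified assumption: normalized pre-release and dev versions contain
# a, b, rc, or .dev followed by digits (for example 1.57.0rc1, 2.12.0.dev0).
PRERELEASE_MARKER = re.compile(r"(?:a|b|rc|\.dev)\d+")


def prerelease_pins(freeze_text: str) -> List[Tuple[str, str]]:
    """Return (name, version) for every 'name==version' pin that looks like a pre-release."""
    found = []
    for line in freeze_text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep and PRERELEASE_MARKER.search(version):
            found.append((name, version))
    return found


# Excerpt from the dbt-bigquery 1.5 environment shown earlier in the thread.
sample = """\
google-api-core==2.12.0.dev0
google-auth==2.22.0
grpcio==1.57.0rc1
protobuf==4.24.0rc3
"""

for name, version in prerelease_pins(sample):
    print(f"{name} is pinned to pre-release {version}")
```

An empty result on a given environment would suggest its dependencies all resolved to final releases.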

RGreen90 commented 1 year ago

Are there any updates on the above issue or the corresponding PR, please?

RGreen90 commented 1 year ago

@mikealfare @dbeatty10 I have worked on this in a local environment and have been able to get it working again. I believe these two lines are the culprit; reverting them to their state before the change mentioned above got the job labels working again.

Firstly, this change talks about adding a limit to the execute signature to align with dbt-core, which in itself is absolutely fine. But I don't think changing these two lines was necessary to achieve that, and I think it was an oversight, given the reasoning below.

The implementation of query_header.comment.query_comment is different to profile.query_comment in that the query_header version implements MacroQueryStringSetter which seems to be performing some security checks around SQL injection within a query comment after obtaining it from the profile.query_comment rather than just pulling directly. It seems odd to have removed these checks?

I feel a PR to change the two lines in question back to their original state will fix things. What do people think?

mikealfare commented 1 year ago

@RGreen90 These are the same two lines that @aranke had also pointed out above. I reverted those two lines, but that didn't work due to methods and/or properties not existing. In other words, it seems like something else changed related to items in those two lines, and those two lines cannot be reverted in isolation. Perhaps I'm misunderstanding something. Feel free to throw up the PR and I'll review it.

RGreen90 commented 1 year ago

@mikealfare

Yes, I realised this when I checked over the dev init and it failed the mypy checks. It's weird: I was initially testing this in a Poetry-managed env using poetry run pip install -e /path, and things were working with just those two lines changed. Will dig a bit further!

affsantos commented 1 year ago

Are there any plans to fix this?

dataders commented 1 year ago

hi @affsantos thanks for the bump. @mikealfare was able to create a test case that demonstrates the regression, but we haven't had time yet to iron it out.

our current plan is to dedicate capacity for fixing it in a current sprint. in the meantime, you're welcome to branch off of #872 to get the test passing. maybe @github-christophe-oudar and @RGreen90 have some ideas?

RGreen90 commented 1 year ago

@dataders I feel like this regression has possibly been introduced somewhere in core. As previously mentioned in the thread, these two lines previously took their data from a query header object, but now take it from the profile object.

So whatever the _labels_from_query_comment function is trying to do is no longer receiving correctly formatted data?

But just a thought...

github-christophe-oudar commented 1 year ago

I ran into that bug a few weeks ago, and I'm actually thinking about pushing information into the query-comment with a macro and parsing the output instead of using labels for now. It's not convenient, but it's a workaround.

Regarding actually fixing that issue, from what I saw, if you're using a Jinja macro in the query-comment, it won't be evaluated, resulting in weird values ('{{ bq_labels() }}' becomes query_comment:___bq_labels_____ instead of a list of labels).
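
The order of operations matters here: the macro has to be rendered before the result is parsed as JSON. A toy illustration follows. It does not use dbt or Jinja; `bq_labels`, `render`, and `parses_as_labels` are hypothetical stand-ins (the first mirrors the macro from the issue in plain Python), and the regex-based renderer only handles the zero-argument `{{ name() }}` form.

```python
import json
import re
from typing import Callable, Dict, Mapping


def bq_labels(env: Mapping[str, str]) -> str:
    """Python stand-in for the bq_labels macro: env vars with the same defaults."""
    return json.dumps({
        "system": env.get("LABEL_SYSTEM", "airflow"),
        "owner": env.get("LABEL_OWNER", "unknown"),
        "envtype": env.get("LABEL_ENV", "dev"),
        "dag_id": env.get("LABEL_DAG_ID", "unknown"),
    })


# Matches a zero-argument call such as "{{ bq_labels() }}".
MACRO_CALL = re.compile(r"\{\{\s*(\w+)\(\)\s*\}\}")


def render(template: str, macros: Dict[str, Callable[[], str]]) -> str:
    """Toy renderer: replace each '{{ name() }}' with that macro's output."""
    return MACRO_CALL.sub(lambda m: macros[m.group(1)](), template)


def parses_as_labels(comment: str) -> bool:
    """True if the comment is a JSON object, i.e. usable as a set of labels."""
    try:
        return isinstance(json.loads(comment), dict)
    except ValueError:
        return False


comment = "{{ bq_labels() }}"
macros = {"bq_labels": lambda: bq_labels({"LABEL_OWNER": "monideep"})}
rendered = render(comment, macros)

print(parses_as_labels(comment))   # unrendered template text is not JSON
print(parses_as_labels(rendered))  # rendered macro output is a JSON object
```

If the labeling code receives `comment` rather than `rendered`, the JSON parse fails and only the raw template text is available, which matches the symptom described in this thread.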

I guess there was some code that was doing macro evaluation in the string and then splitting the values... but I don't know where it is (was?). I didn't really look into it as I'm busy planning "duck" stuff 🦆

kodaho commented 1 year ago

Hi, based on the previous investigations on this issue and by exploring the code of dbt-core and dbt-bigquery, I might have found a fix with #955 (I basically reverted the two lines previously identified and added checks to make sure the value is not null).

@mikealfare Given that I copy-pasted your test highlighting the regression with a minor modification (the labels should be a valid JSON so I removed the trailing comma), I included you in the Changie entry.
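
The trailing-comma point is easy to verify with Python's standard `json` module, which rejects a comma before the closing brace. The snippet below is a small illustration with made-up label strings; `is_valid_json` is a hypothetical helper, not code from the PR.

```python
import json

# Illustrative label payloads; only the trailing comma differs.
with_trailing_comma = '{"system": "airflow", "owner": "unknown",}'
without_trailing_comma = '{"system": "airflow", "owner": "unknown"}'


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(with_trailing_comma))
print(is_valid_json(without_trailing_comma))
```

A comment that fails this check cannot be split into individual labels, so a test fixture with a trailing comma would exercise the fallback path rather than the labeling path.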

Let me know if you think this solution isn't valid.