Closed ddovbii closed 1 year ago
Thanks for reporting this @ddovbii !
I was able to reproduce the regression you reported locally using the following commands (assuming zsh
on macOS and that the bq
CLI is installed):
I didn't confirm or deny, but the changes in #707 related to query_comment
might be involved.
I tend to agree with @dbeatty10, specifically it looks like the changes in these lines might be responsible:
@aranke Are you suggesting that reverting just those two lines would resolve this? Or is there more to it?
Reverting those two lines did not resolve the issue. It looks like we're getting the unrendered query comment on profile
. I'll keep digging.
I've spent a decent amount of time troubleshooting this and could not find the root cause. I was able to write a test that demonstrates this regression in the attached PR. However, when I run that test against 1.5.3, the functionality it calls doesn't work either; it actually throws an exception from attempting to run the underlying code. My guess is that something changed in one of the dependencies (google-cloud-* or dbt-core
) and we're picking up a newer patch version that introduces a case that was never tested.
@dbeatty10 When you tested this on 1.5, did you use a fresh install, or did you reuse an environment that you already had? If it's the former, that would rule out my hypothesis. If it's the latter, then I'm really not sure how this was ever working on 1.5.3, despite knowing that both you and the OP got it to run successfully. In the meantime, could you run a pip list
on your 1.5 and 1.6 environments and throw them up here? That would help us rule out packages.
@Fleid I think we need to slot this as sprint work as it would involve testing different combinations of dependencies to reproduce. Let me know what you think.
@mikealfare fair, and thanks a lot for digging into it. Let's see what @dbeatty10 has to say about this ;)
@mikealfare It's no longer working in 1.4.x, 1.5.x, or 1.6.x for me. I don't recall if I updated my virtual environments related to 1.5.x since my reprex or not 🤷 See below for pip list
and pip freeze
and my most recent output across those three environments.
That's interesting, I wonder why we're getting pre-releases in our final releases:
I'll look at the 1.5.3 bundle and compare, since that presumably worked at the time. The fact that it worked earlier on your venv and now it doesn't suggests to me that it's dependency versions shifting underneath us.
Here's what we have for the prerelease bundle: https://github.com/dbt-labs/dbt-core-bundles/blob/main/release_creation/bundle/requirements/v1.5.pre.requirements.txt
And for the initial final release bundle: https://github.com/dbt-labs/dbt-core-bundles/blob/29c6e2ee14ddafe846d0c3f4eec2122da39ee266/release_creation/snapshot/requirements/v1.5.latest.requirements.txt
And for the homebrew formula for 1.5.1 (we released later minors yesterday, so they wouldn't be valid snapshots): https://github.com/dbt-labs/homebrew-dbt/blob/main/Formula/dbt-bigquery%401.5.1.rb
I'm not currently creating my local virtual environments using bundles, but maybe I should switch over?
Instead, I've been using commands like this so that I can install and try out betas for dbt-bigquery 1.7 without needing to change my instructions once it is a final release. But I'm guessing that the --pre
is allowing prereleases of dependencies to be installed as well?
python3 -m venv bigquery_1.5
source bigquery_1.5/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade --pre dbt-bigquery~=1.5.0.dev0 dbt-core~=1.5.0.dev0
source bigquery_1.5/bin/activate
dbt --version
deactivate
Oh, yes, the --pre is why we're getting pre-releases. Ok, I'm less concerned about that now. I know we're moving towards bundles in general for end use because it's a repeatable installation. Having another person try it out early would be useful for feedback. Also, we do generate bundles for prereleases, but I think they need to be at the RC stage. So if you're looking to try out 1.7.0b1 or 1.7.0a1, you probably still need to do what you're doing.
Is there any updates on the above issue or corresponding PR please?
@mikealfare @dbeatty10 I have worked on this in a local environment and have been able to get it working again. I believe that these two lines are the culprit and reverted them back to before the change mentioned to get the job-labels working again.
Firstly, this change talks about adding a limit to the execute signature to align with dbt-core
; which in itself is absolutely fine. But I don't think changing these two lines were necessary to achieve that and an oversight given the below reasoning.
The implementation of query_header.comment.query_comment
is different to profile.query_comment
in that the query_header
version implements MacroQueryStringSetter
which seems to be performing some security checks around SQL injection within a query comment after obtaining it from the profile.query_comment
rather than just pulling directly. It seems odd to have removed these checks?
I feel I PR to change the two lines in question back to their original state will fix things. What do people think?
@RGreen90 These are the same two lines that @aranke had also pointed out above. I reverted those two lines, but that didn't work due to methods and/or properties not existing. In other words, it seems like something else changed related to items in those two lines, and those two lines cannot be reverted in isolation. Perhaps I'm misunderstanding something. Feel free to throw up the PR and I'll review it.
@mikealfare
Yes I realised this when I checked over the dev init and it failed the mypy checks. It's weird, so I was initially testing this in a poetry managed env using the poetry run pip install -e /path
and things were working with just changing those two lines. Will dig a bit further!
Is there any plans to fix this?
hi @affsantos thanks for the bump. @mikealfare was able to create a test case that demonstrates the regression, but we haven't had time yet to iron it out.
our current plan is to dedicate capacity for fixing it in a current sprint. in the meantime, you're welcome to branch off of #872 to get the test passing. maybe @github-christophe-oudar and @RGreen90 have some ideas?
@dataders I feel like this regression has been possibly been introduced somewhere in core. As previously mentioned in the thread, these two lines, previously inherited from a query header object, but now take their data from the profile object.
So whatever the _labels_from_query_comment function is trying to do is no longer receiving correctly formatted data?
But just a thought...
I ran into that bug few weeks ago and I'm actually thinking about pushing information in the query-comment with a macro and parsing the output instead of labels so far. It's no convenient but it's a workaround.
Regarding actually fixing that issue, from what I saw, if you're using a jinja macro in the query-comment, it won't be evaluated resulting in the weird values ( '{{ bq_labels() }}'
becoming query_comment:___bq_labels_____
instead of a list).
I guess there was some code that was doing macro evaluation in the string and then splitting that values... but I don't know where it is (was?). I didn't really looked into it as I'm busy planning "duck" stuff 🦆
Hi,
Based on the previous investigations on this issue and by exploring the code of dbt-core
and dbt-bigquery
, I might have found a fix with #955 (I basically reverted the 2 lines previously identified and added checks to make sure the value is not null).
@mikealfare Given that I copy-pasted your test highlighting the regression with a minor modification (the labels should be a valid JSON so I removed the trailing comma), I included you in the Changie entry.
Let me know if you think this solution isn't valid.
Is this a new bug in dbt-bigquery?
Current Behavior
Hello!
I have a macro that reads a list of environment variables and builds a dictionary that later is being caught by query-comment.job-label feature.
macros/bq_labels.sql:
dbt_project.yml:
It works perfectly when using dbt-bigquery==1.5.3. When I run the bq show command for my job, I can see the following output in the "Labels" column:
But when using 1.6.0, it looks like it can't "parse" the macro. The output is the following:
Expected Behavior
The labels are parsed correctly, like in 1.5.3
Steps To Reproduce
Relevant log output
No response
Environment
Additional Context
No response