dbt-labs / dbt-project-evaluator

This package contains macros and models to find DAG issues automatically
https://dbt-labs.github.io/dbt-project-evaluator/latest/
Apache License 2.0
408 stars 59 forks source link

Multiline column descriptions result in unclosed string literals #421

Closed katieclaiborne closed 2 months ago

katieclaiborne commented 5 months ago

Describe the bug

In BigQuery, the new base_<node/source>_columns models fail with a syntax error when a multiline column description is present.

Steps to reproduce

Take a single line description, and convert to a multiline description using block notation.

description: >
  My column description

Expected results

I expected dbt project evaluator to run without issue.

Actual results

Instead, the new base_<node/source>_columns models failed with "Unclosed string literal" syntax errors.

Screenshots and log output

Inspecting the failed BigQuery job shows that it is interpreting the multiline string as unclosed individual strings.

image

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_project_evaluator
    version: [">=0.9.0", "<0.10.0"]

Which database are you using dbt with?

The output of dbt --version: I don't have ready access to the CLI at the moment, but these are the versions listed at the start of a Cloud IDE invocation.

dbt=1.6.9, dbt-bigquery=1.6.9

Additional context

I think I've traced the issue to the active line in the wrap_string_with_quotes macro.

It seems like you all are trying to move away from the macro in #419, but I don't know whether dbt.string_literal has a triple-quoting option to support multiline BigQuery strings.

Are you interested in contributing the fix?

Yes! I've played around in #420, and have confirmed that it resolves the issue in our own project.

b-per commented 5 months ago

Thanks for raising it!

We will fix it in a patch in the next day or so. Ideally, it also requires updates to dbt-bigquery which I did a PR for https://github.com/dbt-labs/dbt-bigquery/pull/1089 , but we can include the new bigquery__string_literal in this package for now.

katieclaiborne commented 5 months ago

Sweet! Thanks, Benoit.

kverburg commented 3 months ago

Hey all, any update on this? Any ETA?

b-per commented 3 months ago

Thanks for the follow-up. It slipped my mind.

We have a fix in #422 but it wasn't merged yet as we had some comments about it. If you need the fix ASAP you can install from the branch wrap-strings-quote-update until we get it merged and released.

waligob commented 2 months ago

@b-per FWIW, we upgraded to 0.11.0 this afternoon, and we're still encountering the Syntax error: Unclosed string literal errors in base_node_columns and base_source_columns. Essentially, we're observing the same behavior noted in https://github.com/dbt-labs/dbt-project-evaluator/issues/448.

b-per commented 2 months ago

Hmmm, that's odd. I can't seem to be able to reproduce it...

What do you get if you do a:

dbt show --inline 'select {{ dbt_project_evaluator.wrap_string_with_quotes("ab\nc")}} '

Does it fail as well?

And are there some logs in BQ that you'd be able to share?

waligob commented 2 months ago

Does it fail as well?

It does fail:


$ dbt show --inline 'select {{ dbt_project_evaluator.wrap_string_with_quotes("ab\nc")}} '
19:49:03  Running with dbt=1.7.10
19:49:04  Registered adapter: bigquery=1.7.7
19:49:06  Found 478 models, 5 snapshots, 18 seeds, 294 tests, 179 sources, 1 exposure, 39 metrics, 1738 macros, 0 groups, 7 semantic models
19:49:06  
19:49:07  Concurrency: 4 threads (target='dev')
19:49:07  
19:49:09  BigQuery adapter: https://console.cloud.google.com/bigquery?project=inq-warehouse&j=bq:US:c1f8323b-a1e2-4bf8-9b6f-6d9238bbace3&page=queryresults
19:49:09  Encountered an error:
Runtime Error
  Database Error in sql_operation inline_query (from remote system.sql)
    Syntax error: Unclosed string literal at [8:5]

And here's the corresponding job in BigQuery:

image
b-per commented 2 months ago

Thanks. There is a problem with this macro then, because it works for me locally.

waligob commented 2 months ago

can you do a dbt clean and dbt deps to check if the dbt show command would work?

here's the full output of the three commands:

$ dbt clean && dbt deps && dbt show --inline 'select {{ dbt_project_evaluator.wrap_string_with_quotes("ab\nc")}} '
20:49:07  Running with dbt=1.7.13
20:49:08  Checking /workspaces/inquirer-dbt/logs/*
20:49:08  Cleaned /workspaces/inquirer-dbt/logs/*
20:49:08  Checking /workspaces/inquirer-dbt/dbt_packages/*
20:49:08  Cleaned /workspaces/inquirer-dbt/dbt_packages/*
20:49:08  Checking /workspaces/inquirer-dbt/target-base/*
20:49:08  Cleaned /workspaces/inquirer-dbt/target-base/*
20:49:08  Checking /workspaces/inquirer-dbt/target/*
20:49:08  Cleaned /workspaces/inquirer-dbt/target/*
20:49:08  Finished cleaning all paths.
20:49:10  Running with dbt=1.7.13
20:49:11  Installing dbt-labs/dbt_utils
20:49:11  Installed from version 1.1.1
20:49:11  Up to date!
20:49:11  Installing calogica/dbt_expectations
20:49:11  Installed from version 0.10.3
20:49:11  Up to date!
20:49:11  Installing dbt-labs/codegen
20:49:11  Installed from version 0.12.1
20:49:11  Up to date!
20:49:11  Installing data-mie/dbt_profiler
20:49:11  Installed from version 0.8.1
20:49:11  Up to date!
20:49:11  Installing dbt-labs/dbt_project_evaluator
20:49:11  Installed from version 0.11.0
20:49:11  Up to date!
20:49:11  Installing brooklyn-data/dbt_artifacts
20:49:11  Installed from version 2.6.2
20:49:11  Up to date!
20:49:11  Installing elementary-data/elementary
20:49:12  Installed from version 0.14.1
20:49:12  Up to date!
20:49:12  Installing dbt-labs/audit_helper
20:49:12  Installed from version 0.11.0
20:49:12  Up to date!
20:49:12  Installing grafana/leaner_query
20:49:12  Installed from version 0.1.12
20:49:12  Up to date!
20:49:12  Installing Velir/ga4
20:49:12  Installed from version 6.0.1
20:49:12  Up to date!
20:49:12  Installing calogica/dbt_date
20:49:12  Installed from version 0.10.1
20:49:12  Up to date!
20:49:15  Running with dbt=1.7.13
20:49:16  Registered adapter: bigquery=1.7.7
20:49:16  Unable to do partial parsing because saved manifest not found. Starting full parse.
20:49:37  [WARNING]: Did not find matching node for patch with name 'dim_query_statement' in the 'models' section of file 'models/marts/bigquery_audit_log.yml'
20:49:37  [WARNING]: Did not find matching node for patch with name 'int_dbt_execution_info' in the 'models' section of file 'models/intermediate/int_bigquery_audit_log.yml'
20:49:37  [WARNING]: Did not find matching node for patch with name 'int_importance_calculation' in the 'models' section of file 'models/intermediate/int_bigquery_audit_log.yml'
20:49:38  [WARNING]: Test 'test.leaner_query.not_null_dim_query_statement_job_key.dde1be6c77' (models/marts/bigquery_audit_log.yml) depends on a node named 'dim_query_statement' in package '' which was not found
20:49:40  Found 478 models, 5 snapshots, 294 tests, 18 seeds, 179 sources, 1 exposure, 39 metrics, 1738 macros, 0 groups, 7 semantic models
20:49:40  
20:49:40  Concurrency: 4 threads (target='dev')
20:49:40  
20:49:41  BigQuery adapter: https://console.cloud.google.com/bigquery?project=inq-warehouse&j=bq:US:f0f374a1-2a89-4feb-983a-fbe9facb879a&page=queryresults
20:49:41  Encountered an error:
Runtime Error
  Database Error in sql_operation inline_query (from remote system.sql)
    Syntax error: Unclosed string literal at [8:5]

if you go to dbt_packages/dbt_project_evaluator/macros/wrap_string_with_quotes.sql , what do you see? Is the macro bigquery__string_literal there?

It is. Here's the full contents of the file dbt_packages/dbt_project_evaluator/macros/wrap_string_with_quotes.sql:

{% macro wrap_string_with_quotes(str) %}
  {% if str is none %}
    {{ return('cast(NULL as ' ~ dbt.type_string() ~ ')') }}
  {% else %}
    {{ dbt.string_literal(str) }}
  {% endif %}
{% endmacro %}

{#
  To be removed when https://github.com/dbt-labs/dbt-bigquery/pull/1089 is merged
#}
{% macro bigquery__string_literal(value) -%}
  '''{{ value }}'''
{%- endmacro %}

Did you overwrite the macro bigquery__string_literal(value) with you personal version in your dbt project?

We did not.

b-per commented 2 months ago

We will get to the bottom of it 😄

What I am thinking now then is that one of the other packages you have is definining bigquery__string_literal() themselves.

dispatch:
  - macro_namespace: dbt
    search_order: ['dbt_project_evaluator', 'dbt']
waligob commented 2 months ago

Ok, some progress here! Specifically, the change to the dispatch config resolved the issue. Note that I addressed your first two points before editing the dispatch config:

once all the packages are installed, can you search for bigquery__string_literal to see which package defines it?

I'm only seeing it defined in ./dbt_packages/dbt_project_evaluator/macros/:

$ dbt clean && dbt deps && grep -iR "bigquery__string_literal" ./dbt_packages
16:41:25  Running with dbt=1.7.13
16:41:25  Checking /workspaces/inquirer-dbt/target/*
16:41:25  Cleaned /workspaces/inquirer-dbt/target/*
16:41:25  Checking /workspaces/inquirer-dbt/dbt_packages/*
16:41:26  Cleaned /workspaces/inquirer-dbt/dbt_packages/*
16:41:26  Checking /workspaces/inquirer-dbt/logs/*
16:41:26  Cleaned /workspaces/inquirer-dbt/logs/*
16:41:26  Checking /workspaces/inquirer-dbt/target-base/*
16:41:26  Cleaned /workspaces/inquirer-dbt/target-base/*
16:41:26  Finished cleaning all paths.
16:41:28  Running with dbt=1.7.13
16:41:29  Installing dbt-labs/dbt_utils
16:41:29  Installed from version 1.1.1
16:41:29  Up to date!
16:41:29  Installing calogica/dbt_expectations
16:41:29  Installed from version 0.10.3
16:41:29  Up to date!
16:41:29  Installing dbt-labs/codegen
16:41:29  Installed from version 0.12.1
16:41:29  Up to date!
16:41:29  Installing data-mie/dbt_profiler
16:41:29  Installed from version 0.8.1
16:41:29  Up to date!
16:41:29  Installing dbt-labs/dbt_project_evaluator
16:41:29  Installed from version 0.11.0
16:41:29  Up to date!
16:41:29  Installing brooklyn-data/dbt_artifacts
16:41:29  Installed from version 2.6.2
16:41:29  Up to date!
16:41:29  Installing elementary-data/elementary
16:41:29  Installed from version 0.14.1
16:41:29  Up to date!
16:41:29  Installing dbt-labs/audit_helper
16:41:29  Installed from version 0.11.0
16:41:29  Up to date!
16:41:29  Installing grafana/leaner_query
16:41:30  Installed from version 0.1.12
16:41:30  Up to date!
16:41:30  Installing Velir/ga4
16:41:30  Installed from version 6.0.1
16:41:30  Up to date!
16:41:30  Installing calogica/dbt_date
16:41:30  Installed from version 0.10.1
16:41:30  Up to date!
./dbt_packages/dbt_project_evaluator/macros/wrap_string_with_quotes.sql:{% macro bigquery__string_literal(value) -%}

if you open manifest.json and search for the bigquery__string_literal macro entry, what package is listed?

After running dbt clean && dbt deps && dbt compile, this object contains the only occurrences of bigquery__string_literal in manifest.json

"macro.dbt_project_evaluator.bigquery__string_literal": {
  "name": "bigquery__string_literal",
  "resource_type": "macro",
  "package_name": "dbt_project_evaluator",
  "path": "macros/wrap_string_with_quotes.sql",
  "original_file_path": "macros/wrap_string_with_quotes.sql",
  "unique_id": "macro.dbt_project_evaluator.bigquery__string_literal",
  "macro_sql": "{% macro bigquery__string_literal(value) -%}\n  '''{{ value }}'''\n{%- endmacro %}",
  "depends_on": { "macros": [] },
  "description": "",
  "meta": {},
  "docs": { "show": true, "node_color": null },
  "patch_path": null,
  "arguments": [],
  "created_at": 1714495463.5902495,
  "supported_languages": null
},

could you also try adding the following to your dbt_project.yml?

That resolved the issue!

$ dbt clean && dbt deps && dbt show --inline 'select {{ dbt_project_evaluator.wrap_string_with_quotes("ab\nc")}} '
16:53:21  Running with dbt=1.7.13
16:53:21  Checking /workspaces/inquirer-dbt/target-base/*
16:53:21  Cleaned /workspaces/inquirer-dbt/target-base/*
16:53:21  Checking /workspaces/inquirer-dbt/target/*
16:53:21  Cleaned /workspaces/inquirer-dbt/target/*
16:53:21  Checking /workspaces/inquirer-dbt/dbt_packages/*
16:53:21  Cleaned /workspaces/inquirer-dbt/dbt_packages/*
16:53:21  Checking /workspaces/inquirer-dbt/logs/*
16:53:21  Cleaned /workspaces/inquirer-dbt/logs/*
16:53:21  Finished cleaning all paths.
16:53:24  Running with dbt=1.7.13
16:53:25  Installing dbt-labs/dbt_utils
16:53:25  Installed from version 1.1.1
16:53:25  Up to date!
16:53:25  Installing calogica/dbt_expectations
16:53:25  Installed from version 0.10.3
16:53:25  Up to date!
16:53:25  Installing dbt-labs/codegen
16:53:25  Installed from version 0.12.1
16:53:25  Up to date!
16:53:25  Installing data-mie/dbt_profiler
16:53:25  Installed from version 0.8.1
16:53:25  Up to date!
16:53:25  Installing dbt-labs/dbt_project_evaluator
16:53:25  Installed from version 0.11.0
16:53:25  Up to date!
16:53:25  Installing brooklyn-data/dbt_artifacts
16:53:25  Installed from version 2.6.2
16:53:25  Up to date!
16:53:25  Installing elementary-data/elementary
16:53:25  Installed from version 0.14.1
16:53:25  Up to date!
16:53:25  Installing dbt-labs/audit_helper
16:53:25  Installed from version 0.11.0
16:53:25  Up to date!
16:53:25  Installing grafana/leaner_query
16:53:25  Installed from version 0.1.12
16:53:25  Up to date!
16:53:25  Installing Velir/ga4
16:53:25  Installed from version 6.0.1
16:53:25  Up to date!
16:53:25  Installing calogica/dbt_date
16:53:26  Installed from version 0.10.1
16:53:26  Up to date!
16:53:28  Running with dbt=1.7.13
16:53:30  Registered adapter: bigquery=1.7.7
16:53:30  Unable to do partial parsing because saved manifest not found. Starting full parse.
16:53:50  [WARNING]: Did not find matching node for patch with name 'dim_query_statement' in the 'models' section of file 'models/marts/bigquery_audit_log.yml'
16:53:50  [WARNING]: Did not find matching node for patch with name 'int_dbt_execution_info' in the 'models' section of file 'models/intermediate/int_bigquery_audit_log.yml'
16:53:50  [WARNING]: Did not find matching node for patch with name 'int_importance_calculation' in the 'models' section of file 'models/intermediate/int_bigquery_audit_log.yml'
16:53:50  [WARNING]: Test 'test.leaner_query.not_null_dim_query_statement_job_key.dde1be6c77' (models/marts/bigquery_audit_log.yml) depends on a node named 'dim_query_statement' in package '' which was not found
16:53:52  Found 478 models, 5 snapshots, 294 tests, 18 seeds, 179 sources, 1 exposure, 39 metrics, 1738 macros, 0 groups, 7 semantic models
16:53:53  
16:53:53  Concurrency: 4 threads (target='dev')
16:53:53  
16:53:54  Previewing inline node:
| f0_  |
| ---- |
| ab
c |
b-per commented 2 months ago

I am still puzzled but will take the dispatch: as a win for now!

The bigquery__string_literal macro is getting added to dbt-bigquery for dbt >= 1.8 as well, so maybe the extra config won't be required forever.

waligob commented 2 months ago

Thanks for all the assistance, @b-per. FWIW, there's one additional instance of bigquery__string_literal in my manifest.json after making the change to dispatch (both nested under the macros key):

"macro.dbt.string_literal": {
  "name": "string_literal",
  "resource_type": "macro",
  "package_name": "dbt",
  "path": "macros/utils/literal.sql",
  "original_file_path": "macros/utils/literal.sql",
  "unique_id": "macro.dbt.string_literal",
  "macro_sql": "{%- macro string_literal(value) -%}\n  {{ return(adapter.dispatch('string_literal', 'dbt') (value)) }}\n{%- endmacro -%}\n\n",
  "depends_on": {
    "macros": ["macro.dbt_project_evaluator.bigquery__string_literal"]
  },
  "description": "",
  "meta": {},
  "docs": { "show": true, "node_color": null },
  "patch_path": null,
  "arguments": [],
  "created_at": 1714496011.2609057,
  "supported_languages": null
},
"macro.dbt_project_evaluator.bigquery__string_literal": {
  "name": "bigquery__string_literal",
  "resource_type": "macro",
  "package_name": "dbt_project_evaluator",
  "path": "macros/wrap_string_with_quotes.sql",
  "original_file_path": "macros/wrap_string_with_quotes.sql",
  "unique_id": "macro.dbt_project_evaluator.bigquery__string_literal",
  "macro_sql": "{% macro bigquery__string_literal(value) -%}\n  '''{{ value }}'''\n{%- endmacro %}",
  "depends_on": { "macros": [] },
  "description": "",
  "meta": {},
  "docs": { "show": true, "node_color": null },
  "patch_path": null,
  "arguments": [],
  "created_at": 1714496013.0822887,
  "supported_languages": null
},