databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0
211 stars 112 forks source link

Error associated with `persist_docs` and schema definitions inconsistent with actual built models - present in dbt v1.6 but NOT v1.7 and up #763

Open wusanny opened 1 month ago

wusanny commented 1 month ago

Describe the bug

An error message that used to show up in dbt v1.6 (when certain conditions are met), no longer shows up in dbt v1.7 and above.

In dbt v1.6, when persist_docs is enabled, models materialized as table and there are columns defined in the schema.yml file that are not present or have a different column name in the actual models built - the run will fail and an error message will show.

# example of the error message - column 'id_my' is not in the actual built model. column 'id' is

06:23:10 Databricks adapter: diagnostic-info: org.apache.hive.service.cli.HiveSQLException: Error running query: [_LEGACY_ERROR_TEMP_1331] org.apache.spark.sql.AnalysisException: Missing field id_my in table sanny_dev.dbt_sprawira.model_two with schema:
root
 |-- id: integer (nullable = true)

The above error message is no longer present in dbt v1.7 and above.

Steps To Reproduce

  1. Start with dbt v1.6

  2. Add a model file:

    
    -- models/foo.sql

select 1 as id

3. Define the model in schema.yml - _with a different column than what is in the actual model_:

```yml
# models/schema.yml

models:
    - name: foo
      description: this is foo model

      columns:
          - name: wrong_id
            description: this is column id
  1. Add persist_docs in dbt_project.yml and table materialization:
-- myproject/dbt_project.yml

models:
  +persist_docs:
    relation: true
    columns: true
  myproject:
    +materialized: table
  1. Run dbt build -s foo
    23:21:25 Running with dbt=1.6.17
    23:21:25 Registered adapter: databricks=1.6.9
    23:21:25 Found 1 model, 0 sources, 0 exposures, 0 metrics, 476 macros, 0 groups, 0 semantic models
    23:21:25 
    23:21:27 Concurrency: 4 threads (target='default')
    23:21:27 
    23:21:56 
    23:21:56 Finished running 1 table model in 0 hours 0 minutes and 31.16 seconds (31.16s).
    23:21:56 
    23:21:56 Completed with 1 error and 0 warnings:
    23:21:56 
    23:21:56   Runtime Error in model foo (models/foo.sql)
    Missing field wrong_id in table sanny_dev.dbt_sprawira.foo with schema:
    root
    |-- id: integer (nullable = true)
    .; line 4 pos 8
    23:21:56 
    23:21:56 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
  2. Repeat step 1-5, but with dbt v1.7 or above - no changes made, yet no error shows and the run is successful
    23:26:54 Running with dbt=1.7.17
    23:26:54 Registered adapter: databricks=1.7.17
    23:26:54 Found 1 model, 0 sources, 0 exposures, 0 metrics, 538 macros, 0 groups, 0 semantic models
    23:26:55 
    23:26:58 Concurrency: 4 threads (target='default')
    23:26:58 
    23:27:04 
    23:27:04 Finished running 1 table model in 0 hours 0 minutes and 9.60 seconds (9.60s).
    23:27:04 
    23:27:04 Completed successfully
    23:27:04 
    23:27:04 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Expected behavior

Users expect to see the same error message being prompted in dbt version higher than 1.6 so as to made them aware that the columns defined in their schema.yml is inconsistent with what is present in their actual built models.

Screenshots and log output

The one WITH error:

23:41:52 Running with dbt=1.6.17
23:41:52 running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'profiles_dir': '/usr/src/dbt-server-shared/.dbt', 'debug': 'False', 'warn_error': 'None', 'log_path': '/usr/src/dbt-server-shared/working_dir/c198cafd-8524-4709-8ebf-e7a00b198bfd', 'fail_fast': 'False', 'version_check': 'True', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'invocation_command': 'dbt -A dbt_worker.app worker --loglevel=DEBUG --concurrency=2 --max-memory-per-child=500000', 'log_format': 'json', 'static_parser': 'True', 'target_path': 'None', 'introspect': 'True', 'send_anonymous_usage_stats': 'True'}
23:41:52 Registered adapter: databricks=1.6.9
23:41:52 checksum: 1bf9222bf0b23116924437e635bca60f1582e16e1f36aa0db69679f4ffb1669c, vars: {}, profile: user, target: , version: 1.6.17
23:41:52 Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
23:41:52 Partial parsing enabled, no changes found, skipping parsing
23:41:52 Found 1 model, 0 sources, 0 exposures, 0 metrics, 476 macros, 0 groups, 0 semantic models
23:41:52 
23:41:52 Acquiring new databricks connection 'master'
23:41:52 Acquiring new databricks connection 'list_sanny_dev'
23:41:52 Using databricks connection "list_sanny_dev"
23:41:52 On list_sanny_dev: GetSchemas(database=sanny_dev, schema=None)
23:41:52 Opening a new connection, currently in state init
23:41:53 SQL status: OK in 0.8700000047683716 seconds
23:41:53 On list_sanny_dev: Close
23:41:53 Re-using an available connection from the pool (formerly list_sanny_dev, now list_sanny_dev_dbt_sprawira)
23:41:53 Using databricks connection "list_sanny_dev_dbt_sprawira"
23:41:53 On list_sanny_dev_dbt_sprawira: GetTables(database=sanny_dev, schema=dbt_sprawira, identifier=None)
23:41:53 Opening a new connection, currently in state closed
23:41:54 SQL status: OK in 0.6499999761581421 seconds
23:41:54 Spark adapter: NotImplemented: add_begin_query
23:41:54 Using databricks connection "list_sanny_dev_dbt_sprawira"
23:41:54 On list_sanny_dev_dbt_sprawira: /* {"app": "dbt", "dbt_version": "1.6.17", "dbt_databricks_version": "1.6.9", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "connection_name": "list_sanny_dev_dbt_sprawira"} */

      select current_catalog()

23:41:54 SQL status: OK in 0.2199999988079071 seconds
23:41:54 Using databricks connection "list_sanny_dev_dbt_sprawira"
23:41:54 On list_sanny_dev_dbt_sprawira: /* {"app": "dbt", "dbt_version": "1.6.17", "dbt_databricks_version": "1.6.9", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "connection_name": "list_sanny_dev_dbt_sprawira"} */
show views in `sanny_dev`.`dbt_sprawira`

23:41:54 SQL status: OK in 0.3700000047683716 seconds
23:41:54 On list_sanny_dev_dbt_sprawira: ROLLBACK
23:41:54 Databricks adapter: NotImplemented: rollback
23:41:54 On list_sanny_dev_dbt_sprawira: Close
23:41:54 Spark adapter: NotImplemented: add_begin_query
23:41:54 Spark adapter: NotImplemented: commit
23:41:54 Concurrency: 4 threads (target='default')
23:41:54 
23:41:54 Re-using an available connection from the pool (formerly list_sanny_dev_dbt_sprawira, now model.my_new_project.foo)
23:41:54 Using databricks connection "model.my_new_project.foo"
23:41:54 On model.my_new_project.foo: /* {"app": "dbt", "dbt_version": "1.6.17", "dbt_databricks_version": "1.6.9", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "node_id": "model.my_new_project.foo"} */

      describe extended `sanny_dev`.`dbt_sprawira`.`foo`

23:41:55 Using databricks connection "model.my_new_project.foo"
23:41:55 On model.my_new_project.foo: /* {"app": "dbt", "dbt_version": "1.6.17", "dbt_databricks_version": "1.6.9", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "node_id": "model.my_new_project.foo"} */

        create or replace table `sanny_dev`.`dbt_sprawira`.`foo`

    using delta
     comment 'this is foo model'
      as
      select 1 as id

23:41:59 Using databricks connection "model.my_new_project.foo"
23:41:59 On model.my_new_project.foo: /* {"app": "dbt", "dbt_version": "1.6.17", "dbt_databricks_version": "1.6.9", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "node_id": "model.my_new_project.foo"} */

        alter table `sanny_dev`.`dbt_sprawira`.`foo` change column
            wrong_id
            comment 'this is column id';

23:42:00 On master: ROLLBACK
23:42:00 Opening a new connection, currently in state init
23:42:00 Databricks adapter: NotImplemented: rollback
23:42:00 Spark adapter: NotImplemented: add_begin_query
23:42:00 Spark adapter: NotImplemented: commit
23:42:00 On master: ROLLBACK
23:42:00 Databricks adapter: NotImplemented: rollback
23:42:00 On master: Close
23:42:00 Connection 'master' was properly closed.
23:42:00 Connection 'model.my_new_project.foo' was properly closed.
23:42:00 
23:42:00 Finished running 1 table model in 0 hours 0 minutes and 7.72 seconds (7.72s).
23:42:00 Command end result
23:42:00 
23:42:00 Completed with 1 error and 0 warnings:
23:42:00 
23:42:00   Runtime Error in model foo (models/foo.sql)
  Missing field wrong_id in table sanny_dev.dbt_sprawira.foo with schema:
  root
   |-- id: integer (nullable = true)
  .; line 4 pos 8
23:42:00 
23:42:00 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
23:42:00 Command `cli build` failed at 23:42:00.487398 after 7.93 seconds

The one WITHOUT error:

23:44:46 Running with dbt=1.7.17
23:44:46 running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'fail_fast': 'False', 'debug': 'False', 'log_path': '/usr/src/dbt-server-shared/working_dir/6eb1dcc5-e938-45bc-9bf5-f479632c63b1', 'profiles_dir': '/usr/src/dbt-server-shared/.dbt', 'version_check': 'True', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'introspect': 'True', 'log_format': 'json', 'invocation_command': 'dbt -A dbt_worker.app worker --loglevel=DEBUG --concurrency=2 --max-memory-per-child=500000', 'target_path': 'None', 'static_parser': 'True', 'send_anonymous_usage_stats': 'True'}
23:44:46 Registered adapter: databricks=1.7.17
23:44:46 checksum: 45811e3fdd324bd8a0c43537bc1f01535ae2ada6c84b13970de72bb3ca0a8529, vars: {}, profile: user, target: , version: 1.7.17
23:44:46 Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
23:44:46 Partial parsing enabled, no changes found, skipping parsing
23:44:46 Found 1 model, 0 sources, 0 exposures, 0 metrics, 538 macros, 0 groups, 0 semantic models
23:44:46 
23:44:46 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=0s, acquire-count=0, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Creating connection
23:44:46 Acquiring new databricks connection 'master'
23:44:46 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=3.814697265625e-06s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Acquired connection on thread (60, 140098188846912), using default compute resource
23:44:46 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=None, name=list_sanny_dev, idle-time=0s, acquire-count=0, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Creating connection
23:44:46 Acquiring new databricks connection 'list_sanny_dev'
23:44:46 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=None, name=list_sanny_dev, idle-time=2.6226043701171875e-06s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Acquired connection on thread (60, 140094157604608), using default compute resource
23:44:46 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=None, name=list_sanny_dev, idle-time=0.00041294097900390625s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Checking idleness
23:44:46 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=None, name=list_sanny_dev, idle-time=0.0007655620574951172s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Retrieving connection
23:44:46 Using databricks connection "list_sanny_dev"
23:44:46 On list_sanny_dev: GetSchemas(database=sanny_dev, schema=None)
23:44:46 Opening a new connection, currently in state init
23:44:46 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev, idle-time=4.0531158447265625e-06s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Connection created
23:44:46 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=Unknown) - Created cursor
23:44:47 SQL status: OK in 0.6700000166893005 seconds
23:44:47 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=01ef56a9-5c2d-1121-afb9-29d9cc72ae23) - Closing cursor
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev, idle-time=4.76837158203125e-06s, acquire-count=0, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Released connection
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev, idle-time=0.001550912857055664s, acquire-count=0, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Checking idleness
23:44:47 Re-using an available connection from the pool (formerly list_sanny_dev, now list_sanny_dev_dbt_sprawira)
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.0024046897888183594s, acquire-count=0, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Reusing connection previously named list_sanny_dev
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.002774953842163086s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Acquired connection on thread (60, 140094157604608), using default compute resource
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.006687641143798828s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Checking idleness
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.007129192352294922s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Retrieving connection
23:44:47 Using databricks connection "list_sanny_dev_dbt_sprawira"
23:44:47 On list_sanny_dev_dbt_sprawira: GetTables(database=sanny_dev, schema=dbt_sprawira, identifier=None)
23:44:47 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=Unknown) - Created cursor
23:44:47 SQL status: OK in 0.550000011920929 seconds
23:44:47 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=01ef56a9-5c7e-1a99-8e20-abc09c1a16a7) - Closing cursor
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.5720198154449463s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Checking idleness
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.5724983215332031s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Retrieving connection
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.5728554725646973s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Checking idleness
23:44:47 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.5731868743896484s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Retrieving connection
23:44:47 Spark adapter: NotImplemented: add_begin_query
23:44:47 Using databricks connection "list_sanny_dev_dbt_sprawira"
23:44:47 On list_sanny_dev_dbt_sprawira: /* {"app": "dbt", "dbt_version": "1.7.17", "dbt_databricks_version": "1.7.17", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "connection_name": "list_sanny_dev_dbt_sprawira"} */

      select current_catalog()

23:44:47 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=Unknown) - Created cursor
23:44:48 SQL status: OK in 0.2199999988079071 seconds
23:44:48 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=01ef56a9-5cda-19bc-af6b-7990cc3cf215) - Closing cursor
23:44:48 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.7985682487487793s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Checking idleness
23:44:48 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=0.7990405559539795s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Retrieving connection
23:44:48 Using databricks connection "list_sanny_dev_dbt_sprawira"
23:44:48 On list_sanny_dev_dbt_sprawira: /* {"app": "dbt", "dbt_version": "1.7.17", "dbt_databricks_version": "1.7.17", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "connection_name": "list_sanny_dev_dbt_sprawira"} */
show views in `sanny_dev`.`dbt_sprawira`

23:44:48 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=Unknown) - Created cursor
23:44:48 SQL status: OK in 0.2800000011920929 seconds
23:44:48 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=01ef56a9-5cf8-1500-a162-2207fa311c5e) - Closing cursor
23:44:48 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=1.0913498401641846s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Checking idleness
23:44:48 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=1.092033863067627s, acquire-count=1, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Retrieving connection
23:44:48 Using databricks connection "list_sanny_dev_dbt_sprawira"
23:44:48 On list_sanny_dev_dbt_sprawira: /* {"app": "dbt", "dbt_version": "1.7.17", "dbt_databricks_version": "1.7.17", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "default", "connection_name": "list_sanny_dev_dbt_sprawira"} */
show table extended in `sanny_dev`.`dbt_sprawira` like '*'

23:44:48 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=Unknown) - Created cursor
23:44:49 SQL status: OK in 1.309999942779541 seconds
23:44:49 Databricks adapter: Cursor(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, command-id=01ef56a9-5d25-11fa-a883-c4c4a2a9a67a) - Closing cursor
23:44:49 Databricks adapter: DatabricksDBTConnection(id=140096498051728, session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377, name=list_sanny_dev_dbt_sprawira, idle-time=3.337860107421875e-06s, acquire-count=0, language=None, thread-identifier=(60, 140094157604608), compute-name=) - Released connection
23:44:49 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=3.0946531295776367s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Checking idleness
23:44:49 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=3.095033645629883s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Retrieving connection
23:44:49 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=3.0954012870788574s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Checking idleness
23:44:49 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=3.095738649368286s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Retrieving connection
23:44:49 Spark adapter: NotImplemented: add_begin_query
23:44:49 Spark adapter: NotImplemented: commit
23:44:49 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=2.384185791015625e-06s, acquire-count=0, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Released connection
23:44:49 Concurrency: 4 threads (target='default')
23:44:49 
23:44:49 Re-using an available connection from the pool (formerly list_sanny_dev_dbt_sprawira, now model.my_new_project.foo)
23:44:49 Using databricks connection "model.my_new_project.foo"
23:44:50 Using databricks connection "model.my_new_project.foo"
23:44:53 Using databricks connection "model.my_new_project.foo"
23:44:53 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=4.302577257156372s, acquire-count=0, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Checking idleness
23:44:53 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=4.303051948547363s, acquire-count=0, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Reusing connection previously named master
23:44:53 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=4.303421974182129s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Acquired connection on thread (60, 140098188846912), using default compute resource
23:44:53 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=4.303831577301025s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Checking idleness
23:44:53 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=None, name=master, idle-time=4.304159641265869s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Retrieving connection
23:44:53 On master: ROLLBACK
23:44:53 Opening a new connection, currently in state init
23:44:54 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=01ef56a9-6080-1012-8432-5004d219311a, name=master, idle-time=2.86102294921875e-06s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Connection created
23:44:54 Databricks adapter: NotImplemented: rollback
23:44:54 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=01ef56a9-6080-1012-8432-5004d219311a, name=master, idle-time=0.0009474754333496094s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Checking idleness
23:44:54 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=01ef56a9-6080-1012-8432-5004d219311a, name=master, idle-time=0.0012879371643066406s, acquire-count=1, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Retrieving connection
23:44:54 Spark adapter: NotImplemented: add_begin_query
23:44:54 Spark adapter: NotImplemented: commit
23:44:54 Databricks adapter: DatabricksDBTConnection(id=140096501658896, session-id=01ef56a9-6080-1012-8432-5004d219311a, name=master, idle-time=1.6689300537109375e-06s, acquire-count=0, language=None, thread-identifier=(60, 140098188846912), compute-name=) - Released connection
23:44:54 Connection 'master' was properly closed.
23:44:54 On master: ROLLBACK
23:44:54 Databricks adapter: NotImplemented: rollback
23:44:54 On master: Close
23:44:54 Databricks adapter: Connection(session-id=01ef56a9-6080-1012-8432-5004d219311a) - Closing connection
23:44:54 Connection 'model.my_new_project.foo' was properly closed.
23:44:54 On model.my_new_project.foo: ROLLBACK
23:44:54 Databricks adapter: NotImplemented: rollback
23:44:54 On model.my_new_project.foo: Close
23:44:54 Databricks adapter: Connection(session-id=01ef56a9-5c1f-1aa3-ba39-d308e0740377) - Closing connection
23:44:54 
23:44:54 Finished running 1 table model in 0 hours 0 minutes and 7.56 seconds (7.56s).
23:44:54 Command end result
23:44:54 
23:44:54 Completed successfully
23:44:54 
23:44:54 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
23:44:54 Resource report: {"command_name": "build", "command_success": true, "command_wall_clock_time": 7.6343913, "process_user_time": 4.924903, "process_kernel_time": 3.007684, "process_mem_max_rss": "359596", "process_out_blocks": "5880", "process_in_blocks": "0"}
23:44:54 Command `cli build` succeeded at 23:44:54.164519 after 7.63 seconds

System information

dbt --version: dbt Cloud - tested with dbt version 1.6 and 1.7

The operating system you're using: macOS

The output of python --version:

Additional context

It does not matter whether or not there are column descriptions in schema.yml - as long as persist_docs is enabled and the columns listed in schema.yml does not match the columns found in the built models, the error will show (dbt v1.6).

However, if we change the materialization to view - error does not arise (even in dbt v1.6).

benc-db commented 1 month ago

Thanks for reporting. There is currently a PR for improving column comment materialization, I will check with author if it addresses this issue.

benc-db commented 1 month ago

Ok, #750 didn't fix this particular issue so we'll need to dig deeper into what changed. Thanks for reporting.