ClickHouse / dbt-clickhouse

The Clickhouse plugin for dbt (data build tool)
Apache License 2.0
249 stars 111 forks source link

DBT table materialisation does not work properly for CH 24.6.2.17 #335

Open rytis-molecule opened 2 months ago

rytis-molecule commented 2 months ago

Describe the bug

Table creation is using incorrect syntax using dbt 1.8.5 and CH 24.6.2.17 and dbt-clickhouse latest 1.8.1

`{{ config(order_by='_airbyte_extracted_at', engine='MergeTree()', materialized='incremental') }}

SELECT 
    _airbyte_extracted_at,
    JSONExtract(_airbyte_data, 'campaign_id', 'String') AS campaign_id,
    JSONExtract(_airbyte_data, 'target_id', 'String') AS target_id,
    JSONExtract(_airbyte_data, 'publisher_id', 'String') AS publisher_id,
    JSONExtract(_airbyte_data, 'label_id', 'String') AS label_id,
    JSONExtract(_airbyte_data, 'calls', 'Int32') AS calls,
    JSONExtract(_airbyte_data, 'revenue', 'Float64') AS revenue,
    JSONExtract(_airbyte_data, 'converted_calls', 'Int32') AS converted_calls,
    JSONExtract(_airbyte_data, 'total_cost', 'Float64') AS total_cost,
    JSONExtract(_airbyte_data, 'call_length_in_seconds', 'Int32') AS call_length_in_seconds,
    JSONExtract(_airbyte_data, 'rounded_minutes', 'Int32') AS rounded_minutes,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_created_at', 'String')) AS etl_created_at,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_updated_at', 'String')) AS etl_updated_at
FROM 
    airbyte_internal.default_raw__stream_ringba_data`

here is my dbt code snippet Here is what gets generated: create OR replace tabledbt_rytis.ringba_data__dbt_backup`

engine = MergeTree() order by (_airbyte_extracted_at)

                -- end_of_sql
                SETTINGS  replicated_deduplication_window=0

        empty
      as 

(

WITH rw AS ( SELECT _airbyte_extracted_at, JSONExtract(_airbyte_data, 'campaign_id', 'String') AS campaign_id, JSONExtract(_airbyte_data, 'target_id', 'String') AS target_id, JSONExtract(_airbyte_data, 'publisher_id', 'String') AS publisher_id, JSONExtract(_airbyte_data, 'label_id', 'String') AS label_id, JSONExtract(_airbyte_data, 'calls', 'Int32') AS calls, JSONExtract(_airbyte_data, 'revenue', 'Float64') AS revenue, JSONExtract(_airbyte_data, 'converted_calls', 'Int32') AS converted_calls, JSONExtract(_airbyte_data, 'total_cost', 'Float64') AS total_cost, JSONExtract(_airbyte_data, 'call_length_in_seconds', 'Int32') AS call_length_in_seconds, JSONExtract(_airbyte_data, 'rounded_minutes', 'Int32') AS rounded_minutes, parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_created_at', 'String')) AS etl_created_at, parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_updated_at', 'String')) AS etl_updated_at FROM airbyte_internal.default_raw__stream_ringba_data ) SELECT _airbyte_extracted_at, campaign_id, target_id, publisher_id, label_id, calls, revenue, converted_calls, total_cost, call_length_in_seconds, rounded_minutes, etl_created_at, etl_updated_at FROM rw ORDER BY _airbyte_extracted_at; )`

I get an error:xpected one of: list of elements, insert element, COLUMNS matcher, COLUMNS, qualified asterisk, compound identifier, identifier, asterisk. (SYNTAX_ERROR)

If I remove the last parenthesis after ; and the one opening, then the statment works correctly.

I guess there is a new syntax where when creating from as you do not need parenthesis anymore.

rytis-molecule commented 2 months ago

`{{ config(order_by='_airbyte_extracted_at', engine='MergeTree()', materialized='table') }}

SELECT 
    _airbyte_extracted_at,
    JSONExtract(_airbyte_data, 'campaign_id', 'String') AS campaign_id,
    JSONExtract(_airbyte_data, 'target_id', 'String') AS target_id,
    JSONExtract(_airbyte_data, 'publisher_id', 'String') AS publisher_id,
    JSONExtract(_airbyte_data, 'label_id', 'String') AS label_id,
    JSONExtract(_airbyte_data, 'calls', 'Int32') AS calls,
    JSONExtract(_airbyte_data, 'revenue', 'Float64') AS revenue,
    JSONExtract(_airbyte_data, 'converted_calls', 'Int32') AS converted_calls,
    JSONExtract(_airbyte_data, 'total_cost', 'Float64') AS total_cost,
    JSONExtract(_airbyte_data, 'call_length_in_seconds', 'Int32') AS call_length_in_seconds,
    JSONExtract(_airbyte_data, 'rounded_minutes', 'Int32') AS rounded_minutes,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_created_at', 'String')) AS etl_created_at,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_updated_at', 'String')) AS etl_updated_at
FROM 
    airbyte_internal.default_raw__stream_ringba_data

`

Attaching the log ` ... 10:36:08.760302 [debug] [Thread-1 (]: dbt_clickhouse adapter: SQL status: OK in 0.31 seconds 10:36:08.761864 [debug] [Thread-1 (]: Writing runtime sql for node "model.inulti.ringba_data" 10:36:08.763094 [debug] [Thread-1 (]: dbt_clickhouse adapter: On model.inulti.ringba_data: / {"app": "dbt", "dbt_version": "1.8.5", "profile_name": "inulti", "target_name": "dev", "node_id": "model.inulti.ringba_data"} /

    insert into `dbt_rytis`.`ringba_data__dbt_backup`
    ()

SELECT 
    _airbyte_extracted_at,
    JSONExtract(_airbyte_data, 'campaign_id', 'String') AS campaign_id,
    JSONExtract(_airbyte_data, 'target_id', 'String') AS target_id,
    JSONExtract(_airbyte_data, 'publisher_id', 'String') AS publisher_id,
    JSONExtract(_airbyte_data, 'label_id', 'String') AS label_id,
    JSONExtract(_airbyte_data, 'calls', 'Int32') AS calls,
    JSONExtract(_airbyte_data, 'revenue', 'Float64') AS revenue,
    JSONExtract(_airbyte_data, 'converted_calls', 'Int32') AS converted_calls,
    JSONExtract(_airbyte_data, 'total_cost', 'Float64') AS total_cost,
    JSONExtract(_airbyte_data, 'call_length_in_seconds', 'Int32') AS call_length_in_seconds,
    JSONExtract(_airbyte_data, 'rounded_minutes', 'Int32') AS rounded_minutes,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_created_at', 'String')) AS etl_created_at,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_updated_at', 'String')) AS etl_updated_at
FROM 
    airbyte_internal.default_raw__stream_ringba_data

... 10:36:09.077617 [debug] [Thread-1 (]: dbt_clickhouse adapter: Error running SQL: / {"app": "dbt", "dbt_version": "1.8.5", "profile_name": "inulti", "target_name": "dev", "node_id": "model.inulti.ringba_data"} /

    insert into `dbt_rytis`.`ringba_data__dbt_backup`
    ()

SELECT 
    _airbyte_extracted_at,
    JSONExtract(_airbyte_data, 'campaign_id', 'String') AS campaign_id,
    JSONExtract(_airbyte_data, 'target_id', 'String') AS target_id,
    JSONExtract(_airbyte_data, 'publisher_id', 'String') AS publisher_id,
    JSONExtract(_airbyte_data, 'label_id', 'String') AS label_id,
    JSONExtract(_airbyte_data, 'calls', 'Int32') AS calls,
    JSONExtract(_airbyte_data, 'revenue', 'Float64') AS revenue,
    JSONExtract(_airbyte_data, 'converted_calls', 'Int32') AS converted_calls,
    JSONExtract(_airbyte_data, 'total_cost', 'Float64') AS total_cost,
    JSONExtract(_airbyte_data, 'call_length_in_seconds', 'Int32') AS call_length_in_seconds,
    JSONExtract(_airbyte_data, 'rounded_minutes', 'Int32') AS rounded_minutes,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_created_at', 'String')) AS etl_created_at,
    parseDateTimeBestEffort(JSONExtract(_airbyte_data, 'etl_updated_at', 'String')) AS etl_updated_at
FROM 
    airbyte_internal.default_raw__stream_ringba_data

10:36:09.084809 [debug] [Thread-1 (]: Database Error in model ringba_data (models/staging/ringba_data.sql) HTTPDriver for http://clickhouse.bi.inulti.internal:80 returned response code 400) Code: 62. DB::Exception: Syntax error: failed at position 228 (')') (line 9, col 10): )

  SELECT 
      _airbyte_extracted_at,
      JSONExtract(_airbyte_data, 'campaign_id', 'String') AS campaign_id,
      JSONExtract(_airbyte_data, 't. Expected one of: list of elements, insert element, COLUMNS matcher, COLUMNS, qualified asterisk, compound identifier, identifier, asterisk. (SYNTAX_ERROR) (version 24.6.2.17 (official build))

compiled Code at target/run/inulti/models/staging/ringba_data.sql 10:36:09.086862 [debug] [Thread-1 (]: Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '19baeb66-1b91-475e-b3a8-48af5e09c2b7', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10387bca0>]} 10:36:09.087686 [error] [Thread-1 (]: 1 of 1 ERROR creating sql table model dbt_rytis.ringba_data ................ [ERROR in 1.17s] 10:36:09.088405 [debug] [Thread-1 (]: Finished running node model.inulti.ringba_data 10:36:09.090270 [debug] [MainThread]: Connection 'master' was properly closed. 10:36:09.090645 [debug] [MainThread]: Connection 'model.inulti.ringba_data' was left open. 10:36:09.091005 [debug] [MainThread]: On model.inulti.ringba_data: Close 10:36:09.091425 [info ] [MainThread]: 10:36:09.091778 [info ] [MainThread]: Finished running 1 table model in 0 hours 0 minutes and 4.33 seconds (4.33s). 10:36:09.092413 [debug] [MainThread]: Command end result 10:36:09.106956 [info ] [MainThread]: 10:36:09.107233 [info ] [MainThread]: Completed with 1 error and 0 warnings: 10:36:09.107390 [info ] [MainThread]: 10:36:09.107586 [error] [MainThread]: Database Error in model ringba_data (models/staging/ringba_data.sql) HTTPDriver for http://clickhouse.bi.inulti.internal:80 returned response code 400) Code: 62. DB::Exception: Syntax error: failed at position 228 (')') (line 9, col 10): )

  SELECT 
      _airbyte_extracted_at,
      JSONExtract(_airbyte_data, 'campaign_id', 'String') AS campaign_id,
      JSONExtract(_airbyte_data, 't. Expected one of: list of elements, insert element, COLUMNS matcher, COLUMNS, qualified asterisk, compound identifier, identifier, asterisk. (SYNTAX_ERROR) (version 24.6.2.17 (official build))

compiled Code at target/run/inulti/models/staging/ringba_data.sql 10:36:09.107744 [info ] [MainThread]: 10:36:09.107887 [info ] [MainThread]: Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1 10:36:09.109242 [debug] [MainThread]: Resource report: {"command_name": "run", "command_wall_clock_time": 5.2218485, "process_user_time": 1.448955, "process_kernel_time": 0.125629, "process_mem_max_rss": "120274944", "command_success": false, "process_in_blocks": "0", "process_out_blocks": "0"} 10:36:09.109493 [debug] [MainThread]: Command dbt run failed at 10:36:09.109454 after 5.22 seconds 10:36:09.109678 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x102b42c50>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1044fa6b0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10529cbb0>]} 10:36:09.109850 [debug] [MainThread]: Flushing usage events`