dremio / dbt-dremio

dbt (data build tool) adapter for the Dremio
Apache License 2.0
44 stars 21 forks source link

[Bug]: dbt model "incremental" creates a _tmp table - want that to be created as iceberg table #208

Closed raghuj5222 closed 9 months ago

raghuj5222 commented 1 year ago

Is there an existing issue for this?

Current Behavior

When a dbt model is incremental - dbt creates a _tmp table during processing of the model.

It is not possible to pass the file_format as "iceberg" or set tblproperties -> table='iceberg'. This causes the table to be created as a parquet table and causes an error running the model

dbt tries to read the information_Schema and doesnt find a table in Dremio

step 1. create table "ECS S3"."asset360"."IcebergDesPoc"."staging_pb_volumes_v4__dbt_tmp" as ( select col from tableA)

-- the above creates a parquet rather than iceberg. can we pass config to define it as a iceberg table

step 2. select column_name as column_name ,lower(data_type) as data_type ,character_maximum_length ,numeric_precision ,numeric_scale from information_schema.columns where ilike(table_schema, 'ECS S3.asset360.IcebergDesPoc') and ilike(table_name, 'staging_pb_volumes_v4__dbt_tmp') order by ordinal_position

  1. insert into "ECS S3"."asset360"."IcebergDesPoc"."staging_pb_volumes_v4"( ) select from "ECS S3"."asset360"."IcebergDesPoc"."staging_pb_volumes_v4__dbt_tmp"

It expects a table - doesnt find it causing an error

Error:

ERROR: Failure parsing the query. 22:44:38.527134 [debug] [Thread-1 ]: Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '79961fbc-7f5a-4af4-b8c1-8c9834a81f79', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000002C5EE9DB040>]} 22:44:38.529159 [error] [Thread-1 ]: 1 of 1 ERROR creating sql incremental model ECS S3.asset360.IcebergDesPoc.staging_pb_volumes_v4 [ERROR in 486.59s] 22:44:38.530126 [debug] [Thread-1 ]: Finished running node model.naturalgasassets.staging_pb_volumes_v4

Expected Behavior

No response

Steps To Reproduce

{{ config( materialized='incremental', tblproperties= { 'table_type' :'ICEBERG' } ) }}

tblproperties doesnt work

Environment

- OS: Linux
- dbt-dremio:   dbt-dremio
- Dremio Software:23

Relevant log output

insert into "ECS S3"."asset360"."IcebergDesPoc"."staging_pb_volumes_v4"(  )
    select  from "ECS S3"."asset360"."IcebergDesPoc"."staging_pb_volumes_v4__dbt_tmp"

22:44:38.511176 [debug] [Thread-1  ]: On model.naturalgasassets.staging_pb_volumes_v4: ROLLBACK
22:44:38.512173 [debug] [Thread-1  ]: dremio adapter: Handle rollback not implemented.
22:44:38.513182 [debug] [Thread-1  ]: On model.naturalgasassets.staging_pb_volumes_v4: Close
22:44:38.515165 [debug] [Thread-1  ]: Timing info for model.naturalgasassets.staging_pb_volumes_v4 (execute): 22:36:31.986148 => 22:44:38.514168
22:44:38.526138 [debug] [Thread-1  ]: Runtime Error in model staging_pb_volumes_v4 (models\staging_pb_volumes_v4.sql)
  ERROR: Failure parsing the query.
22:44:38.527134 [debug] [Thread-1  ]: Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '79961fbc-7f5a-4af4-b8c1-8c9834a81f79', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000002C5EE9DB040>]}
22:44:38.529159 [error] [Thread-1  ]: 1 of 1 ERROR creating sql incremental model ECS S3.asset360.IcebergDesPoc.staging_pb_volumes_v4  [ERROR in 486.59s]
22:44:38.530126 [debug] [Thread-1  ]: Finished running node model.naturalgasassets.staging_pb_volumes_v4
22:44:38.533117 [debug] [MainThread]: On master: ROLLBACK
22:44:38.533117 [debug] [MainThread]: Opening a new connection, currently in state init
22:44:38.812209 [debug] [MainThread]: dremio adapter: Connected to db: Asset360
fabrice-etanchaud commented 1 year ago

Hi ! I don't remember dbt-dremio providing option for tmp table creation. Couldn't this be related to :

https://docs.dremio.com/current/reference/sql/commands/apache-iceberg-tables/apache-iceberg-create/

?

Could you please enable the option and retry ?

Best regards from French west coast. Fabrice