databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0

Runtime error with contracts when specifying struct as data_type #720

Open ferdyh opened 4 months ago

ferdyh commented 4 months ago

Describe the bug

When I specify a column with data_type struct in a model contract, I get a runtime error.

Steps To Reproduce

  1. Created a model that has a struct column.
  2. Added a .yml file for the model and enabled a contract on it.
  3. Set the column's data_type to struct.
  4. Ran dbt build on that model.

Expected behavior

The contract should accept the struct data type without raising an error.

Screenshots and log output

10:13:42 Runtime Error in model [model_name] (models\[model_name].sql) struct (of class java.lang.String)

System information

The output of dbt --version:

Core:
  - installed: 1.8.3
  - latest:    1.8.3 - Up to date!

Plugins:
  - databricks: 1.8.3 - Up to date!
  - spark:      1.8.0 - Up to date!

Also broken in 1.7.13 and 1.7.17

The operating system you're using: Windows 11

The output of python --version: Python 3.11.9

ferdyh commented 4 months ago

It looks like the generated query contains cast(null as struct) as [column_name], which is invalid. When I define the whole struct type, it works.
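To make the observation above concrete, here is a minimal Python sketch of the expression shape described in this comment. It is not the adapter's actual code: `null_cast` and `is_bare_complex` are hypothetical helpers, the column name `address` and its fields are made up for illustration, and the check assumes that complex types (struct, array, map) must carry their field or element types in angle brackets to be valid in a cast.

```python
import re

# Hypothetical helper: reproduces the shape of the generated expression
# reported in this thread, not the adapter's real macro.
def null_cast(column_name: str, data_type: str) -> str:
    return f"cast(null as {data_type}) as {column_name}"

# Assumption: a complex type written without <...> parameters cannot be
# used as a cast target, so it is flagged before any SQL is generated.
BARE_COMPLEX = re.compile(r"^\s*(struct|array|map)\s*$", re.IGNORECASE)

def is_bare_complex(data_type: str) -> bool:
    return BARE_COMPLEX.match(data_type) is not None

# The failing case from the report: data_type is just "struct".
bare = null_cast("address", "struct")

# The workaround from the report: spell out the whole struct type.
full = null_cast("address", "struct<street: string, zip: string>")

for data_type in ("struct", "struct<street: string, zip: string>"):
    status = "needs full type" if is_bare_complex(data_type) else "ok"
    print(f"{data_type!r}: {status}")
```

A check like this could, in principle, let a contract fail early with a readable message instead of surfacing the Java-side error shown in the log, though whether that fits the adapter's design is an open question.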

benc-db commented 4 months ago

Thanks for the report. If you'd like to work on a PR to address this, I'd be happy to help; otherwise, due to resource constraints, this bug will probably not be fixed until the next time we work on improving column-specific feature support, hopefully later this year.

TBoris commented 4 months ago

There is an existing implementation for BigQuery; it would be nice if nested columns looked the same across different providers.