duckdb / dbt-duckdb

dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
Apache License 2.0

Prefer pyarrow instead of pandas for python model materialization #93

Closed AlexanderVR closed 1 year ago

AlexanderVR commented 1 year ago

Currently, Python models that return a DuckDBPyRelation are materialized through pandas even if pyarrow is available. Because pandas DataFrames are not strongly typed, DuckDB has to infer the actual types of object columns from their values when loading a DataFrame. An empty VARCHAR column gives that inference nothing to work with, so it is incorrectly materialized with the default (INTEGER) type.

This PR switches the order so that pyarrow is preferred when it is available.
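To illustrate the ordering change described above, here is a minimal sketch of "prefer pyarrow, fall back to pandas". It is not the adapter's actual code: the function name `pick_backend` and its `is_available` parameter are hypothetical and exist only for this example. The one library fact it relies on is that a DuckDBPyRelation can be converted with either `.arrow()` or `.df()`.

```python
import importlib.util


def pick_backend(preferred=("pyarrow", "pandas"), is_available=None):
    """Return the first importable backend name in `preferred`, or None.

    Hypothetical helper for illustration only. `is_available` can be
    overridden to simulate environments where a package is missing.
    """
    if is_available is None:
        # find_spec returns None when a top-level module cannot be found.
        def is_available(name):
            return importlib.util.find_spec(name) is not None

    for name in preferred:
        if is_available(name):
            return name
    return None


backend = pick_backend()
# With a DuckDBPyRelation `rel`, the chosen backend would map to:
#   "pyarrow" -> rel.arrow()  (Arrow table, column types carried in the schema)
#   "pandas"  -> rel.df()     (DataFrame, object columns need type inference)
print(backend)
```

Putting pyarrow first in the tuple is the whole change in this sketch: the typed Arrow schema is used whenever possible, and the pandas path is only taken when pyarrow cannot be imported.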