Describe the bug
When executing dbt seed, the column types specified in the seed properties are not applied; the columns are created as string instead. dbt also raises an error saying it cannot infer the type of a column when that column is empty, even when its type is explicitly specified.
Steps To Reproduce
Run dbt seed with column types specified in the seed's .yml properties, as in the example below.
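For reference, a minimal seed properties file that triggers this might look like the following sketch (the seed name, column names, and types are illustrative, not taken from the report):

```yaml
# seeds/properties.yml -- hypothetical example
version: 2

seeds:
  - name: my_seed
    config:
      column_types:
        id: bigint
        created_at: date    # a date-typed or entirely empty column exposes the bug
        amount: decimal(18,2)
```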
Expected behavior
The column types specified in the properties should be applied to the columns.
Actual behaviour
The string type is applied to the columns instead.
System information
The output of dbt --version:
Core:
- installed: 1.8.7
- latest: 1.8.7 - Up to date!
Plugins:
- spark: 1.8.0 - Up to date!
The output of python --version:
Python 3.10.12
Additional context
It looks like the issue is that when a column_type is specified, dbt-core casts the value to a string so that the actual casting is handled by the adapter. Additionally, when the CSV data is passed via the executed statement, the Spark DataFrame is created without a schema, which makes it fail when a column is entirely empty; some data types, such as dates, are also lost in this conversion.
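Here is a minimal sketch of the schema-inference failure described above, assuming a local PySpark session; the column names and types are illustrative, and this is not the adapter's actual code:

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DateType

spark = SparkSession.builder.master("local[1]").getOrCreate()

# Two seed rows whose second column is entirely empty.
rows = [("1", None), ("2", None)]

# Without an explicit schema, Spark cannot infer the empty column's type:
# raises ValueError: Some of types cannot be determined after inferring.
try:
    spark.createDataFrame(rows)
except ValueError as exc:
    print(exc)

# With a schema built from the seed's column_types, the same rows load fine
# and the date type is preserved instead of being collapsed to string.
schema = StructType([
    StructField("id", StringType()),
    StructField("created_at", DateType()),
])
spark.createDataFrame(rows, schema=schema).printSchema()
```

This suggests that building the DataFrame with a schema derived from the seed's column_types would avoid both the inference error on empty columns and the loss of types such as dates.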