Open jdbodyfelt opened 1 year ago
A UTF-8 encoded CSV file is seeded with `dbt seed`. On reviewing the loaded table, the column encoding appears to have changed.
Create a CSV containing non-Latin UTF-8 characters (Arabic, Greek, etc.) and seed it with `dbt seed`.
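For anyone trying to reproduce this, here is a minimal sketch of how such a seed file could be generated. The file name `seeds/utf8_sample.csv`, the column names, and the sample strings are all hypothetical and not taken from the original report. The script writes the file as UTF-8 and reads it back locally, so any corruption seen after `dbt seed` can be attributed to the load rather than to the file itself.

```python
import csv
from pathlib import Path

# Hypothetical sample rows; any non-Latin text should work for the reproduction.
ROWS = [
    {"id": "1", "language": "Arabic", "greeting": "مرحبا"},
    {"id": "2", "language": "Greek", "greeting": "Γειά σου"},
]
FIELDS = ["id", "language", "greeting"]


def write_seed(path: Path) -> None:
    """Write ROWS to `path` as a UTF-8 encoded CSV with a header row."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(ROWS)


def read_seed(path: Path) -> list:
    """Read the CSV back as a list of dicts, decoding it as UTF-8."""
    with path.open("r", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # Assumed location: the default `seeds/` directory of a dbt project.
    seed_path = Path("seeds") / "utf8_sample.csv"
    write_seed(seed_path)
    # Local round trip: the file itself holds the strings unchanged.
    assert read_seed(seed_path) == ROWS
```

After running this, `dbt seed` followed by a query on the resulting table should show whether the strings survive the load.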
I expect a CSV to seed exactly what it contains, ESPECIALLY strings.
CSV: (screenshot not preserved)
Injection result: (screenshot not preserved)
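To narrow down what kind of change occurred, a small diagnostic like the one below may help. It is only a sketch under an assumption that this report does not confirm: that the corruption comes from the file's UTF-8 bytes being decoded with a single-byte codec somewhere along the way. The function name `guess_misdecoding` and the candidate codec list are illustrative choices.

```python
from typing import Optional, Sequence


def guess_misdecoding(
    original: str,
    loaded: str,
    candidates: Sequence[str] = ("latin-1", "cp1252"),
) -> Optional[str]:
    """Return the first codec that turns the UTF-8 bytes of `original`
    into `loaded`, or None if no candidate reproduces it."""
    raw = original.encode("utf-8")
    for codec in candidates:
        try:
            if raw.decode(codec) == loaded:
                return codec
        except UnicodeDecodeError:
            # Some codecs (e.g. cp1252) leave a few byte values undefined.
            continue
    return None


if __name__ == "__main__":
    source_value = "Γειά"
    # Simulated bad load: UTF-8 bytes interpreted as Latin-1.
    corrupted = source_value.encode("utf-8").decode("latin-1")
    print(guess_misdecoding(source_value, corrupted))
```

If the function returns `None` for the values actually seen in the warehouse, the cause is likely something other than a simple single-byte misdecoding.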
The output of `dbt --version`:

    Core:
      - installed: 1.4.6
    Plugins:
      - databricks: 1.4.3
The operating system you're using: Ubuntu 22.04.1 LTS
The output of `python --version`: Python 3.10.6
It would be great to have a seeds configuration option for column encoding, e.g.
    seeds:
      - name: <tableName>
        config:
          columns:
            - name: <columnName>
              dtype: <columnDatatype>
              encoding: <columnEncoding if STRING or VARCHAR>
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue.