Closed roshravoof closed 3 weeks ago
Thanks for reporting this @roshravoof !
I was able to replicate what you described.
This looks like it only affects dbt-bigquery (and not dbt-postgres, dbt-snowflake, etc.), so I'm going to transfer this issue to the dbt-bigquery repo.
Create these files:
seeds/mappings.csv
id|alpha
1|A
2|B
3|C
seeds/_seeds.yml
seeds:
  - name: mappings
    config:
      delimiter: '|'
Run these commands:
dbt seed
See this output in dbt 1.6:
$ dbt seed
12:47:00 Running with dbt=1.6.5
12:47:34 Registered adapter: bigquery=1.6.9
12:47:34 Unable to do partial parsing because saved manifest not found. Starting full parse.
12:47:35 Found 1 model, 1 seed, 0 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
12:47:35
12:47:59 Concurrency: 10 threads (target='blue')
12:47:59
12:47:59 1 of 1 START seed file dbt_dbeatty.mappings .................................... [RUN]
12:48:05 1 of 1 OK loaded seed file dbt_dbeatty.mappings ................................ [INSERT 3 in 5.95s]
12:48:05
12:48:05 Finished running 1 seed in 0 hours 0 minutes and 30.38 seconds (30.38s).
12:48:05
12:48:05 Completed successfully
12:48:05
12:48:05 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
See this output in dbt 1.7 and 1.8:
$ dbt seed
12:48:57 Running with dbt=1.7.11
12:48:59 Registered adapter: bigquery=1.7.8
12:48:59 Unable to do partial parsing because saved manifest not found. Starting full parse.
12:49:00 Found 1 model, 1 seed, 0 sources, 0 exposures, 0 metrics, 454 macros, 0 groups, 0 semantic models
12:49:00
12:49:33 Concurrency: 10 threads (target='blue')
12:49:33
12:49:33 1 of 1 START seed file dbt_dbeatty.mappings .................................... [RUN]
12:49:36 1 of 1 ERROR loading seed file dbt_dbeatty.mappings ............................ [ERROR in 3.50s]
12:49:36
12:49:36 Finished running 1 seed in 0 hours 0 minutes and 36.08 seconds (36.08s).
12:49:36
12:49:36 Completed with 1 error and 0 warnings:
12:49:36
12:49:36 Runtime Error in seed mappings (seeds/mappings.csv)
Error while reading data, error message: CSV processing encountered too many errors, giving up. Rows: 0; errors: 3; max bad: 0; error percent: 0
Error while reading data, error message: CSV table references column position 1, but line contains only 1 columns.; line_number: 2 byte_offset_to_start_of_line: 9 column_index: 1 column_name: "alpha" column_type: STRING
Error while reading data, error message: CSV table references column position 1, but line contains only 1 columns.; line_number: 3 byte_offset_to_start_of_line: 13 column_index: 1 column_name: "alpha" column_type: STRING
Error while reading data, error message: CSV table references column position 1, but line contains only 1 columns.; line_number: 4 byte_offset_to_start_of_line: 17 column_index: 1 column_name: "alpha" column_type: STRING
You are loading data without specifying data format, data will be treated as CSV format by default. If this is not what you mean, please specify data format by --source_format.
12:49:36
12:49:36 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
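The "line contains only 1 columns" errors above are what you would expect if the file were being split on the default comma rather than on the configured pipe. Here is a minimal sketch of that effect using only Python's standard csv module. It is not dbt or BigQuery code; the `column_counts` helper is hypothetical, and whether this is the actual cause in dbt-bigquery is an assumption.

```python
import csv
import io

# Contents of seeds/mappings.csv from the reproduction above.
SEED_TEXT = "id|alpha\n1|A\n2|B\n3|C\n"


def column_counts(text, delimiter):
    """Return the number of fields found on each line for the given delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader]


# Split on the configured pipe delimiter: two fields per line (id, alpha).
pipe_counts = column_counts(SEED_TEXT, "|")

# Split on the default comma delimiter: each line collapses into one field,
# so a second column ("alpha", position 1) cannot be found.
comma_counts = column_counts(SEED_TEXT, ",")

print(pipe_counts)
print(comma_counts)
```

With the pipe delimiter every line yields two fields. With the comma delimiter every line yields one, which matches the shape of the error reported for lines 2 to 4.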
@dbeatty10 is this still an issue? It works with the latest version. Is there anything I can do on this issue?
Took a look at this. What's interesting is that it seems to work if you set it in dbt_project.yml like:
seeds:
  jaffle_shop:
    mappings:
      config:
        delimiter: '|'
Will investigate what dbt-bigquery is handling differently here, and how.
So after much investigating, it's not clear exactly what broke this functionality in 1.7, but I can confirm it works in 1.6. This was already being fixed (see #1122) for the upcoming 1.9 release, but we'll look at backporting to 1.8 as well.
Is this a new bug in dbt-core?
Current Behavior
dbt seed does not accept a custom (pipe) delimiter in the seed configs.
The above seed config doesn't work in dbt version 1.7.18.
Expected Behavior
dbt seed should accept any custom or multiple delimiters in the seed configs. dbt seed should be able to process comma and pipe delimited files in the same project.
Steps To Reproduce
Set up dbt version 1.7.18 and Python version 3.11
Set up the seed config
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
bigquery
Additional Context
No response