Open klmr opened 1 year ago
Would you mind making this a self-contained reprex?
Apologies, I’ve no idea why I didn’t initially post it as one.
writeLines(
  c('A,B,C,D', 'foo,bar,"ba', 'z,bat'),
  'x.csv'
)
vroom::vroom('x.csv', show_col_types = FALSE)
#> # A tibble: 0 × 4
#> # ℹ 4 variables: A <chr>, B <chr>, C <chr>, D <chr>
data.table::fread('x.csv')
#> Warning in data.table::fread("x.csv"): Detected 4 column names but the data has
#> 3 columns. Filling rows automatically. Set fill=TRUE explicitly to avoid this
#> warning.
#>      A   B          C  D
#> 1: foo bar "ba\nz,bat NA
Thanks! Here is a slightly simpler version, focused on the specific problem:
vroom::vroom(
  I(c('A,B,C', 'd,e,"f', 'g,h,i')),
  quote = '"',
  show_col_types = FALSE
)
#> # A tibble: 0 × 3
#> # ℹ 3 variables: A <chr>, B <chr>, C <chr>
Created on 2023-08-02 with reprex v2.0.2
See the blog post and the discussion it triggered.
Reprex:
Of course, CSV encompasses various formats, but even so it’s not clear why ‘vroom’ treats the file as valid yet containing no data rows (despite the file having more than one line). I therefore assume this is unintentional (I can’t think of a situation where silently dropping rows would be the expected behaviour).
From a user perspective, there are two likely scenarios:
quote = "" should have been passed.
(2) is a user error and can thus be ignored here (in fact, passing quote = "" leads to warnings on the above file, which is expected). (1) should ideally generate a warning or even an error.
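Until something like the warning suggested for (1) exists, a pre-check along the following lines might help catch such files before parsing. This is only a rough sketch in base R: has_unbalanced_quotes() is a hypothetical helper, not part of ‘vroom’, and it assumes that quotes inside fields are escaped by doubling them (so a well-formed file always contains an even number of quote characters). Files using backslash-escaped quotes would need a different check.

```r
# Hypothetical helper (not part of vroom): returns TRUE if the file contains
# an odd number of quote characters, i.e. some quoted field is never closed.
# Assumes embedded quotes are escaped by doubling ("").
has_unbalanced_quotes <- function(path, quote = '"') {
  lines <- readLines(path, warn = FALSE)
  # gregexpr() returns -1 for lines without a match, so count positions > 0.
  matches <- gregexpr(quote, lines, fixed = TRUE)
  n_quotes <- sum(vapply(matches, function(m) sum(m > 0L), integer(1)))
  n_quotes %% 2L == 1L
}

# Same content as the reduced reprex above, written to a temporary file.
path <- tempfile(fileext = '.csv')
writeLines(c('A,B,C', 'd,e,"f', 'g,h,i'), path)

if (has_unbalanced_quotes(path)) {
  message('Odd number of quote characters in ', path,
          '; a quoted field is probably unterminated.')
}
```

A caller could turn the message into a warning or an error, or retry the read with quote = "" if the quote characters are known to be literal data.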