Open rmgpanw opened 1 year ago
I experienced the same issue with read_csv(). A single double quote preceding one of the values caused the remaining records to be dropped. Only the warning from fread() saved me.
Here's a somewhat more minimal reprex:
library(readr)
lines <- 'code,description
1,"x
2,-
3,y"'
path <- tempfile()
writeLines(lines, path)
# Allows quoted string to span multiple lines
read_csv(path, col_types = list())
#> # A tibble: 1 × 2
#>    code description
#>   <dbl> <chr>
#> 1     1 "x\n2,-\n3,y"
# OK: explicit quote works
read_csv(path, quote = "", col_types = list())
#> # A tibble: 3 × 2
#>    code description
#>   <dbl> <chr>
#> 1     1 "\"x"
#> 2     2 "-"
#> 3     3 "y\""
# warns & treats lines as quoted
data.table::fread(path, quote = "\"")
#> Warning in data.table::fread(path, quote = "\""): Found and resolved improper
#> quoting in first 100 rows. If the fields are not quoted (e.g. field separator
#> does not appear within any field), try quote="" to avoid this warning.
#>    code description
#> 1:    1          "x
#> 2:    2           -
#> 3:    3          y"
Created on 2023-07-31 with reprex v2.0.2
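Until readr warns about this itself, one cheap safeguard is to compare the number of data lines in the file with the number of rows that were parsed. The following is only a sketch under a few assumptions: the file has a single header line, no blank lines, and no fields that are meant to span several lines (any of these would also produce a mismatch). The variable names `n_lines` and `n_rows` are illustrative only.

```r
library(readr)

# Same malformed file as in the reprex above
path <- tempfile()
writeLines(c("code,description", '1,"x', "2,-", '3,y"'), path)

# Number of data lines, excluding the header (base R; ignores quoting entirely)
n_lines <- length(readLines(path, warn = FALSE)) - 1L

# Number of rows readr actually parsed
n_rows <- nrow(read_csv(path, col_types = list()))

# Assumption: one record per line, so a mismatch suggests swallowed rows
if (n_rows != n_lines) {
  warning(
    "Parsed ", n_rows, " row(s) from ", n_lines,
    " data line(s); check for stray quotes or set quote = \"\"."
  )
}
```

For files where quoted fields legitimately contain newlines, this check will raise false alarms, so it is best treated as a prompt to inspect the file rather than as proof of a parsing error.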
Hi, I was recently reading a table into R using the readr package but discovered that several rows were missing because some cells contained a single double quotation mark.
I managed to resolve this by setting quote = "". However, I wonder if read_tsv() and related functions could be updated to at least raise a warning when improper quoting is discovered in the data? The data.table package both raises a warning and deals with this automatically, as per the following reprex.
Many thanks for considering!
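In the meantime, a rough pre-check can be done in base R by scanning for lines that contain an odd number of double quotes. This is a heuristic sketch, not something readr provides: `find_unbalanced_quotes()` is a hypothetical helper name, and it will also flag lines that legitimately open or close a multi-line quoted field.

```r
# Hypothetical helper (not part of readr): return the line numbers that
# contain an odd number of double quotes, which often indicates a stray quote.
find_unbalanced_quotes <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # Count literal double-quote characters on each line
  n_quotes <- lengths(regmatches(lines, gregexpr('"', lines, fixed = TRUE)))
  which(n_quotes %% 2 == 1)
}

# Example file with stray quotes on the second and fourth lines
path <- tempfile()
writeLines(c("code,description", '1,"x', "2,-", '3,y"'), path)

bad <- find_unbalanced_quotes(path)
if (length(bad) > 0) {
  warning("Possible improper quoting on line(s): ", paste(bad, collapse = ", "))
}
```

If the flagged lines turn out to hold literal quote characters rather than quoted fields, reading the file with quote = "" as described above avoids the silent row loss.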