Open asfimport opened 2 years ago
Nicola Crane / @thisisnic: Thanks for reporting this, [~gregleleu]. I don't think this is currently supported. I've opened ticket ARROW-16000 to ask for this to be implemented in C++; once it has been, we should be able to expose this functionality in R.
Nicola Crane / @thisisnic: I'm leaving this ticket as a bug for now: until there is functionality in C++ to allow this, we should provide users with better error messaging than there is at the moment.
Are there any updates on this?
I am trying to read a CSV dataset consisting of multiple files in the ISO-8859-1 encoding, but keep encountering the error "CSV conversion error to string: invalid UTF8 data", despite setting the encoding with:
arrow::open_delim_dataset(
  files,
  delim = ";",
  convert_options = arrow::csv_convert_options(decimal_point = ","),
  read_options = arrow::csv_read_options(encoding = "ISO-8859-1")
)
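One possible workaround until the dataset reader honours the encoding option is to re-encode the files to UTF-8 before opening them. The sketch below uses only base R. transcode_to_utf8() and the output directory are hypothetical names chosen for illustration, not part of arrow, and the sketch assumes each file fits in memory.

```r
# Hypothetical helper (not part of arrow): write a UTF-8 copy of each file
# into out_dir and return the paths of the copies.
transcode_to_utf8 <- function(files, out_dir, from = "ISO-8859-1") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  vapply(files, function(path) {
    # Read the file as raw bytes so that line endings are preserved as-is.
    raw_bytes <- readBin(path, what = "raw", n = file.info(path)$size)
    # iconv() accepts a list of raw vectors and returns a character vector.
    text <- iconv(list(raw_bytes), from = from, to = "UTF-8")
    if (is.na(text)) {
      stop("Could not convert ", path, " from ", from, " to UTF-8")
    }
    out <- file.path(out_dir, basename(path))
    writeBin(charToRaw(text), out)
    out
  }, character(1), USE.NAMES = FALSE)
}

# Usage (assumes the `files` vector from the call above):
# utf8_files <- transcode_to_utf8(files, file.path(tempdir(), "utf8"))
# ds <- arrow::open_delim_dataset(
#   utf8_files,
#   delim = ";",
#   convert_options = arrow::csv_convert_options(decimal_point = ",")
# )
```

The copies are already UTF-8, so the read_options argument is no longer needed when opening them as a dataset.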
The encoding options are passed when a single file is read with read_delim_arrow, but not when opening a folder with open_dataset.
read_delim_arrow creates a reader using CsvTableReader$create (which is what is tested in the package's tests).
open_dataset creates a factory and I'm unable to follow what happens when $Finish() is called.
Also, the documentation ("CsvReadOptions" page) lists the "encoding" option under "CsvConvertOptions$create()" instead of "CsvReadOptions$create()".
Reporter: Gregoire Leleu
Note: This issue was originally created as ARROW-15992. Please see the migration documentation for further details.