Open seralf opened 7 years ago
Hi, the CSVProcessor assumes an encoding other than UTF-8 when reading cells: CSVProcessor.java#L72
Here is a snippet:
for (String header : reader.getHeaders()) { row.put(new String(header.getBytes("iso8859-1"), UTF_8), reader.get(header)); }
I suggest reading the bytes as UTF-8 by default, and adding an "encoding" property with a default value (for example, again "UTF-8"), as suggested by the CSVW vocabulary itself: https://www.w3.org/ns/csvw#encoding
(The problem is somewhat similar to this one: https://github.com/RMLio/RML-LogicalSourceHandler/issues/1)
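To make the proposal concrete, here is a minimal sketch of what I mean, using only the JDK. The class and method names (`EncodingSketch`, `resolveCharset`, `readFirstLine`) are hypothetical and are not part of the existing CSVProcessor; the idea is only that an optional "encoding" value selects the charset used to decode the stream, falling back to UTF-8 when it is absent.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the actual CSVProcessor code.
class EncodingSketch {

    // Resolve the (assumed) "encoding" property; default to UTF-8 when unset.
    static Charset resolveCharset(String encoding) {
        if (encoding == null || encoding.isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        return Charset.forName(encoding);
    }

    // Decode the stream with the resolved charset and return the first line
    // (for a CSV file this would be the header row).
    static String readFirstLine(InputStream in, String encoding) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, resolveCharset(encoding)))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "citta,eta" with accented vowels, encoded as UTF-8 bytes.
        byte[] utf8 = "citt\u00e0,et\u00e0\n".getBytes(StandardCharsets.UTF_8);

        // No encoding given: decoded as UTF-8, accents survive.
        System.out.println(readFirstLine(new ByteArrayInputStream(utf8), null));

        // Same bytes decoded as ISO-8859-1: accents come out garbled.
        System.out.println(readFirstLine(new ByteArrayInputStream(utf8), "ISO-8859-1"));
    }
}
```

With something along these lines, the `new String(header.getBytes("iso8859-1"), UTF_8)` conversion in the snippet above should no longer be needed, since the text would already be decoded with the right charset when the file is opened.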