FasterXML / jackson-dataformats-text

Uber-project for (some) standard Jackson textual format backends: csv, properties, yaml (xml to be added in future)
Apache License 2.0

(csv) Add `CsvGenerator.Feature` to fail if linefeeds included in quoted content #11

Open cowtowncoder opened 7 years ago

cowtowncoder commented 7 years ago

(from https://github.com/FasterXML/jackson-dataformat-csv/issues/118 by @balbusm)

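// Assumes `file` is the java.io.File holding the CSV input
CsvMapper mapper = new CsvMapper();
// Use the first line as the header row
CsvSchema schema = CsvSchema.emptySchema().withHeader();
FileReader fileReader = new FileReader(file);
MappingIterator<Map<String,String>> it = mapper.readerFor(Map.class)
  .with(schema)
  .readValues(fileReader);
while (it.hasNext()) {
  try {
    // Each row is exposed as a column-name -> value map
    Map<String,String> rowAsMap = it.next();
    System.out.println(rowAsMap);
  } catch (Exception e) {
    // Report the bad row and try to continue with the next one
    System.out.println(e);
  }
}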

Example input:

1,Head2,Head3,Head4,"Head5
2,Head2,Head3,Head4,Head5
3,Head2,Head3,Head4,Head5

cowtowncoder commented 7 years ago

Hmmh. Unfortunately I am not sure this is possible: since linefeeds are acceptable within quoted content, the unterminated last value on line 1 would swallow lines 2 and 3 as well. So although the parser should be able to recover from a missing end quote, that would not help much, because the single logical line that exists would be skipped.
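
To illustrate the point, here is a minimal plain-Java sketch (not Jackson's actual parser; the class and method names are made up for this example) that splits CSV text into logical records while allowing linefeeds inside quotes. With the example input, the unterminated quote on line 1 causes all three physical lines to collapse into one logical record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it only mimics the quoting rule, not Jackson's parser.
class QuotedLinefeedDemo {
    /** Splits CSV text into logical records; linefeeds inside quotes stay in the record. */
    static List<String> splitRecords(String csv) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, so the state is preserved
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == '\n' && !inQuotes) {
                // Only an unquoted linefeed ends a record
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString()); // trailing (possibly unterminated) record
        }
        return records;
    }

    public static void main(String[] args) {
        String broken = "1,Head2,Head3,Head4,\"Head5\n"
                      + "2,Head2,Head3,Head4,Head5\n"
                      + "3,Head2,Head3,Head4,Head5\n";
        System.out.println(splitRecords(broken).size()); // prints 1

        String closed = broken.replace("\"Head5\n", "\"Head5\"\n");
        System.out.println(splitRecords(closed).size()); // prints 3
    }
}
```

Once the closing quote is present, the same input yields three records again, which is why recovering from the missing quote alone cannot bring back lines 2 and 3.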

One possibility here would be the addition of a CSV read feature, or possibly a CsvSchema property, to enable or disable linefeeds within quoted content.

Another question is what to do with such linefeeds: convert them to spaces, drop them, or throw an exception. I assume the first (spaces) or the last (exception) are the most likely choices.
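
The three options could look roughly like this. This is only a sketch in plain Java: `LinefeedPolicy` and `apply` are hypothetical names invented for illustration and are not part of Jackson's API.

```java
// Hypothetical sketch of the three ways to treat linefeeds found in quoted content.
class LinefeedPolicyDemo {
    enum LinefeedPolicy { TO_SPACE, DROP, FAIL }

    /** Applies the chosen policy to one already-unquoted cell value. */
    static String apply(String value, LinefeedPolicy policy) {
        if (value.indexOf('\n') < 0 && value.indexOf('\r') < 0) {
            return value; // nothing to do
        }
        switch (policy) {
            case TO_SPACE:
                // Treat \r\n as a single linefeed so it becomes one space
                return value.replaceAll("\\r\\n|\\r|\\n", " ");
            case DROP:
                return value.replaceAll("[\\r\\n]", "");
            case FAIL:
            default:
                throw new IllegalArgumentException("Linefeed found in quoted content");
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("Head5\nmore", LinefeedPolicy.TO_SPACE)); // prints "Head5 more"
        System.out.println(apply("Head5\nmore", LinefeedPolicy.DROP));     // prints "Head5more"
        try {
            apply("Head5\nmore", LinefeedPolicy.FAIL);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Of the three, the exception variant is the only one that would let a reader stop at the first physical line of a broken row instead of consuming the rest of the file.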