FasterXML / jackson-dataformats-text

Uber-project for (some) standard Jackson textual format backends: csv, properties, yaml (xml to be added in future)
Apache License 2.0

CSV: FAIL_ON_MISSING_COLUMNS does not work as expected when combined with withColumnReordering(true) #273

Open ivangreene opened 3 years ago

ivangreene commented 3 years ago

The CSV feature CsvParser.Feature.FAIL_ON_MISSING_COLUMNS does not work as expected when using a typed CsvSchema with withColumnReordering(true). If a column appears in the POJO but not in the file, no failure occurs. This seems to happen because withColumnReordering(true) completely rebuilds the columns from the file header, while the FAIL_ON_MISSING_COLUMNS check runs only when individual rows of data are read, and it checks only against the rebuilt columns (those from the file header), not against the POJO's columns.
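To make the suspected mechanism concrete, here is a toy model in plain Java. It is a sketch of the behaviour described above, not Jackson's actual code: the class ReorderingModel and its methods are hypothetical names, and the header is split naively on commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these names are Jackson APIs.
final class ReorderingModel {

    // Columns the reader works with once reordering has rebuilt the schema.
    // As described above, they come from the file header alone.
    static List<String> rebuiltColumns(String headerLine) {
        List<String> columns = new ArrayList<>();
        for (String name : headerLine.split(",")) {
            columns.add(name.trim());
        }
        return columns;
    }

    // Stand-in for a per-row check: a row "misses columns" only if it has
    // fewer values than the active column list.
    static boolean rowPassesCheck(List<String> activeColumns, String rowLine) {
        return rowLine.split(",", -1).length >= activeColumns.size();
    }

    public static void main(String[] args) {
        // The POJO declares three columns, but the file header has only two.
        List<String> pojoColumns = Arrays.asList("name", "age", "email");
        List<String> active = rebuiltColumns("name,age");

        System.out.println("POJO columns:   " + pojoColumns);
        System.out.println("Active columns: " + active);
        // The row is compared with the rebuilt list, so "email" is never missed.
        System.out.println("Row passes:     " + rowPassesCheck(active, "Alice,30"));
    }
}
```

In this model the POJO's column list is never consulted after the rebuild, which is why a per-row check alone cannot notice that a POJO column is absent from the file.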

Relevant lines:

WellingR commented 3 years ago

I ran into this issue today. In my case reorderColumns was false. In this case no validation is done, and parsing only fails if there actually is data. This means that a one-line CSV file is considered empty (but valid) even if the header is not valid at all.

It would be good if the existence of the headers were validated when CsvParser.Feature.FAIL_ON_MISSING_COLUMNS is enabled (with and without column reordering).
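As a rough illustration of the kind of header validation meant here, the sketch below compares the expected column names with the first line of the file before any rows are read. It uses only the Java standard library. HeaderCheck and missingColumns are hypothetical names, not part of Jackson, and the naive comma split assumes a header without quoted or escaped names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Jackson API.
final class HeaderCheck {

    // Returns the expected column names that do not appear in the header line,
    // in the order they were declared. An empty result means the header is complete.
    static Set<String> missingColumns(List<String> expected, String headerLine) {
        Set<String> missing = new LinkedHashSet<>(expected);
        for (String name : headerLine.split(",")) {
            missing.remove(name.trim());
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("name", "age", "email");
        // Header-only file: there are no data rows, but the check still runs.
        Set<String> missing = missingColumns(expected, "email,name");
        if (!missing.isEmpty()) {
            System.out.println("Missing columns: " + missing);
        }
    }
}
```

Because the check looks only at the header, it behaves the same for a file with no data rows and does not depend on column order, so it covers both the reordering and non-reordering cases.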

cowtowncoder commented 3 years ago

What would help here is a test case that shows the details. I think I understand @WellingR's case, but it seems different from @ivangreene's original case. It might also be necessary to file a separate issue for @WellingR's case if it only concerns "header only" content, together with an example of what would constitute an invalid header (come to think of it, I also don't know exactly what that part means).