Open techncl opened 3 weeks ago
apache-commons or univocity springs to mind
Doesn't that only happen on the initial quick check of the number of rows to process? I was pretty sure that line breaks within fields are handled correctly during the actual parsing and validation, but you do sometimes see a mismatch between the number of lines it says it has to process and the number it actually processes. For example, it might say there are 1,000 rows to process in the CSV file, but then finish by reporting 998 of 1,000 rows processed, because two rows had a line break within a field.
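To illustrate the kind of mismatch described here, the following is a minimal sketch in Python using the standard-library csv module. It is not CSV Validator's code, and the sample data and function names are made up for illustration. It only shows how a quick count of physical lines can disagree with the number of records a quote-aware parser actually yields.

```python
import csv
import io

# Hypothetical sample: a header plus two records. The first record has a
# line break inside a quoted field, so it spans two physical lines.
SAMPLE = 'id,note\r\n1,"first line\nsecond line"\r\n2,plain\r\n'


def naive_line_count(text: str) -> int:
    """Count physical lines, treating every line break as the end of a row."""
    return len(text.splitlines())


def record_count(text: str) -> int:
    """Count logical records using a parser that understands quoted fields."""
    # newline="" stops line-ending translation, so the csv module sees the
    # original \r and \n characters and can keep them inside quoted fields.
    reader = csv.reader(io.StringIO(text, newline=""))
    return sum(1 for _ in reader)


if __name__ == "__main__":
    print("physical lines:", naive_line_count(SAMPLE))
    print("parsed records:", record_count(SAMPLE))
```

If the progress total were taken from something like the first function and the processed count from something like the second, the two numbers would differ by one for every line break inside a field.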
Yes, I think CSV Validator can handle carriage returns within cells; it would just benefit from better reporting of line counts/numbers, as David says. I think it already uses Univocity for CSV processing.
We are treating all carriage return/newline characters (\r or \n) as the end of a row, even if they are inside a cell of a row; could we use a CSV library that handles carriage return/newline characters and/or properly quotes the cells for us?
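As a rough illustration of what a CSV library does here, the sketch below uses Python's standard-library csv module as a stand-in. It does not use the API of Apache Commons CSV or Univocity, and the row data and helper names are hypothetical. The writer quotes any cell that contains a delimiter, a quote character, or a line break, and the reader then restores the original cells instead of splitting the row.

```python
import csv
import io


def to_csv(rows):
    """Serialise rows, letting the library quote cells only where needed."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)
    return buf.getvalue()


def from_csv(text):
    """Parse CSV text back into rows, keeping line breaks inside cells."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical rows: one cell contains \r\n, another contains quotes and a comma.
ROWS = [
    ["id", "comment"],
    ["1", "line one\r\nline two"],
    ["2", 'says "hi", then leaves'],
]

if __name__ == "__main__":
    text = to_csv(ROWS)
    print(text)
    print(from_csv(text) == ROWS)
```

The point of the round trip is that the embedded \r\n survives as part of the cell, so the file still contains three rows rather than four.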