ruby / csv

CSV Reading and Writing
https://ruby.github.io/csv/
BSD 2-Clause "Simplified" License

Improve error reporting for liberal parsing with quoted values #231

Closed NikolayRys closed 2 years ago

NikolayRys commented 2 years ago

When liberal_parsing is used with quoted values on a file that has mixed EOL terminators, this library produces a misleading error:

require "csv"

# Normal parsing, no quotes (makes sense)
CSV.parse("A\nB\r\n")
# CSV::MalformedCSVError: Unquoted fields do not allow new line <"\r\n"> in line 2.

# Normal parsing, with quotes (kind of makes sense: not super helpful, but technically correct)
CSV.parse("A\n\"B\"\r\n")
# CSV::MalformedCSVError: Any value after quoted field isn't allowed in line 2.

# Liberal parsing, no quotes (makes sense)
CSV.parse("A\nB\r\n", liberal_parsing: true)
# CSV::MalformedCSVError: Unquoted fields do not allow new line <"\r\n"> in line 2.

# Liberal parsing, with quotes (misleading)
CSV.parse("A\n\"B\"\r\n", liberal_parsing: true)
# CSV::MalformedCSVError: Any value after quoted field isn't allowed in line 2.

The last one is highly misleading because liberal parsing does allow values after a quoted field; that is its whole purpose. In addition, the message says nothing about the actual source of the problem: the wrong newline sequence. I personally wasted a lot of time debugging this issue after receiving a file with mixed newline sequences.

This PR adds a separate error message for this case.
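
For anyone hitting the same problem, a quick way to confirm that a file really has mixed newline sequences is to tally the terminators before handing it to the parser. The sketch below is only an illustration and is not part of this PR or of the csv gem: `line_ending_counts` and `mixed_line_endings?` are hypothetical helper names, and the scan is deliberately naive, so it also counts newlines that sit inside quoted fields.

```ruby
# Hypothetical helpers (not part of the csv gem): tally the line terminators
# found in a string so mixed endings can be spotted before parsing.
def line_ending_counts(text)
  # "\r\n" is listed first so it is matched as one terminator
  # rather than as a separate "\r" followed by "\n".
  text.scan(/\r\n|\r|\n/).tally
end

# True when more than one kind of terminator appears in the text.
def mixed_line_endings?(text)
  line_ending_counts(text).size > 1
end

sample = "A\n\"B\"\r\n"            # same input as the misleading case above
p line_ending_counts(sample)       # one "\n" and one "\r\n"
p mixed_line_endings?(sample)      # prints true
```

`Enumerable#tally` needs Ruby 2.7 or newer. If the check reports mixed endings, the input can be normalized or rejected with a clearer message before `CSV.parse` is called.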

NikolayRys commented 2 years ago

I will check the failing tests soon

NikolayRys commented 2 years ago

Please re-run the tests

NikolayRys commented 2 years ago

The feedback has been addressed 👍

kou commented 2 years ago

Thanks!