bcalco closed 10 months ago
I understand the request, but it's not reasonable to support. If your data is malformed, then that's the problem you should fix. It being malformed makes it impossible for xsv to choose a correct interpretation in every case, and exposing options to control how different classes of malformed data are interpreted is not something I'm interested in doing.
> although allowing processing of the data
This is the goal that xsv has.
The CSV parser author told me the same thing. lol.
The issue is, I don't own the data - I'm consuming third party data. So I have to find a way to scrub it. But I understand your position.
If I run into a case where the changes xsv made render the data more broken, or introduce a new error, then I'll file a new ticket.
Thanks for the prompt reply, anyway!
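For anyone else stuck scrubbing third-party data before it reaches a conformant parser, here is a minimal sketch of one possible pre-processing pass. It is not part of xsv, and `repair_line` is a hypothetical helper name. It rests on assumptions that may not hold for your data: one record per line (no embedded newlines in fields), a quote opens a field only at the start of a field, and a quote closes a field only when it is followed by the delimiter or the end of the line. Any other quote inside a quoted field is treated as literal and doubled. A stray quote that happens to sit right before a delimiter is still ambiguous, and this heuristic will get it wrong.

```python
def repair_line(line: str, delimiter: str = ",") -> str:
    """Heuristically escape stray quotes inside quoted fields of one CSV line."""
    out = []
    in_quotes = False
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch != '"':
            out.append(ch)
            i += 1
            continue
        if not in_quotes:
            # Assumption: a quote opens a quoted field only at the start of a field.
            if i == 0 or line[i - 1] == delimiter:
                in_quotes = True
            out.append(ch)
            i += 1
            continue
        # Inside a quoted field: measure the run of consecutive quotes.
        j = i
        while j < n and line[j] == '"':
            j += 1
        run = j - i
        # Assumption: the field closes only before a delimiter or the line end.
        closes_field = j == n or line[j] in (delimiter, "\r", "\n")
        literal = run - 1 if closes_field else run
        # An even count is taken as already-escaped pairs; an odd count has
        # one stray quote, so pad it to an even count.
        out.append('"' * (literal + literal % 2))
        if closes_field:
            out.append('"')
            in_quotes = False
        i = j
    return "".join(out)


print(repair_line('"Choices  "contact us" email address"'))
```

Lines that are already well-formed pass through unchanged under these assumptions, so the pass can be run over a whole file line by line before handing it to xsv or another parser.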
Running the 'input' command on a CSV with malformed quoted strings fixes them enough that they can be processed, but it modifies the data inappropriately.
For example, the following problematic column value in one of our test files:
"Choices  "contact us" email address"
Note: the two spaces between "Choices" and "contact us" are in the original data.
Gets changed to:
"Choices  contact us"" email address"""
But it should be:
"Choices  ""contact us"" email address"
The command being run is:
xsv input <malformed-file> -o <target-file>
This is a very consistent error: although allowing processing of the data (i.e. conformant parsers now accept the files), it subtly (and unacceptably) changes the data in the process.
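To make the difference concrete, here is a small sketch that feeds both forms to a conformant parser and compares the decoded values. It uses Python's standard `csv` module purely as a stand-in for "a conformant parser"; `parse_single_field` is a hypothetical helper, and the two strings are the values quoted above with the two spaces kept.

```python
import csv
import io


def parse_single_field(line: str) -> str:
    """Parse one CSV line and return its first field as decoded by the parser."""
    rows = list(csv.reader(io.StringIO(line)))
    return rows[0][0]


xsv_output = '"Choices  contact us"" email address"""'
expected = '"Choices  ""contact us"" email address"'

# Both lines are accepted, but they decode to different field values.
print(parse_single_field(xsv_output))  # Choices  contact us" email address"
print(parse_single_field(expected))    # Choices  "contact us" email address
```

Both lines parse without error, which is why the change is easy to miss: the quote before "contact" is dropped and an extra one is added at the end of the value.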