kentcb / KBCsv

KBCsv is an efficient, easy to use .NET parsing and writing library for the CSV (comma-separated values) format.
MIT License

Unexpected behavior when setting CsvWriter.NewLine to something other than new line #16

Open ChaosBladeCoder opened 8 years ago

ChaosBladeCoder commented 8 years ago

There are some unexpected behaviors when setting CsvWriter.NewLine to something "crazy".

This value is clearly meant to be either \r\n or \n to support different OS newline conventions, but because it is a string it can be set to any value, for example || (two pipes). Many tools can both read and write such files (SQL Server's bcp.exe and SSIS, for example), although Excel does not seem to.

When I set CsvWriter.NewLine to be double pipes, two unexpected things happened:

It seems like the grammar rules from RFC 4180 have been generalised from "comma" and "double quote" to "any character" for the ValueSeparator and ValueDelimiter, but not for NewLine (which could be thought of as the "row delimiter").
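To make the generalisation concrete, here is a minimal sketch in Python of what treating NewLine as a third structural token might look like on the writing side. It is not KBCsv code: `needs_quoting` and `write_record` are hypothetical names, and the rule shown (quote a value if it contains the separator, the delimiter, or the configured newline string) is an assumption about the intended behaviour.

```python
def needs_quoting(value, separator=",", delimiter='"', new_line="\r\n"):
    """Return True if the value contains any token with structural meaning."""
    return any(token in value for token in (separator, delimiter, new_line))


def write_record(values, separator=",", delimiter='"', new_line="\r\n"):
    """Serialize one record, quoting values that contain a structural token.

    Hypothetical helper for illustration only; not part of KBCsv.
    """
    out = []
    for value in values:
        if needs_quoting(value, separator, delimiter, new_line):
            # Double any embedded delimiter, then wrap the value in delimiters.
            value = delimiter + value.replace(delimiter, delimiter * 2) + delimiter
        out.append(value)
    return separator.join(out) + new_line


# A value containing the custom row terminator gets quoted.
print(write_record(["a", "b||c"], new_line="||"))
```

With `new_line="||"`, the value `b||c` is wrapped in delimiters, so a reader that applies the same generalised rule could tell the embedded pipes apart from the row terminator.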

kentcb commented 8 years ago

Hmm, the problem is that the parser itself assumes CR, LF, or CRLF line breaks. The RFC only allows for CRLF.
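As a rough illustration of that assumption (a Python sketch, not the actual KBCsv parser, and ignoring quoted values entirely), a reader that only recognises CR, LF, and CRLF never sees `||` as a record boundary. The name `split_records` is hypothetical.

```python
import re

# Only CR, LF, and CRLF count as record terminators in this sketch.
# CRLF is listed first so it is consumed as a single break.
LINE_BREAK = re.compile(r"\r\n|\r|\n")


def split_records(text):
    """Split raw text into records on CR, LF, or CRLF; drop empty trailing pieces."""
    return [line for line in LINE_BREAK.split(text) if line]


print(split_records("a,b\r\nc,d\n"))  # two records
print(split_records("a,b||c,d||"))    # one record; the pipes are just data
```

Output written with a `||` terminator would come back from such a reader as a single record containing the pipes as ordinary characters.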

I think adding support for any newline string would be non-trivial and possibly not worth the effort. Leaving it open for now so I can think about it more.