Closed VagyokC4 closed 9 years ago
My one concern here is that this pretty invasive - it will quote all serialization and break existing clients. ABC becomes "ABC".
The current logic only quotes if it includes a comma: AB,C becomes "AB,C". It misses the case of embedded quotes as you noted.
Can you have it only escape the quotes if the value includes a quote? That greatly reduces the chance of breaking an existing customer.
Since this is CSV (comma separated values) wouldn't the comma in [AB,C] naturally mean to resolve to cols [AB],[C]. The quotes are what tells the CSV this is all in one column "AB,C" -> [AB,C].
With that said, does this logic actually break existing customers? In my experience, quoting every column has never introduced a break for me, in fact just the opposite, quoting every column has ensured readability (i.e. columns end up where they are supposed to) especially when it comes to other readers opening the file. i.e. Excel / Calc / Google Docs Sheets, etc.
I see that you accepted my merge request for escaping column data, but your source is still not Quoting all columns. Maybe my merge branch get out of sync because of my other pull request and/or my lack of git experience. I will create another pull request.
I'm seeing in your branch:
private static string Escape(string s)
{
if (s == null)
return string.Empty;
if (s.IndexOf(',') >= 0)
{
return "\"" + s + "\"";
}
return s;
}
I have in my branch:
private static string Escape(string s)
{
if (s == null)
return string.Empty;
return string.Concat("\"", s.Replace("\"", "\"\""), "\"");
}
Ensures output CSV can be read back in without any issue.