MikeStall / DataTable

Class library for working with tabular data, especially CSV files
Apache License 2.0
150 stars 63 forks source link

Auto Escape All Values #38

Closed VagyokC4 closed 9 years ago

VagyokC4 commented 9 years ago

Ensures output CSV can be read back in without any issue.

MikeStall commented 9 years ago

My one concern here is that this pretty invasive - it will quote all serialization and break existing clients. ABC becomes "ABC".
The current logic only quotes if it includes a comma: AB,C becomes "AB,C". It misses the case of embedded quotes as you noted. Can you have it only escape the quotes if the value includes a quote? That greatly reduces the chance of breaking an existing customer.

VagyokC4 commented 9 years ago

Since this is CSV (comma separated values) wouldn't the comma in [AB,C] naturally mean to resolve to cols [AB],[C]. The quotes are what tells the CSV this is all in one column "AB,C" -> [AB,C].

With that said, does this logic actually break existing customers? In my experience, quoting every column has never introduced a break for me, in fact just the opposite, quoting every column has ensured readability (i.e. columns end up where they are supposed to) especially when it comes to other readers opening the file. i.e. Excel / Calc / Google Docs Sheets, etc.

VagyokC4 commented 9 years ago

I see that you accepted my merge request for escaping column data, but your source is still not Quoting all columns. Maybe my merge branch get out of sync because of my other pull request and/or my lack of git experience. I will create another pull request.

I'm seeing in your branch:

        private static string Escape(string s)
        {
            if (s == null)
                return string.Empty;

            if (s.IndexOf(',') >= 0)
            {
                return "\"" + s + "\"";
            }
            return s;
        }

I have in my branch:

        private static string Escape(string s)
        {
            if (s == null)
                return string.Empty;

            return string.Concat("\"", s.Replace("\"", "\"\""), "\"");
        }