Closed JordanMarr closed 11 years ago
Shouldn't the value be "Filters: ""528924"""
in the CSV file? I might be confused on what the field value is here.
Yeah, looks like the log file is wrong to me.
So the consensus seems to be that the value should have been written out to the log as "Filters: ""528924""".
The question then becomes, should the onus be on the consumer of the CsvHelper library to add the extra double quotes, or should it be on the CsvHelper to detect that the value contains inner quotes and then ensure that they are formatted correctly?
My preference is that CsvHelper handles this transparently on behalf of the user, since it is a CSV specific implementation detail. If not by default, then maybe there could at least be a configuration option to take care of this automatically for the user.
Ah - it wasn't clear that the log file was written with CsvHelper. Yes, I'd expect that to be properly encoded when written...
Was the log file written with CsvHelper? If so, this looks like it would be a bug in the writer.
The reader is reading it properly. Try opening it up in Excel and you should see the same results. But if the writer generated that, there is a problem.
Yes, it was. I will see if I can reproduce it in a unit test.
Great, thanks. That would be very helpful.
I was able to recreate the problem in this unit test. Note that this is 1.12.1.41166. I can run it again using the latest version as well and then post results.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using MbUnit.Framework;
using System.IO;
namespace CsvHelperTests
{
[TestFixture]
public class CsvHelperTests
{
[Test]
public void ShouldPreserveInnerDoubleQuotes()
{
FileInfo csvFile = new FileInfo("csvfile.csv");
if (csvFile.Exists)
{
csvFile.Delete();
}
var config = PersonCsvMap.CreateConfig();
Person p = new Person
{
FullName = "John Doe",
Nickname = "John \"The Ladies Man\" Doe"
};
// Write CSV
using (var streamWriter = File.AppendText(csvFile.FullName))
using (var csvWriter = new CsvHelper.CsvWriter(streamWriter, config))
{
csvWriter.WriteRecord(p);
}
// Read CSV
var people = new List();
using (var streamReader = new System.IO.StreamReader(csvFile.FullName))
using (var csvReader = new CsvHelper.CsvReader(streamReader, config))
{
foreach (Person person in csvReader.GetRecords())
{
people.Add(person);
}
}
Assert.AreEqual(p.Nickname, people[0].Nickname);
}
}
public class Person
{
public string FullName { get; set; }
public string Nickname { get; set; }
}
public class PersonCsvMap : CsvHelper.Configuration.CsvClassMap
{
public PersonCsvMap()
{
Map(m => m.FullName);
Map(m => m.Nickname);
}
public static CsvHelper.Configuration.CsvConfiguration CreateConfig()
{
var config = new CsvHelper.Configuration.CsvConfiguration();
config.ClassMapping();
config.QuoteAllFields = true;
return config;
}
}
}
Awesome. I will try and get this fixed today and get a new NuGet version out.
I updated the test to use v1.14.0.39378 (added a call to WriteHeader
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using MbUnit.Framework;
using System.IO;
namespace CsvHelperTests
{
[TestFixture]
public class CsvHelperTests
{
[Test]
public void ShouldPreserveInnerDoubleQuotes()
{
FileInfo csvFile = new FileInfo("csvfile.csv");
if (csvFile.Exists)
{
csvFile.Delete();
}
var config = PersonCsvMap.CreateConfig();
Person p = new Person
{
FullName = "John Doe",
Nickname = "John \"The Ladies Man\" Doe"
};
// Write CSV
using (var streamWriter = File.AppendText(csvFile.FullName))
using (var csvWriter = new CsvHelper.CsvWriter(streamWriter, config))
{
csvWriter.WriteHeader();
csvWriter.WriteRecord(p);
}
// Read CSV
var people = new List();
using (var streamReader = new System.IO.StreamReader(csvFile.FullName))
using (var csvReader = new CsvHelper.CsvReader(streamReader, config))
{
foreach (Person person in csvReader.GetRecords())
{
people.Add(person);
}
}
Assert.AreEqual(p.Nickname, people[0].Nickname);
}
}
public class Person
{
public string FullName { get; set; }
public string Nickname { get; set; }
}
public class PersonCsvMap : CsvHelper.Configuration.CsvClassMap
{
public PersonCsvMap()
{
Map(m => m.FullName);
Map(m => m.Nickname);
}
public static CsvHelper.Configuration.CsvConfiguration CreateConfig()
{
var config = new CsvHelper.Configuration.CsvConfiguration();
config.HasHeaderRecord = true;
config.ClassMapping();
config.QuoteAllFields = true;
return config;
}
}
}
This issue also interests me.
I've noticed that different CSV parsers handle quotes differently. Some parsers use quotes as text delimiters, to allow for commas in the data. Such as the parser used by Microsoft Excel for example.
I have not tested this program yet,i've just recently began watching.
I'm curious to see how the program handles commas in the data. How does the program determine the difference between a data comma, and a delimiting comma.
On Sat, Mar 2, 2013 at 12:31 PM, Jordan Marr notifications@github.comwrote:
I updated the test to use v1.14.0.39378 (added a call to WriteHeader() and modified config file to HasHeaderRecord = true). It still produces the bug. Here is the updated test:
using System; using System.Collections.Generic; using System.Linq; using System.Text; using MbUnit.Framework; using System.IO;
namespace CsvHelperTests { [TestFixture] public class CsvHelperTests { [Test] public void ShouldPreserveInnerDoubleQuotes() { FileInfo csvFile = new FileInfo("csvfile.csv"); if (csvFile.Exists) { csvFile.Delete(); }
var config = PersonCsvMap.CreateConfig(); Person p = new Person { FullName = "John Doe", Nickname = "John \"The Ladies Man\" Doe" }; // Write CSV using (var streamWriter = File.AppendText(csvFile.FullName)) using (var csvWriter = new CsvHelper.CsvWriter(
streamWriter, config)) { csvWriter.WriteHeader(); csvWriter.WriteRecord(p); }
// Read CSV var people = new List(); using (var streamReader = new System.IO.StreamReader(csvFile.FullName)) using (var csvReader = new CsvHelper.CsvReader(streamReader, config)) { foreach (Person person in csvReader.GetRecords()) { people.Add(person); } } Assert.AreEqual(p.Nickname, people[0].Nickname); } } public class Person { public string FullName { get; set; } public string Nickname { get; set; } } public class PersonCsvMap : CsvHelper.Configuration.CsvClassMap { public PersonCsvMap() { Map(m => m.FullName); Map(m => m.Nickname); } public static CsvHelper.Configuration.CsvConfiguration CreateConfig() { var config = new CsvHelper.Configuration.CsvConfiguration(); config.HasHeaderRecord = true; config.ClassMapping(); config.QuoteAllFields = true; return config; } }
}
—
Reply to this email directly or view it on GitHubhttps://github.com/JoshClose/CsvHelper/issues/114#issuecomment-14332794 .
It follows RFC 4180 http://tools.ietf.org/html/rfc4180. Excel also does some extra things to parse malformed CSV files. I went through and made CsvHelper parse malformed CSVs like Excel does, so they should work the same.
I have some unit tests around this specifically. https://github.com/JoshClose/CsvHelper/blob/master/src/CsvHelper.Tests/CsvParserExcelCompatibleTests.cs
To be clear, this issue is now about the writer not the reader? Could do with having the title changed :smile:
Agreed! The title has been updated from reader to writer.
Fix committed to source. https://github.com/JoshClose/CsvHelper/commit/66fbc9c329cb93ab3e7942c9da82900dfb7ca1a7
Version 1.15.0 is on NuGet.
Excellent - thanks Josh, we just ran into this after you pushed the fix! Will update now. Perfect timing :wink:
Good! Sorry it took so long. I (my wife) just had a baby, then I got the flu right after, so I've been out of commission for while. :)
Congrats on the baby! (and the fix looks good on my end)
Thanks! And I'm glad this solved your issue.
I have the following column string value written out correctly in a log file: "Filters: "528294""
When GetRecords rebuilds my T objects, that property has a value of:
"Filters: 528294""
Note the missing double quote before 528294.
This bug exists with version 1.14.0.39378 and I know it also existed in version 1.12.1. I am using the following CsvConfiguration both when reading and writing:
var config = new CsvHelper.Configuration.CsvConfiguration(); config.Delimiter = "\t"; config.ClassMapping();
config.QuoteAllFields = true;