Open Jinjinov opened 2 years ago
ShouldQuote
is only used when writing. DetectDelimiter
is only used when reading. There is something else going on for you.
Can you give an example that shows this? Here is what I have based on your reproduction steps.
void Main()
{
var s = new StringBuilder();
s.Append("Id\tName\r\n");
s.Append("1\tone\r\n");
s.Append("2\t,two\r\n");
s.Append("3\tthree\r\n");
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = "\t",
DetectDelimiter = true,
};
using (var reader = new StringReader(s.ToString()))
using (var csv = new CsvReader(reader, config))
{
csv.GetRecords<Foo>().ToList().Dump();
}
}
private class Foo
{
public int Id { get; set; }
public string Name { get; set; }
}
Here is what I have based on your reproduction steps.
You didn't follow my reproduction steps.
new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = "\t",
DetectDelimiter = true,
};
is not the same as new CsvConfiguration(CultureInfo.InvariantCulture){DetectDelimiter = true};
If I use Delimiter = "\t"
then it also works for me. But I don't know the delimiter in advance.
It has been 3 months since I was using this, so I don't remember the problem exactly.
Oops. Just remove the Delimiter = "\t"
line. Same result.
The next release that will be coming out probably today will have delimiter detection fixes. If these don't fix your issue, please post here and I can open this again.
I forgot what the original problem was, but I looked into the source code just now.
The problem is in public static bool ShouldQuote(ShouldQuoteArgs args)
https://github.com/JoshClose/CsvHelper/blob/master/src/CsvHelper/Configuration/ConfigurationFunctions.cs#L103
in line || (config.Delimiter.Length > 0 && args.Field.Contains(config.Delimiter)) // Contains delimiter
Because I used DetectDelimiter = true
, the detected delimiter was \t
, but line 103 does not use the detected delimiter, but the default delimiter config.Delimiter
- that causes a problem when writing the content back.
I don't remember what I was doing, but I guess I was trying to use a method that calls ShouldQuote
.
My guess would be that I was trying to use public CsvWriter(TextWriter writer, CsvConfiguration configuration)
that calls ShouldQuote
in shouldQuote = configuration.ShouldQuote;
So the actual problem comes when writing this - the output TSV is not identical to the input TSV.
I am sorry that I was not specific enough.
@JoshClose I updated the original comment.
Oh... You read the data then use that same config to write and are expecting the delimiter from the detection to be used.
The problem here is that IParserConfiguration
is passed into CsvParser
where the delimiter detection is called and it's readonly, so we need to think if a different way to handle this. CsvReader.Parser.Delimiter
is the updated value. After reading, you could set this value to the configuration before you pass it into the CsvWriter
constructor.
Here is an example.
void Main()
{
var s = new StringBuilder();
s.Append("Id;Name\r\n");
s.Append("1;one\r\n");
List<Foo> records;
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
DetectDelimiter = true,
};
using (var reader = new StringReader(s.ToString()))
using (var csv = new CsvReader(reader, config))
{
csv.Context.RegisterClassMap<FooMap>();
records = csv.GetRecords<Foo>().ToList();
config.Delimiter = csv.Parser.Delimiter;
}
using (var writer = new StringWriter())
using (var csv = new CsvWriter(writer, config))
{
csv.WriteRecords(records);
csv.Flush();
writer.Dump();
}
}
private class Foo
{
public int Id { get; set; }
public string Name { get; set; }
}
private class FooMap : ClassMap<Foo>
{
public FooMap()
{
Map(m => m.Id);
Map(m => m.Name);
}
}
I see. Thank you!
I remember now that I was trying to read a TSV and write it back the same way, without quotes and with automatically detected delimiter, but the generated TSV always had quotes for all the cells with a comma in the middle, even though the original didn't have quotes.
Perhaps reading a TSV and writing it back the same way could be explained in the Readme :)
It actually has nothing to do with a TSV. It will happen with any delimiter. Delimiter detection is the issue.
I believe this is the first time I've heard of someone needing to write a file in the same way it was ready. It's usually the other way around.
I'll mark this as documentation.
Describe the bug
ConfigurationFunctions.ShouldQuote
should not check forCsvConfiguration.Delimiter
whenCsvConfiguration.DetectDelimiter
istrue
To Reproduce Steps to reproduce the behavior:
CsvConfiguration config = new CsvConfiguration(CultureInfo.InvariantCulture){DetectDelimiter = true};
using (CsvReader csv = new CsvReader(streamReader, config))
CsvWriter
to write the content back to a TSV,ConfigurationFunctions.ShouldQuote
doesn't check for the detected delimiter\t
but the default,
Expected behavior
ConfigurationFunctions.ShouldQuote
should check for the detected delimiter\t
instead of the default,
so that the TSV generated withCsvWriter
is identical to the original TSV read withCsvReader
.