Closed jkiller295 closed 1 month ago
I believe you need to set the Encoding on your StreamReader. CsvHelper knows nothing about that.
Hi Josh, thanks for the prompt reply. What if the file encoding is not Unicode? Does CsvHelper have a function to convert the file's Encoding (maybe a feature request if it does not currently have one)?
You would need to figure out the encoding being used and set it on the StreamReader.
For example, you could try iso-8859-1.
using (var reader = new StreamReader("Test.csv", Encoding.GetEncoding("iso-8859-1")))
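A fuller sketch of the suggestion above, assuming the file really is ISO-8859-1 encoded and has a header row. `Test.csv` is the file name from the snippet above; loading through `CsvDataReader` into a `DataTable` is one way to do it and is not necessarily how the original poster loaded their data.

```csharp
using System.Data;
using System.Globalization;
using System.IO;
using System.Text;
using CsvHelper;

// Assumption: Test.csv is ISO-8859-1 (Latin-1) encoded and has a header row.
using (var reader = new StreamReader("Test.csv", Encoding.GetEncoding("iso-8859-1")))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
using (var dataReader = new CsvDataReader(csv))
{
    // The StreamReader has already decoded the bytes, so CsvHelper only sees text.
    var table = new DataTable();
    table.Load(dataReader);
}
```

The encoding is chosen entirely on the `StreamReader`; nothing on the CsvHelper side changes.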
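// Detects a file's encoding from its byte order mark (BOM).
// Needs System.IO, System.Linq and System.Text; the method name is illustrative.
public static Encoding GetEncoding(string filePath)
{
    // Encodings that define a BOM, longest preamble first so that
    // UTF-32 LE (FF FE 00 00) is tried before UTF-16 LE (FF FE).
    var encodings = Encoding.GetEncodings()
        .Select(e => e.GetEncoding())
        .Select(e => new
        {
            Encoding = e,
            Preamble = e.GetPreamble()
        })
        .Where(e => e.Preamble.Any())
        .OrderByDescending(e => e.Preamble.Length)
        .ToArray();
    int maxPreambleLength = encodings.Max(e => e.Preamble.Length);
    byte[] buffer = new byte[maxPreambleLength];
    int bytesRead;
    using (FileStream stream = File.OpenRead(filePath))
    {
        bytesRead = stream.Read(buffer, 0, buffer.Length);
    }
    // Falls back to Encoding.Default when the file has no BOM.
    return encodings
        .Where(enc => enc.Preamble.Length <= bytesRead
            && enc.Preamble.SequenceEqual(buffer.Take(enc.Preamble.Length)))
        .Select(enc => enc.Encoding)
        .FirstOrDefault() ?? Encoding.Default;
}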
I found this piece of code that can get the Encoding of a file. It would be nice if CsvHelper had a built-in function to get the CSV's encoding.
It looks like there are already libraries that do this. https://github.com/errepi/ude
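For files that do carry a byte order mark, `StreamReader` can do the detection itself. The sketch below assumes that files without a BOM are ISO-8859-1; that fallback is a guess to adjust for your own data, since a BOM only identifies Unicode encodings and cannot tell one single-byte code page from another.

```csharp
using System;
using System.IO;
using System.Text;

// Assumption: files without a BOM are ISO-8859-1; change the fallback to match your data.
var fallback = Encoding.GetEncoding("iso-8859-1");

using (var reader = new StreamReader("Test.csv", fallback, detectEncodingFromByteOrderMarks: true))
{
    // CurrentEncoding is only updated after the first read, so force one.
    reader.Peek();
    Console.WriteLine(reader.CurrentEncoding.WebName);

    // From here, pass `reader` to CsvReader as usual.
}
```

For files with no BOM at all, a statistical detector such as the library linked above is the kind of tool that would be needed instead.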
Try using this:
using (var reader = new StreamReader("Test.csv", new UTF8Encoding(true)))
I had the same problem when writing to a CSV file, so I tried using encoding in the StreamWriter like this:
string filePath = $"{config.FileName}.csv";
using (var writer = new StreamWriter(filePath, false, new UTF8Encoding(true)))
using (var csv = new CsvWriter(writer, new CsvConfiguration(CultureInfo.InvariantCulture)))
{
    // Write header
    var header = CsvServices.GenerateCsvHeader(config);
    csv.WriteField(header);
    csv.NextRecord();
    // Write each line of the CSV
    foreach (var line in CsvServices.GenerateCsvLines(config))
    {
        csv.WriteField(line);
        csv.NextRecord();
    }
}
@babisque That will only work if the input file is Unicode encoded. Anyway, I'm closing this issue since it's not a bug on CsvHelper's side.
Describe the bug
Given this data
Côte d'Ivoire,São Tomé and Príncipe
After loading it to a DataTable object, the values become
C�te d'Ivoire,S�o Tom� and Pr�ncipe
To Reproduce
Expected behavior
The text with accents should be parsed as it comes from the input file.
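The garbling described above is consistent with single-byte text being decoded as UTF-8. The sketch below reproduces it in memory, assuming the input was ISO-8859-1 (the thread does not confirm the actual encoding of the file); no CSV file or CsvHelper call is involved.

```csharp
using System;
using System.Text;

// "Côte d'Ivoire", written with an escape so the source file's own encoding does not matter.
string original = "C\u00F4te d'Ivoire";

// Assumption: the file was saved as ISO-8859-1, one byte per character.
Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
byte[] fileBytes = latin1.GetBytes(original);

// Wrong decoder: byte 0xF4 followed by 't' is not a valid UTF-8 sequence,
// so the decoder substitutes U+FFFD, the replacement character.
string garbled = Encoding.UTF8.GetString(fileBytes);

// Right decoder: the same bytes read with the encoding they were written in.
string restored = latin1.GetString(fileBytes);

Console.WriteLine(garbled.IndexOf('\uFFFD') >= 0); // True
Console.WriteLine(restored == original);           // True
```

Once the replacement character has been substituted, the original byte is gone, so the fix has to happen where the bytes are decoded, on the `StreamReader`.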