JoshClose / CsvHelper

Library to help reading and writing CSV files
http://joshclose.github.io/CsvHelper/

Reading writing dynamic csv causes the first line to drop #2205

Closed lanchapman closed 2 months ago

lanchapman commented 8 months ago

When reading and writing a CSV file as a dynamic object, if headers are enabled, the first line after the header is dropped.

  using var readstream = new StreamReader("input.csv");
  using var writestream = new StringWriter();
  using var csvreader = new CsvReader(readstream, CultureInfo.InvariantCulture);
  using var csvwriter = new CsvWriter(writestream, CultureInfo.InvariantCulture);
  var records = csvreader.GetRecords<dynamic>();
  csvwriter.WriteRecords(records);
  File.WriteAllText("output.csv", writestream.ToString());

input.csv

t1,t2,t3
1,2,3
4,5,6
7,8,9

The files input.csv and output.csv should be the same.

The output.csv shows

t1,t2,t3
4,5,6
7,8,9

AltruCoder commented 8 months ago

This is an issue with how CsvReader.GetRecords<T>() works. https://joshclose.github.io/CsvHelper/getting-started/

> The GetRecords<T> method will return an IEnumerable<T> that will yield records.

If you look at the private method bool WriteHeader<T>(IEnumerable<T> records) in CsvWriter, you will see that when T is a concrete class, it can build the header from the type alone with WriteHeader(recordType). When T is object (which is what dynamic compiles to) or a primitive, it has to inspect the first record instead, with WriteHeader(records.FirstOrDefault()). Unfortunately, that FirstOrDefault() consumes the first record yielded by CsvReader, and there is no way to go back to the beginning of the IEnumerable<dynamic>.

private bool WriteHeader<T>(IEnumerable<T> records)
{
    if (!hasHeaderRecord || hasHeaderBeenWritten)
    {
        return false;
    }

    var recordType = typeof(T);
    var isPrimitive = recordType.GetTypeInfo().IsPrimitive;
    if (!isPrimitive && recordType != typeof(object))
    {
        WriteHeader(recordType);
        return hasHeaderBeenWritten;
    }

    return WriteHeader(records.FirstOrDefault());
}
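
The effect can be reproduced without CsvHelper. The following is a minimal sketch, assuming a hypothetical ReadLines helper that stands in for any sequence backed by a forward-only reader; it is not CsvHelper code. FirstOrDefault() advances the underlying reader, so a second enumeration picks up at the second line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for a streaming source: every enumeration reads
// from the same TextReader, which only moves forward.
static IEnumerable<string> ReadLines(TextReader reader)
{
    for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
    {
        yield return line;
    }
}

var source = new StringReader("1,2,3\n4,5,6\n7,8,9");
var lines = ReadLines(source);

// Peeking consumes the first line from the underlying reader.
var first = lines.FirstOrDefault();

// A new enumeration continues where the reader stopped, not at the start.
var rest = lines.ToList();

Console.WriteLine(first);                   // 1,2,3
Console.WriteLine(string.Join(" | ", rest)); // 4,5,6 | 7,8,9
```

The first line is not lost inside the reader; it is simply never seen again by whoever enumerates the sequence a second time, which is what happens to the first data row in the report above.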

Calling ToList() on csvreader.GetRecords<dynamic>() solves the issue, because the writer can then enumerate the list again from the start after peeking at the first record. It does, however, mean pulling every record into memory before writing.

using (var readstream = new StringReader("t1,t2,t3\n1,2,3\n4,5,6\n7,8,9"))
using (var csvreader = new CsvReader(readstream, CultureInfo.InvariantCulture))
using (var csvwriter = new CsvWriter(Console.Out, CultureInfo.InvariantCulture))
{
    var records = csvreader.GetRecords<dynamic>().ToList();
    csvwriter.WriteRecords(records);
}

Unfortunately, I don't think there is a good way for @JoshClose to fix this issue without making major and likely unwelcome changes to how CsvReader works.

jzabroski commented 2 months ago

> Unfortunately, I don't think there is a good way for @JoshClose to fix this issue without making major and likely unwelcome changes to how CsvReader works.

I would throw an exception saying it's an invalid CsvHelper configuration.

JoshClose commented 2 months ago

This is actually fixed now because of https://github.com/JoshClose/CsvHelper/issues/2247.

void Main()
{
    var s = """
        t1,t2,t3
        1,2,3
        4,5,6
        7,8,9
        """;
    using var readstream = new StringReader(s);
    using var writestream = new StringWriter();
    using var csvreader = new CsvReader(readstream, CultureInfo.InvariantCulture);
    using var csvwriter = new CsvWriter(writestream, CultureInfo.InvariantCulture);
    var records = csvreader.GetRecords<dynamic>();
    csvwriter.WriteRecords(records);
    writestream.Dump(); // Dump() is a LINQPad extension; outside LINQPad, use Console.Write(writestream).
}

> Unfortunately, I don't think there is a good way for @JoshClose to fix this issue without making major and likely unwelcome changes to how CsvReader works.

I had to use an IEnumerator from the IEnumerable instead. It's a bit strange in the code, but actually cleaned up some things.
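
For readers wondering what the IEnumerator approach looks like in general, here is a rough sketch. It is not the actual CsvHelper change; ReadRecords and WriteRecords are hypothetical helpers, and records are modeled as arrays of name/value pairs purely for illustration. The idea is to take one enumerator, use its first record for the header, then write that same record and keep going with the same enumerator, so nothing is skipped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for a streaming record source that cannot be
// rewound once a record has been consumed.
static IEnumerable<KeyValuePair<string, string>[]> ReadRecords(TextReader reader)
{
    var names = (reader.ReadLine() ?? "").Split(',');
    for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
    {
        yield return names
            .Zip(line.Split(','), (name, value) => new KeyValuePair<string, string>(name, value))
            .ToArray();
    }
}

// Hypothetical writer: derives the header from the first record without losing it.
static void WriteRecords(IEnumerable<KeyValuePair<string, string>[]> records, TextWriter output)
{
    using var enumerator = records.GetEnumerator();
    if (!enumerator.MoveNext())
    {
        return; // Nothing to write.
    }

    // Use the first record for the column names...
    output.WriteLine(string.Join(",", enumerator.Current.Select(field => field.Key)));

    // ...then write that same record and continue with the same enumerator.
    do
    {
        output.WriteLine(string.Join(",", enumerator.Current.Select(field => field.Value)));
    }
    while (enumerator.MoveNext());
}

var buffer = new StringWriter();
WriteRecords(ReadRecords(new StringReader("t1,t2,t3\n1,2,3\n4,5,6\n7,8,9")), buffer);
Console.Write(buffer.ToString());
```

Because the sequence is enumerated exactly once, this keeps the streaming behavior and avoids the ToList() workaround's memory cost.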