Open jzabroski opened 1 week ago
The idea is, you should never have a null
record. The issue here is ReadAsync
should be returning false
.
Does this sound reasonable?
As far as the ReadHeader
thing, that code hasn't changed in years. You should be calling Read
before ReadHeader
, but there could probably be a better exception message.
That would work for me. Can you push a fix today
Working on it right now.
I have a dilemma.
The way things work is row by row. That means you call Read
every time you want a new row. You can then act on the given row. If you have a header record, you're supposed to call ReadHeader
, which reads the current row as headers. Then you call you Read
again to get to the first record. This allows the user to handle each row however they like. When using GetRecords
, everything is handled for you.
The issue here is that GetRecord
is calling ReadHeader
for you if you haven't called it yet. That was a convenience for people, but presents a problem here, because there might not be another row. Returning null
is a bad way to handle that. When enabling nullable
it becomes a pain to handle null records when there is almost no chance of getting one.
I think the correct solution here is to have GetRecord
stop calling ReadHeader
and Read
for you. This may be a little annoying, but it makes things consistent.
csv.Read();
csv.ReadHeader();
while (csv.Read())
{
var record = csv.GetRecord<Foo>();
}
The other option is to have Read
call ReadHeader
and read a second time, returning false
when there isn't another record. I don't like this idea because it doesn't allow the user to do any other inspection on the header record and automatically goes past it. If you're using Read
instead of GetRecords
, you've opted into a more manual way of reading.
What do you think?
I think if CsvConfiguration specifies there is a header record, it should do it for you.
ReadHeader should add a remark that it should only be used for highly customized or niche scenarios, and is primarily intended for internal use when CsvConfiguration.HasHeaderRecord is true.
Otherwise, what is the point of the CsvConfiguration.HasHeaderRecord property?
what is the point of the CsvConfiguration.HasHeaderRecord property?
Only used for GetRecords
.
what is the point of the CsvConfiguration.HasHeaderRecord property?
Only used for
GetRecords
.
Then ReadHeader should add a remarks that it requires CsvConfiguration.HasHeaderRecord to be true, but allows the consumer to directly read.
And yes, your Read fix should work.
In my use case, we read and validate one row at a time with FluentValidation, so that we can parse a batch of records in really large CSV files, validate it, and save to a db only the good rows and report the bad rows. Otherwise I'm not certain we would need to read one row at a time.
A lot of financial data is zipped csv files, so we deflate, which is memory intensive, and to avoid running out of memory, as well as scaling db insert/update/delete operations, we process in batches of 1000.
Describe the bug I don't know why the code is now throwing an exception when it used to return null - the commit where this occurred just says it was done to remove nullability - but why when that is very useful. There is no clear use case for this behavior, and it causes an exception to be thrown on an empty file, which is highly non-desirable behavior.
In any event, the lack of migration guide since v30 to now is also making upgrading difficult and questionable. We're on v31 but given the other bug I filed, I'm not planning to upgrade further.
To Reproduce https://github.com/JoshClose/CsvHelper/commit/7dbb3fe710466ec2f573f81e19c7130ec1c7a365#diff-e1cbba681f102065fada3e91fecbfbc458195a01974189a3483f75e72137afdaL733-L736
Fails with the following:
Empty.csv in CsvHelperBug.csproj
SampleMap.cs
SampleLine.cs
CsvHelperBug/Test.cs
Additionally, the following fails with "No header record was found" (wtf?):
Expected behavior Roll this back! It doesn't make any sense in the perfectly valid scenario where there are no rows.
Screenshots n/a
Additional context n/a