vincentlaucsb / csv-parser

A high-performance, fully-featured CSV parser and serializer for modern C++.
MIT License

Stats only process first 5000 rows #155

Closed: TobyEalden closed this issue 3 years ago

TobyEalden commented 3 years ago

Firstly, thanks for the work on this repo!

I've noticed an issue with the stats calculation, in that it only seems to process the first chunk (5000 rows).

There are several reasons for this. Firstly, there is no break here, and reader.eof() (see the loop termination check) returns true as soon as the parser has processed the entire file, even if the reader hasn't yet iterated through all of the rows.
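
To illustrate the kind of failure I mean, here is a minimal, self-contained sketch. It does not use the csv-parser API: `ParseAheadSource`, `take()`, and both counting functions are hypothetical names, and I'm assuming a model in which the parser finishes the whole file before the consumer has drained the parsed rows.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a chunked reader (NOT the csv-parser API).
// Assumption: the parser runs ahead and finishes the whole input on first use.
class ParseAheadSource {
public:
    explicit ParseAheadSource(std::size_t total_rows) : total_(total_rows) {}

    // True once the *parser* is done, even if parsed rows are still queued.
    bool eof() const { return parsed_all_; }

    // Hand out up to max_rows already-parsed rows.
    std::vector<int> take(std::size_t max_rows) {
        if (!parsed_all_) {
            for (std::size_t i = 0; i < total_; ++i) queue_.push_back(static_cast<int>(i));
            parsed_all_ = true;
        }
        std::vector<int> out;
        while (!queue_.empty() && out.size() < max_rows) {
            out.push_back(queue_.front());
            queue_.pop_front();
        }
        return out;
    }

private:
    std::size_t total_;
    std::deque<int> queue_;   // rows parsed but not yet consumed
    bool parsed_all_ = false;
};

// Terminates on eof(): stops after the first chunk because parsing is already done.
std::size_t count_until_eof(ParseAheadSource src, std::size_t chunk) {
    assert(chunk > 0);
    std::size_t seen = 0;
    do {
        seen += src.take(chunk).size();
    } while (!src.eof());
    return seen;
}

// Terminates only when the consumer has nothing left to read.
std::size_t count_until_drained(ParseAheadSource src, std::size_t chunk) {
    assert(chunk > 0);
    std::size_t seen = 0;
    for (;;) {
        std::vector<int> rows = src.take(chunk);
        if (rows.empty()) break;
        seen += rows.size();
    }
    return seen;
}
```

With 12000 rows and a chunk size of 5000, the `eof()`-driven loop in this sketch sees 5000 rows while the draining loop sees all 12000. The real code may differ in detail, but the termination condition has to reflect what the consumer has read, not what the parser has finished.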

There's also a problem with how the counters are created: even if the loop did work, it would create the counters afresh on every pass rather than once up front.
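
A small sketch of why that matters, again with hypothetical names (`RunningMean`, `Chunk`, and the two functions are mine, not the library's): if the accumulator is constructed inside the chunk loop, whatever earlier chunks contributed is thrown away.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical incremental mean accumulator (illustration only).
struct RunningMean {
    double mean = 0.0;
    std::size_t n = 0;
    void add(double x) {
        ++n;
        mean += (x - mean) / static_cast<double>(n);
    }
};

using Chunk = std::vector<double>;

// Accumulator re-created per chunk: the result reflects only the last chunk.
double mean_counters_reset_per_chunk(const std::vector<Chunk>& chunks) {
    RunningMean last;
    for (const Chunk& chunk : chunks) {
        RunningMean acc;  // created on every pass, so earlier chunks are forgotten
        for (double x : chunk) acc.add(x);
        last = acc;
    }
    return last.mean;
}

// Accumulator created once, before the loop: the result covers every chunk.
double mean_counters_created_once(const std::vector<Chunk>& chunks) {
    RunningMean acc;
    for (const Chunk& chunk : chunks) {
        for (double x : chunk) acc.add(x);
    }
    assert(acc.n > 0);  // a mean over zero rows is not meaningful
    return acc.mean;
}
```

For the chunks `{1, 3}` and `{5, 7}`, the first function returns 6 (the mean of the last chunk only) and the second returns 4 (the mean of all four values).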

Lastly, the test passes because it only checks that the mean age in persons.csv is 42. If you look at that file, the mean age reaches 42 after about 100 rows and stays around there; presumably the data was auto-generated around that value. A check on the mean alone therefore cannot tell whether all rows were processed or only the first chunk.
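
To show why a mean-only assertion can't catch this, here is a sketch using synthetic data. The ages below simply alternate between 41 and 43 and are not the contents of persons.csv; `make_ages`, `Summary`, and `summarize_first` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Synthetic ages alternating 41, 43, ... (NOT the real persons.csv data).
std::vector<int> make_ages(std::size_t n) {
    std::vector<int> ages(n);
    for (std::size_t i = 0; i < n; ++i) ages[i] = (i % 2 == 0) ? 41 : 43;
    return ages;
}

struct Summary {
    std::size_t rows;  // how many rows contributed to the mean
    double mean;
};

// Summarize only the first `limit` rows, mimicking a truncated stats pass.
Summary summarize_first(const std::vector<int>& ages, std::size_t limit) {
    assert(limit <= ages.size());
    const double sum = std::accumulate(
        ages.begin(), ages.begin() + static_cast<std::ptrdiff_t>(limit), 0.0);
    return {limit, limit ? sum / static_cast<double>(limit) : 0.0};
}
```

Summarizing the first 5000 of 30000 such rows gives a mean of 42, exactly the same as summarizing all 30000, so only the `rows` field distinguishes the two. A test that also asserts on the number of rows processed, or that uses data whose mean shifts after the first chunk, would have failed here.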