joelverhagen / NCsvPerf

A test bench for various .NET CSV parsing libraries
https://www.joelverhagen.com/blog/2020/12/fastest-net-csv-parsers
MIT License

add RecordParser #30

Closed leandromoh closed 3 years ago

leandromoh commented 3 years ago

Hi! I saw your post and would like to add my parser to the benchmark once this PR is finished.

joelverhagen commented 3 years ago

Hey @leandromoh, thanks for taking the time to open a PR. It looks like the CI is complaining about some missing types. Perhaps we're missing the package reference in the .csproj to https://www.nuget.org/packages/recordparser?
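For reference, a NuGet package reference in a .csproj generally has the shape below. This is only a sketch: the version shown is a placeholder, not the version used in this PR, and the surrounding project file contents are assumed.

```xml
<ItemGroup>
  <!-- Version is a placeholder (assumption); substitute the release you want to benchmark. -->
  <PackageReference Include="RecordParser" Version="x.y.z" />
</ItemGroup>
```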

leandromoh commented 3 years ago

> Perhaps we're missing the package reference in the .csproj

Yes, I was working with a local reference while the PR was in draft, to run some experiments.

It was a bit funny that your benchmark tests exactly the opposite scenario from the one my lib was designed for: here you want all column values as strings before parsing them to a specific type (e.g. int, DateTime, etc.). I designed RecordParser precisely to avoid unnecessary intermediate string allocations: if a field is a value type, such as int, I parse it without an intermediate string, going straight from ReadOnlySpan<byte> (from the file) to ReadOnlySpan<char> and finally to int. This avoids unnecessary GC pressure and memory allocations.
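To illustrate the general idea (this is a minimal sketch, not RecordParser's actual code), the snippet below decodes a UTF-8 line into a stack buffer and parses one column as an int without creating a string for the field. The class name `SpanParsingSketch`, the method `ParseIntField`, and the 256-character limit are hypothetical choices made for the example, and quoted fields are deliberately not handled.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class SpanParsingSketch
{
    // Hypothetical helper: decodes a UTF-8 line into a stack buffer and parses
    // the field at the zero-based column index as an int. No string is
    // allocated for the field. Quoted fields are NOT handled.
    public static int ParseIntField(ReadOnlySpan<byte> utf8Line, int columnIndex, char separator = ',')
    {
        const int MaxChars = 256; // assumption: short lines, for illustration only
        if (utf8Line.Length > MaxChars)
            throw new ArgumentException("Line is too long for this sketch.");

        // ReadOnlySpan<byte> -> ReadOnlySpan<char>, using stack memory only.
        Span<char> buffer = stackalloc char[MaxChars];
        int charCount = Encoding.UTF8.GetChars(utf8Line, buffer);
        ReadOnlySpan<char> line = buffer.Slice(0, charCount);

        // Skip the columns that precede the requested one.
        for (int i = 0; i < columnIndex; i++)
        {
            int sep = line.IndexOf(separator);
            if (sep < 0) throw new FormatException("Line has too few columns.");
            line = line.Slice(sep + 1);
        }

        // Slice out the field and parse it directly from the span.
        int end = line.IndexOf(separator);
        ReadOnlySpan<char> field = end < 0 ? line : line.Slice(0, end);
        return int.Parse(field, NumberStyles.Integer, CultureInfo.InvariantCulture);
    }
}
```

For example, `SpanParsingSketch.ParseIntField(Encoding.UTF8.GetBytes("abc,42,2020-12-01"), 1)` returns 42. A benchmark that requires every column as a string cannot benefit from this path, since each field must be materialized anyway.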

Fortunately, the lib had a good result even in this unanticipated scenario, at least on my PC.


BenchmarkDotNet=v0.13.0, OS=Windows 10.0.18363.1440 (1909/November2019Update/19H2)
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=5.0.104
  [Host]     : .NET 5.0.4 (5.0.421.11614), X64 RyuJIT
  DefaultJob : .NET 5.0.4 (5.0.421.11614), X64 RyuJIT

| Method | LineCount | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| RecordParser | 1000000 | 2.978 s | 0.0591 s | 0.1450 s | 2.936 s | 58000.0000 | 21000.0000 | 3000.0000 | 345 MB |
| CsvHelper | 1000000 | 3.546 s | 0.4463 s | 1.2293 s | 2.960 s | 44000.0000 | 17000.0000 | 3000.0000 | 261 MB |
| Cursively | 1000000 | 2.214 s | 0.0406 s | 0.0733 s | 2.191 s | 58000.0000 | 21000.0000 | 3000.0000 | 345 MB |
| Sylvan_Data_Csv | 1000000 | 2.911 s | 0.4912 s | 1.4252 s | 2.088 s | 44000.0000 | 17000.0000 | 3000.0000 | 261 MB |

leandromoh commented 3 years ago

@joelverhagen PR is ready for review