nietras closed 10 months ago
With .NET 8 and server GC perf is insane.
All with .NET 8 and server GC.
Good grief. I want a CsvHelper facade around Sep so that I can have the code completion goodness of CsvHelper with the Barry Allen Flash speed of Sep
Man, that would be nice. A lot of the features would have to go away though. I've been thinking of making a version that would take advantage of these speed improvements, but I would probably want to do it from scratch and go .NET 7/8 forward only. If only I had some free time.
Maybe we can do it together. I have often wanted "SpreadsheetHelper", too. It's too damn hard using libraries like Aspose.Cells. The code always looks like crap. I just want a tagless-final API for describing how to format objects, and let a serialization layer like Sep do the hard work.
It would be nice to have a more generic, nice-to-use API on top of other implementations. It could support a lot more than CSV files. Message me if you want to talk more about it.
Thanks, I definitely see Sep as a "low-level" fast API for CSV files and imagine others could use it as a building block for higher-level things. The API fits my needs and my work's needs. I don't need object mapping and generally don't consider it that important; you can code stuff like that in minutes, and it will be faster and more flexible that way anyway. 😊 With LLMs getting faster, I assume one can just ask an LLM to "map to type" going forward anyway.
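As a rough illustration of the "code it in minutes" point, here is a minimal sketch of hand-written mapping on top of Sep. The `Person` record, the `PersonCsv` helper, and the `Name`/`Age` column names are hypothetical and only for illustration; the sketch assumes the `Sep` NuGet package (namespace `nietras.SeparatedValues`) and .NET 7 or later for `Parse<T>`.

```csharp
using System.Collections.Generic;
using nietras.SeparatedValues;

// Hypothetical target type, not part of Sep.
public sealed record Person(string Name, int Age);

public static class PersonCsv
{
    // Hand-written "mapping": read each row and pick columns by name.
    public static List<Person> Parse(string csvText)
    {
        var people = new List<Person>();
        // Sep.Reader() infers the separator from the header row.
        using var reader = Sep.Reader().FromText(csvText);
        foreach (var row in reader)
        {
            // ToString() allocates the name; Parse<int>() parses the column span.
            people.Add(new Person(row["Name"].ToString(), row["Age"].Parse<int>()));
        }
        return people;
    }
}
```

Calling `PersonCsv.Parse("Name;Age\nAda;36\nLinus;54\n")` would yield two `Person` values. The trade-off is that every target type needs its own few lines like these, which is exactly what a generic mapping layer would try to remove.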
https://blog.ploeh.dk/2023/12/04/serialization-with-and-without-reflection/
I don't think the ask is to "map to type", per se. I also don't necessarily see object mapping as serialization. Often, they feed into one another, the same way entity configuration in an ORM feeds into the object materialization layer and persistence ordering when uniquing the object graph. A generic mapping layer is the sine qua non of maintainable systems, imho. I don't think an LLM spitting out Designer files as glue is going to eliminate that. It would still need to learn how to code to the generic mapping layer, and that layer still needs to be created.
I can definitely see a use for Sep without CsvHelper, too, like "wow" demos for loading huge amounts of data. In the past I have used kdb+ (world's fastest time series database, by a large margin, used in bulge bracket finance institutions) to load CSV data and analyze it extraordinarily fast.
A fun way to describe persistence is not as orthogonal, but rather hyperbolic, borrowing from geometrist David Hilbert and specifically his infinite-dimensional Hilbert spaces over non-Euclidean geometric systems. Orthogonal persistence captures persistence as inherent to the execution environment. - You can't assign infinite details to finite points. So, Mark's blog post is worthless to me because it's looking at persistence purely from a Euclidean viewpoint (just my perspective).
I've updated the blog. Sep multithreaded on server GC is indeed crazy fast. Nice work @nietras!
Awesome! And thank you. 👍
allowing the frontrunner Sep to not allocate extra for unescaping and get even greater performance
The blog mentions the above, which can be misunderstood. Sep doesn't allocate when unescaping either, and is still blistering fast when unescaping; the benchmarks in Sep show this. Also, RecordParser doesn't do any kind of auto unescaping in the benchmark either. So it would be good if some of these passages were either removed or revised... 😊
Oops! Feel free to open a PR here with wording you think is best and I'll try to work it in! https://github.com/joelverhagen/joelverhagen.com/blob/master/_posts/2020-12-08-fastest-net-csv-parsers.md Sorry for the misunderstanding.
Thanks for this. I'm not really up to date on the latest .NET stuff as I do only React front-end work at my day job.
@joelverhagen using `iterationTime` to ensure fast impls get more "iterations" while keeping iterations low for slow ones. I hope this means this is as fast as before and still reliable. Running this now locally, so will see how it looks soon.
Results on my machine are not exactly reproducible/reliable for this long-running benchmarking.
If running Sep only I get:
If running all:
Note how `Sep` (single-threaded) is noticeably slower when running all, maybe due to thermal throttling. Not sure. I do not think this is due to BDN params.
Sep is `21289/382 = 55x` faster than the slowest one.
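For readers unfamiliar with the BDN (BenchmarkDotNet) settings discussed above, here is a minimal sketch of a benchmark class that fixes the iteration time and enables server GC. The `ParseBench` class, its placeholder workload, and the 350 ms value are assumptions for illustration only and are not taken from the actual benchmark project.

```csharp
using System;
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[GcServer(true)]      // run the benchmark process with server GC
[IterationTime(350)]  // target time per iteration in milliseconds (assumed value)
public class ParseBench
{
    private string _csv = "";

    [GlobalSetup]
    public void Setup()
    {
        // Placeholder input: 1000 rows of three fields each.
        var sb = new StringBuilder();
        for (var i = 0; i < 1000; i++) { sb.Append("a;b;c\n"); }
        _csv = sb.ToString();
    }

    [Benchmark]
    public int CountFields()
    {
        // Placeholder workload standing in for a real CSV parser.
        var count = 0;
        foreach (var line in _csv.Split('\n', StringSplitOptions.RemoveEmptyEntries))
        {
            count += line.Split(';').Length;
        }
        return count;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<ParseBench>();
}
```

With a fixed iteration time, BenchmarkDotNet chooses how many invocations fit into each iteration, so a fast method is invoked many more times per iteration than a slow one while the wall-clock time per iteration stays roughly the same.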