Closed adamsitnik closed 5 years ago
I have been on holiday so haven't been able to follow this PR until now. Will have a look at all the stuff that was talked about tomorrow, I think
@irmen I have pushed all the code fixes, the code is ready for review.
I hope to be able to review all this in the following few days, but it's clear to me that you guys are a lot more knowledgeable about these low level optimizations than I am. So I don't think I'll be able to comment on the function, but rather only on the form (if any)
thanks a lot for these PRs and it's exiting that it makes Pyrolite a key library in .NET Spark
@irmen thank you for the awesome library!
BTW please don't release a new NuGet package yet, tomorrow I plan to finish a PR that improves the Pickler performance.
Preview:
Method | Count | Mean WAS | Mean IS | WAS Alloc | IS Allocated |
---|---|---|---|---|---|
DoublesToByteArray | 100 | 3.138 us | 0.911 us | 3184 B | 304 B |
StringsToByteArray | 100 | 22.235 us | 16.828 us | 16424 B | 10280 B |
DoublesToByteArray | 1000 | 31.298 us | 7.269 us | 42105 B | 304 B |
StringsToByteArray | 1000 | 234.243 us | 169.381 us | 180100 B | 102306 B |
@adamsitnik Those are incredible results! I'm very curious what wizardry you're performing here
I'm just in the process right now of spicing up the readme.
When the Unpickler reads from a
Stream
it has non-trivial overhead compared to when reading from a memory buffer. This PR improves unpickling performance for that case.The difference
stream.ReadByte
with an additional check (if the byte is not-1
) is justarray[position++]
.stream.Read(buffer, 0, size)
is justarray.Slice(0, size)
. This helps to avoid callingArray.Copy
and copying the data more than needed.Perf trick: the types that implement
IInputReader
are structs to enable for full interface method devirtualization and inlining. You can read about it here. I have added the necessary comment in the code that explains that.The gains are quite big. From 20% to 2 times faster when unpickling from array. Unpickling from Stream has not been regressed. I added micro-benchmarks to measure and ensure this.
/cc @stephentoub, @imback82, @eerhardt, @rapoth