dzmitry-lahoda opened this issue 5 years ago
Interesting idea, though @vpenades recently gave me another perspective on this, and I've since swapped this for a List.
I think he is referring to pooling the internal array itself.
Regarding MemoryStream, I did some additional research and found that, after all, MemoryStream is not that slow, at least on .NET Core.
It seems MemoryStream has been completely refurbished in .NET Core; you can compare how different the implementations are between .NET Framework and .NET Core.
In essence, both List and MemoryStream work in the same way; internally they grow a Byte[] array as the user writes to it. The additional overhead of MemoryStream comes from overriding System.IO.Stream.
But by using a Byte[] array it means that the array is reallocated and copied when it grows, so the "old" array needs to be garbage collected, which adds some overhead too.
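To make the grow-and-copy cost concrete, here's a minimal sketch (not the actual BCL source) of the strategy both List and MemoryStream use internally:

```csharp
using System;

// Minimal sketch of how a growable byte buffer works: when the backing
// Byte[] is full, a larger array is allocated, the old contents are copied
// across, and the old array becomes garbage for the GC to collect.
class GrowableBuffer
{
    private byte[] _buffer = new byte[4];
    private int _length;

    public void WriteByte(byte value)
    {
        if (_length == _buffer.Length)
        {
            // Double the capacity: allocate, copy, abandon the old array.
            var bigger = new byte[_buffer.Length * 2];
            Array.Copy(_buffer, bigger, _length);
            _buffer = bigger; // the old array is now dead weight for the GC
        }
        _buffer[_length++] = value;
    }

    public int Length => _length;
    public int Capacity => _buffer.Length;
}
```

Writing five bytes to this buffer triggers one reallocation (capacity 4 to 8), leaving one dead array behind; every further doubling leaves another.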
Over time, a number of exotic solutions to this issue have appeared, for example:
There's RecyclableMemoryStream from Microsoft, which is what I think @dzmitry-lahoda is referring to.
Then, there's also System.Buffers.ArrayPool
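A rough sketch of how ArrayPool sidesteps the allocate-and-discard churn: rent a buffer from the shared pool, slice off the portion you need with an ArraySegment, and return the array when done so the next caller reuses it (the method name here is illustrative, not from the library):

```csharp
using System;
using System.Buffers;

static class PooledExample
{
    // Rent, use, return: the same arrays get recycled by the pool instead
    // of being allocated fresh and left for the GC each time.
    public static int SumWithRentedBuffer(int count)
    {
        // The pool may hand back a larger array than requested.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(count);
        try
        {
            // Use only the slice we actually asked for.
            var segment = new ArraySegment<byte>(buffer, 0, count);
            for (int i = 0; i < segment.Count; i++)
                segment.Array[segment.Offset + i] = (byte)i;

            int sum = 0;
            foreach (byte b in segment) sum += b;
            return sum;
        }
        finally
        {
            // Return so the next Rent reuses this array.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```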
Yep, to state it yet another way: people would consider using lightweight-serialization if it declared (and proved with BenchmarkDotNet) low GC pressure. The way to achieve that is pooling, even for a list, like https://github.com/jtmueller/Collections.Pooled
I really appreciate your thoughts, guys. Realizing that array pooling isn't necessarily more efficient for small arrays has given me pause. I've taken a few minor steps inspired by this discussion that have improved performance:
1) For VLQ encoding, using a Byte array as a buffer wrapped in an ArraySegment. The buffer isn't trimmed until the last moment, resulting in a single copy. https://github.com/invertedtomato/lightweight-serialization/blob/master/Library/LightWeightSerialization/UnsignedVlq.cs#L32
2) Converted Nodes to structs. https://github.com/invertedtomato/lightweight-serialization/blob/master/Library/LightWeightSerialization/Node.cs Interestingly, this resulted in a ballpark 14% speed improvement in my basic testing scenario. I take it this is saving the GC quite a bit of work.
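The first step can be sketched like this, using the classic 7-bits-per-byte VLQ scheme (the exact bit layout in UnsignedVlq.cs may differ): encode into a fixed 10-byte buffer, then return an ArraySegment over the used prefix, so nothing is trimmed or copied until the caller needs it.

```csharp
using System;

static class Vlq
{
    // Encode an unsigned value as a VLQ: 7 payload bits per byte, with the
    // high bit set on every byte except the last. A UInt64 needs at most
    // 10 bytes, so we encode into a fixed buffer and hand back an
    // ArraySegment over the used prefix -- no trimming copy required.
    public static ArraySegment<byte> Encode(ulong value)
    {
        var buffer = new byte[10];
        var count = 0;
        do
        {
            var b = (byte)(value & 0x7F); // low 7 bits
            value >>= 7;
            if (value > 0) b |= 0x80;     // continuation bit: more bytes follow
            buffer[count++] = b;
        } while (value > 0);
        return new ArraySegment<byte>(buffer, 0, count);
    }
}
```

Small values like 3 come back as a one-byte segment; only a consumer that genuinely needs a right-sized array ever pays for a copy.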
I'll ponder this further and share more thoughts shortly.
@invertedtomato You can probably replace
ArraySegment<Byte>[] EncodeCache = new ArraySegment<Byte>[255];
with
ArrayPool<Byte> EncodeCache = ArrayPool<Byte>.Create();
then you can use ArrayPool.Rent and ArrayPool.Return, which happen to work with ArraySegment
In that particular case buffers are used in multiple locations concurrently. For example if the value 3 is VLQ encoded, the buffer containing the encoded equivalent is used anywhere the value 3 is used for the whole of the runtime session. So while Renting is possible, Return doesn't make sense. Also the buffers are small (10 bytes) and will have negligible performance advantage.
I am pondering swapping the Streams throughout for ArraySegments because the lengths are now largely known in advance, and it would save a stack of double copying.
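A sketch of why that swap saves copies, under the assumption that the output length is known up front (type and method names here are illustrative, not from the library):

```csharp
using System;
using System.IO;

static class KnownLength
{
    // Stream route: the internal buffer may grow (one copy), and ToArray()
    // always copies again to produce a right-sized array (a second copy).
    public static byte[] ViaStream(byte[] payload)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length);
            return ms.ToArray(); // unavoidable trailing copy
        }
    }

    // Segment route: the length is known in advance, so allocate exactly
    // once and pass an ArraySegment around -- no growth, no trailing copy.
    public static ArraySegment<byte> ViaSegment(byte[] payload)
    {
        var buffer = new byte[payload.Length];
        Buffer.BlockCopy(payload, 0, buffer, 0, payload.Length);
        return new ArraySegment<byte>(buffer, 0, payload.Length);
    }
}
```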
https://github.com/invertedtomato/lightweight-serialization/blob/33a995aa8ef8cb5dd747839c3fae1262cfb602ac/Library/LightWeightSerialization/UnsignedVlq.cs#L30