Open alexandrnikitin opened 7 years ago
We've sent and received quite a lot different size and rates of messages without noticing any measurable consequences. That being said, I can see the potential for benefit, and the RecyclableMemoryStream looks to be a drop-in replacement.
I don't have time to make this change myself right now, but if you want to submit a PR, I'll check it out and then integrate it.
-Thunder
It is already optimized. MemoryStream does not do allocation, instead it just wraps byte array network buffer. https://github.com/ntent-ad/kafka4net/blob/48f702ef4e9162060a2f415d4008ad26b08b5288/src/Protocols/Serializer.cs#L361
In its turn, byte array is preallocated per tcp connection: https://github.com/ntent-ad/kafka4net/blob/master/src/Protocols/ResponseCorrelation.cs#L64
and can be reallocated if request is larger than current size: https://github.com/ntent-ad/kafka4net/blob/master/src/Protocols/ResponseCorrelation.cs#L86
All of this is possible because kafka protocol was designed with allocation overhead in mind. So all structures are prefixed with size fields, which allows to do only once allocation and avoid reallocations.
Thank you for replies. @vchekan You pointed to deserialization logic. It's OK there. I meant serialization part, see link in the first post. Here's a couple of allocation profile results from a production application:
Ah, serialization. You are right. It does allocate for every message. Which makes me thinking, why it is not the same way compression buffers are allocated (thread-static).
Another thing to consider is caller's payload serialization. As driver accepts byte arrays only, caller will do their own memory allocation and then kafka4net driver will do copy from caller's payload buffer into its internal memory stream. This might be optimized too, but will require some callbacks and API changes/extensions. But as first step, I agree, pooled memory stream should be an improvement.
MemoryStream allocates new memory each time, in case of large messages it goes to LOH. An example code: https://github.com/ntent-ad/kafka4net/blob/a2c96aa987ddf3aa73446b02bfc0c1635a1a09b2/src/Protocols/Serializer.cs#L152
Proposed solution is
Microsoft.IO.RecyclableMemoryStream
https://github.com/Microsoft/Microsoft.IO.RecyclableMemoryStreamWDYT?