microsoft / kiota-abstractions-dotnet

Abstractions library for the Kiota generated SDKs in dotnet
https://aka.ms/kiota/docs
MIT License
24 stars 21 forks source link

Problem with serializers and their MemoryStreams #228

Closed MihaMarkic closed 1 month ago

MihaMarkic commented 2 months ago

Serializers (well, json one) use internally MemoryStream and I don't think MemoryStream ever shrink their buffers. They just grow them by factor of 2 when required. And serializers are being reused (along with their buffers). Hence, there is a possibility of using a large chunk of memory and never releasing it - if at some point in time on serializes a huge object. One possible solution would be to create a new MemoryStream instance each time, but then there could be a lot of GC going on with all those buffers being released. A better option would be to use RecyclableMemoryStream

baywet commented 2 months ago

Hi @MihaMarkic Thank you for bringing this up. Design wise, we made a mistake early on we're planning to correct for a future major version. The ISerializationWriter interface should not have a GetSerializedContent method. And RequestInformation's body shuold instead be a reference to a callback/function that accepts a writeable stream. This way the request adapter could call the callback and have the stream being written directly to the network (through the buffer offered by the APIs), reducing copies and memory pressure.

In the meantime, if you observe high memory pressure effects, replacing MemoryStream by recyclable ones should reduce those effects. I believe the place that'd benefit the most from such a change is probably the Json serialization writer

Besides memory pressure, you should not be seeing unclosed Memory streams unless the client application is doing something wrong or unless we missed out something. If you see those, please trace them and point them to us.

MihaMarkic commented 2 months ago

@baywet That's even better, indeed. And no, I'm just thinking out loud, don't have such issues ATM.

baywet commented 2 months ago

Would you still be willing to submit a PR in the json serialization repo, and back it up with benchmarks (just for serialization)? I suspect the reduced memory assignment/garbage collection should yield performance benefits.

MihaMarkic commented 2 months ago

@baywet You mean a variant with RecyclableMemoryStream?

baywet commented 2 months ago

I meant for the json serialization writer:

  1. make the changes in a local copy
  2. setup a benchmark project, referencing both the local copy and the published version (you might have to temporary change the namespace of the local copy to avoid conflicts)
  3. run the benchmark, see if we're getting better perf for the changes.

Does that make sense?

microsoft-github-policy-service[bot] commented 1 month ago

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.