OData / odata.net

ODataLib: Open Data Protocol - .NET Libraries and Frameworks
https://docs.microsoft.com/odata
Other
687 stars 349 forks source link

`ODataUtf8JsonWriter` should not flush the internal stream when buffer is full. #3099

Closed habbes closed 2 weeks ago

habbes commented 3 weeks ago

Short summary (3-5 sentences) describing the issue.

ODataUtf8JsonWriter has a default internal buffer of 16k. When it gets full, it writes the data to the underlying output stream and then flushes the stream. It's not always desirable for ODataUtf8JsonWriter to decide when to flush the stream. Depending on the underlying stream, flushing might be expensive (e.g. expensive network call) and those could lead to considerably high latencies. Instead, ODataUtf8JsonWriter should only write the drained buffer contents to the underlying stream but not flush it. This lets the underlying stream decide if/when it should flush itself. If the user wants to force flushing, they should call FlushAsync explicitly.

This issue affected a customer who experience high latencies for payloads with large string values due to frequent flushing of the network stream. They created a stream wrapper on top of their stream that intercepts and buffers writes using a much larger buffer (1MB buffer size), but since ODataUtf8JsonWriter calls flush anyway, this custom buffering is bypassed and large latencies persist. I created a branch and nightly build that the customer used to verify that removing FlushAsync from FlushIfBufferThresholdReached helps fix the issue: https://github.com/OData/odata.net/blob/write-stream-instead-of-flush/src/Microsoft.OData.Core/Json/ODataUtf8JsonWriter.cs#L1290

Assemblies affected

Microsoft.OData.Core 8.x and 7.x

Reproduce steps

Write a payload with a string property containing a large value (e.g. 4MB worth of characters). Use a Stream with a relatively "slow" FlushAsync implementation (e.g. an expensive network call or simulate a custom stream with a delay).

Expected result

When the buffer is full, ODataUtf8JsonWriter should only write to the stream, but not flush it. It should only flush when Flush() or FlushAsync() is called explicitly. This allows the stream to decide when its appropriate to flush, and the customer can override and contrl the behaviour with a custom Stream implementation.

Actual result

The string is written out in chunks and the stream is flushed each time the buffer gets full and drained. This leads to a relatively high latency because the calls to FlushAsync.