OData / odata.net

ODataLib: Open Data Protocol - .NET Libraries and Frameworks
https://docs.microsoft.com/odata
Other
675 stars 348 forks source link

Commit Utf8JsonWriter contents to buffer before writing directly to bufferwriter in StartStreamValueScopeAsync #2981

Closed ElizabethOkerio closed 1 month ago

ElizabethOkerio commented 1 month ago

Issues

This pull request fixes #xxx.

Description

Briefly describe the changes of this pull request.

Checklist (Uncheck if it is not completed)

Additional work necessary

If documentation update is needed, please add "Docs Needed" label to the issue and provide details about the required document change in the issue.

habbes commented 1 month ago

@ElizabethOkerio found other gaps with ODataUtf8JsonWriter that we should close. You could make the judgement where there should be done in the same PR or separate. But it would be good if the patch is shipped with all of them addressed.

We don't have an implementation of TextWriter.Write(char). The default implementation is empty, so nothing is written when this method is called.

Here are some test cases that you may consider to get started:

// in JsonWriterBaseTests
[Theory]
[InlineData("application/json", 'a', "a")]
[InlineData("text/html", 'a', "\"a\"")]
[InlineData("text/plain", 'a', "\"a\"")]
// JSON special char
[InlineData("application/json", '"', "\"")]
[InlineData("text/html", '"', "\"\\\"\"")]
[InlineData("text/plain", '"', "\"\\\"\"")]
// non-ascii
[InlineData("application/json", '你', "你")]
[InlineData("text/html", '你', "\"你\"")]
[InlineData("text/plain", '你', "\"你\"")]
public void TextWriter_CorrectlyWritesSingleCharacter(string contentType, char value, string expectedOutput)
{
    using MemoryStream stream = new MemoryStream();
    IJsonWriter jsonWriter = CreateJsonWriter(stream, isIeee754Compatible: false, Encoding.UTF8);
    var tw = jsonWriter.StartTextWriterValueScope(contentType);
    tw.Write(value);
    jsonWriter.EndTextWriterValueScope();
    jsonWriter.Flush();

    stream.Seek(0, SeekOrigin.Begin);

    using StreamReader reader = new StreamReader(stream, encoding: Encoding.UTF8);
    string rawOutput = reader.ReadToEnd();
    Assert.Equal(expectedOutput, rawOutput);
}

For the following tests on handling of surrogate pairs, I found that JsonWriter and Utf8JsonWriter handles string escaping differently, but both outputs will be parsed to the same thing.

// ODataUtf8JsonWriterTests
[Theory]
// both the escaped and non-escaped versions are valid
// and compliant JSON parsers should be able to handle both
[InlineData("application/json", "🐂")]
[InlineData("text/html", "\"\\uD83D\\uDC02\"")]
[InlineData("text/plain", "\"\\uD83D\\uDC02\"")]
public void TextWriter_CorrectlyHandlesSurrogatePairs(string contentType, string expectedOutput)
{
    using MemoryStream stream = new MemoryStream();
    IJsonWriter jsonWriter = CreateJsonWriter(stream, isIeee754Compatible: false, Encoding.UTF8);
    var tw = jsonWriter.StartTextWriterValueScope(contentType);
    tw.Write('\ud83d');
    tw.Write('\udc02');
    jsonWriter.EndTextWriterValueScope();
    jsonWriter.Flush();

    stream.Seek(0, SeekOrigin.Begin);

    using StreamReader reader = new StreamReader(stream, encoding: Encoding.UTF8);
    string rawOutput = reader.ReadToEnd();
    Assert.Equal(expectedOutput, rawOutput);
}
// JsonWriterTests

        [Theory]
        [InlineData("application/json", "🐂")]
        [InlineData("text/html", "\"🐂\"")]
        [InlineData("text/plain", "\"🐂\"")]
        public void TextWriter_CorrectlyHandlesSurrogatePairs(string contentType, string expectedOutput)
        {
            using MemoryStream stream = new MemoryStream();
            IJsonWriter jsonWriter = CreateJsonWriter(stream, isIeee754Compatible: false, Encoding.UTF8);
            var tw = jsonWriter.StartTextWriterValueScope(contentType);
            tw.Write('\ud83d');
            tw.Write('\udc02');
            jsonWriter.EndTextWriterValueScope();
            jsonWriter.Flush();

            stream.Seek(0, SeekOrigin.Begin);

            using StreamReader reader = new StreamReader(stream, encoding: Encoding.UTF8);
            string rawOutput = reader.ReadToEnd();
            Assert.Equal(expectedOutput, rawOutput);
        }

Of course we should also have async versions of the tests.

To make implementation easier, we put the char in buffer and reuse the existing Write methods:

public override void Write(char value)
{
    if (!this.jsonWriter.isWritingJson)
    {
        ReadOnlySpan<char> charToWrite = stackalloc char[1] { value };
        this.WriteCharsInChunks(charToWrite);
    }
    else
    {
        ReadOnlySpan<char> charToWrite = stackalloc char[1] { value };
        this.WriteCharsInChunksWithoutEscaping(charToWrite);
    }
}

In the async version we could used a pooled buffer that's cached and released when the writer is disposed.

habbes commented 1 month ago

@ElizabethOkerio another gap that we should close: support for float.PositiveInfinity, float.NegativeInfinity and float.NaN. We also ready support this for double, so we just need to do the same thing for float. Both sync and async.

See current handling for doubles: https://github.com/OData/odata.net/blob/main/src/Microsoft.OData.Core/Json/ODataUtf8JsonWriter.cs#L281

And sample test cases: https://github.com/OData/odata.net/blob/main/test/FunctionalTests/Microsoft.OData.Core.Tests/Json/ODataUtf8JsonWriterTests.cs#L116-L131