Sets ODataUtf8JsonWriter as the default IJsonWriter used in ODataMessageWriter.
Sets JavaScriptEncoder.UnsafeRelaxedJsonEscaping as the default JavaScriptEncoder for ODataUtf8JsonWriter.
Fixes some bugs with ODataUtf8JsonWriter streaming (see details at the end).
Normalizes JSON output for some edge cases for better consistency with JsonWriter (see details at the end).
The first two changes aim to improve default serialization performance. ODataUtf8JsonWriter has demonstrated better performance and memory efficiency than the current default JsonWriter, both in benchmarks and in production workloads. One of the major blockers to it becoming the default was lack of support for streaming, but that has been addressed by #2880.
The main concern about this change is that the JSON output differs between the two writers, specifically in how characters are escaped.
In any case, while the output is different, both are valid JSON and semantically equivalent. Compliant JSON parsers should interpret them the same way.
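As a quick illustration of the equivalence claim, the sketch below decodes the two escape styles from the examples in this PR using System.Text.Json and compares the resulting strings. It is only a sanity check under the assumption that clients use a compliant parser; the `Decode` helper is a name made up for this example.

```csharp
using System;
using System.Text.Json;

// The same string value as each writer emits it (literals taken from this PR).
string fromJsonWriter = @"""Cust1 \ud800\udc05 \u00e4""";
string fromUtf8JsonWriter = @"""Cust1 \uD800\uDC05 \u00E4""";

// Hypothetical helper: decode a JSON string literal back to its .NET string value.
static string Decode(string jsonLiteral)
{
    using JsonDocument doc = JsonDocument.Parse(jsonLiteral);
    return doc.RootElement.GetString()!;
}

// Hex digits in \u escapes are case-insensitive, so both literals decode identically.
bool sameValue = Decode(fromJsonWriter) == Decode(fromUtf8JsonWriter);
Console.WriteLine(sameValue);
```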
Differences between ODataUtf8JsonWriter and JsonWriter

| Category | Description | Examples |
| --- | --- | --- |
| String escaping | Utf8JsonWriter escapes more characters in a string than the default JsonWriter. By default, Utf8JsonWriter even escapes HTML-sensitive characters such as <. You can override that by using UnsafeRelaxedJsonEscaping, as documented here: https://learn.microsoft.com/en-us/odata/odatalib/using-ut8jsonwriter-for-better-performance#choosing-a-javascriptencoder. Even with this encoder there will be some differences in character escaping, but all outputs are valid UTF-8 encoded strings. | Utf8JsonWriter uses uppercase letters for Unicode code points, JsonWriter uses lowercase letters. JsonWriter: `"Cust1 \ud800\udc05 \u00e4"` vs Utf8JsonWriter: `"Cust1 \uD800\uDC05 \u00E4"`. Encoder differences: JsonWriter: `"CityA1 'A1' + 3 <>"` vs Utf8JsonWriter with relaxed encoder: `"CityA1 'A1' + 3 <>"` vs Utf8JsonWriter with default encoder: `"CityA1 \u0027A1\u0027 \u002B 3 \u003C\u003E"` |
| Number formatting | Utf8JsonWriter serializes the decimal 1.0M as 1.0, but JsonWriter serializes it as 1 (the opposite of the double case described under "Main changes", where JsonWriter is the one that keeps the trailing .0). | JsonWriter: `1` vs Utf8JsonWriter: `1.0` |
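The encoder difference above can be reproduced with a plain Utf8JsonWriter. This is a minimal sketch that bypasses ODataUtf8JsonWriter entirely and only shows what the two built-in encoders do to the sample string; `WriteString` is a helper invented for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Encodings.Web;
using System.Text.Json;

// Hypothetical helper: serialize one string value with the given encoder.
static string WriteString(string value, JavaScriptEncoder encoder)
{
    using var stream = new MemoryStream();
    using (var writer = new Utf8JsonWriter(stream, new JsonWriterOptions { Encoder = encoder }))
    {
        writer.WriteStringValue(value);
    } // disposing the writer flushes its buffer to the stream

    return Encoding.UTF8.GetString(stream.ToArray());
}

string input = "CityA1 'A1' + 3 <>";
string strict = WriteString(input, JavaScriptEncoder.Default);
string relaxed = WriteString(input, JavaScriptEncoder.UnsafeRelaxedJsonEscaping);

Console.WriteLine(strict);  // escapes ', +, < and > as \uXXXX sequences
Console.WriteLine(relaxed); // leaves those characters as-is
```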
We also used to have a difference in DateTimeOffset formatting: Utf8JsonWriter uses the +00:00 timezone suffix when the timezone offset is 0 (e.g. 2022-11-09T09:42:30+00:00), whereas JsonWriter uses Z (e.g. 2022-11-09T09:42:30Z). We addressed this in a past PR because feedback indicated that this difference was more likely to break clients that do not strictly follow the standard, and customers have raised this type of difference in the past. Now both JsonWriter and ODataUtf8JsonWriter use the Z suffix.
Justification for changing the default encoder to JavaScriptEncoder.UnsafeRelaxedJsonEscaping:
None of the built-in encoders exactly matches any of the ODataStringEscapeOption values used by JsonWriter, so there is no consistency-based reason to prefer JavaScriptEncoder.Default. It escapes more characters than either of the string escape options available for JsonWriter.
Performance: serializing strings that need to be escaped uses more CPU and allocates more memory than strings that don't need escaping. Therefore, an encoder that is less likely to escape strings is more efficient.
This encoder is used by default in ASP.NET Core, so the change makes our output more consistent with JSON-based APIs written in modern ASP.NET Core.
The security concerns that motivate JavaScriptEncoder.Default (e.g. escaping HTML special characters) do not apply if we don't dump the output directly into an HTML document. Since our use case is REST APIs, it's up to the client to escape the response if they intend to render its contents in a web page.
Main changes:
Registered ODataUtf8JsonWriterFactory as the default implementation of IJsonWriterFactory in AddDefaultODataServices extension method.
I've renamed DefaultJsonWriterFactory to ODataJsonWriterFactory since it's no longer the default.
Changed the default JavaScriptEncoder of ODataUtf8JsonWriter to JavaScriptEncoder.UnsafeRelaxedJsonEscaping.
Updated strings in tests that were escaped using JavaScriptEncoder.Default to match the output returned by the new encoder.
Replaced DefaultJsonWriterFactory with ODataUtf8JsonWriterFactory in a few tests.
Fixed a bug in ODataUtf8JsonWriter.CreateStreamValueScopeAsync. Writing to the stream asynchronously would overwrite the content previously written by the JSON writer and result in jumbled, corrupted output. I fixed this by making sure the contents of the Utf8JsonWriter are committed to the buffer before starting the stream scope. This was correctly implemented and tested in the sync version, but not in the async version.
Fixed a bug in ODataUtf8JsonWriter.TextWriter. The TextWriter did not override Write(char), so calling Write(char) would not actually write the char. I've overridden both TextWriter.Write(char) and TextWriter.WriteAsync(char).
I've added support for float.PositiveInfinity (-> "INF"), float.NegativeInfinity (-> "-INF") and float.NaN (-> "NaN") to ODataUtf8JsonWriter. Support was already there for double, but not float (an exception would be thrown when Utf8JsonWriter tries to write float.NaN, for example).
I've changed the implementation of ODataUtf8JsonWriter.WriteValue(double) so that it appends .0 when the value does not have a decimal point (e.g. 234.0 gets written as 234.0 instead of 234). This is not required by the spec, but it's consistent with JsonWriter's current behaviour. I spoke to @mikepizzo and he suggested it could be a left-over from OData v3. We can drop the .0 later, potentially behind a feature flag. I thought it wise to keep the behaviour consistent for now to avoid disruptions, since the difference broke a lot of our internal tests that check for the presence of .0 in the output.
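To make the CreateStreamValueScopeAsync fix more concrete, here is a simplified sketch of why the writer's pending content has to be committed before anything else writes to the same destination. It uses a plain Utf8JsonWriter over a MemoryStream rather than the actual ODataUtf8JsonWriter internals, so the failure mode shown here (reordering) is only an analogy for the overwrite described above; `WriteArrayWithRawItem` is a made-up name.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical helper: write "[", then a raw item directly to the stream, then "]".
static string WriteArrayWithRawItem(bool flushBeforeRawWrite)
{
    using var stream = new MemoryStream();
    using var writer = new Utf8JsonWriter(stream);

    writer.WriteStartArray(); // "[" is only buffered inside the writer at this point
    if (flushBeforeRawWrite)
    {
        writer.Flush();       // commit "[" to the stream before bypassing the writer
    }

    // Write directly to the stream, the way a stream value scope would.
    byte[] raw = Encoding.UTF8.GetBytes("1");
    stream.Write(raw, 0, raw.Length);

    writer.WriteEndArray();
    writer.Flush();

    return Encoding.UTF8.GetString(stream.ToArray());
}

Console.WriteLine(WriteArrayWithRawItem(flushBeforeRawWrite: true));  // well-formed
Console.WriteLine(WriteArrayWithRawItem(flushBeforeRawWrite: false)); // raw item lands before "["
```

Without the flush, the raw bytes reach the stream before the buffered "[", which is the same class of bug: content written outside the writer and content still pending inside it end up in the wrong places.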
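The two number-handling changes above can be sketched roughly as follows. This is not the code in ODataUtf8JsonWriter; `WriteSingle` and `FormatDouble` are hypothetical helpers that only illustrate the mapping of non-finite floats to strings and the trailing .0 rule, assuming invariant-culture round-trip formatting.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical helper: Utf8JsonWriter rejects non-finite numbers, so map them to strings.
static void WriteSingle(Utf8JsonWriter writer, float value)
{
    if (float.IsNaN(value)) writer.WriteStringValue("NaN");
    else if (float.IsPositiveInfinity(value)) writer.WriteStringValue("INF");
    else if (float.IsNegativeInfinity(value)) writer.WriteStringValue("-INF");
    else writer.WriteNumberValue(value);
}

// Hypothetical helper: format a finite double, appending ".0" to whole numbers.
static string FormatDouble(double value)
{
    if (!double.IsFinite(value)) throw new ArgumentException("Finite values only.", nameof(value));
    string text = value.ToString("R", CultureInfo.InvariantCulture);
    bool hasFractionOrExponent = text.IndexOfAny(new[] { '.', 'E', 'e' }) >= 0;
    return hasFractionOrExponent ? text : text + ".0";
}

using var stream = new MemoryStream();
using (var writer = new Utf8JsonWriter(stream))
{
    writer.WriteStartArray();
    foreach (float f in new[] { 1.5f, float.PositiveInfinity, float.NegativeInfinity, float.NaN })
    {
        WriteSingle(writer, f);
    }
    writer.WriteEndArray();
}

string json = Encoding.UTF8.GetString(stream.ToArray());
Console.WriteLine(json);
Console.WriteLine(FormatDouble(234.0));
```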
Checklist (Uncheck if it is not completed)
[x] Test cases added
[x] Build and test with one-click build and test script passed
Additional work necessary
If documentation update is needed, please add "Docs Needed" label to the issue and provide details about the required document change in the issue.
Issues
This pull request fixes #2822