dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.25k stars 4.73k forks source link

JsonConverter.Read call inside another converter throws InvalidOperationException: Cannot skip tokens on partial JSON. #74108

Closed dlyz closed 1 year ago

dlyz commented 2 years ago

Description

Key points:

Reproduction Steps

Minimal repro test ```cs using System; using System.IO; using System.Text; using System.Text.Json; using System.Text.Json.Serialization; using System.Threading.Tasks; using Xunit; public class NestedConverterTest { [JsonConverter(typeof(Converter))] private class MyClass { public InnerClass? Inner { get; set; } } private class InnerClass { public string? Value { get; set; } } private class Converter : JsonConverter { private JsonConverter GetConverter(JsonSerializerOptions options) { return (JsonConverter)options.GetConverter(typeof(InnerClass)); } public override MyClass? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) { var inner = GetConverter(options).Read(ref reader, typeof(InnerClass), options); return new MyClass { Inner = inner }; } public override void Write(Utf8JsonWriter writer, MyClass value, JsonSerializerOptions options) { GetConverter(options).Write(writer, value.Inner!, options); } } [Fact] public async Task ReadBigStreamWithExcessProps() { const int count = 1000; var stream = new MemoryStream(); stream.Write(Encoding.UTF8.GetBytes("[")); for (int i = 0; i < count; i++) { if (i != 0) { stream.Write(Encoding.UTF8.GetBytes(",")); } stream.Write(Encoding.UTF8.GetBytes(@"{""Missing"":""missing-value"",""Value"":""value""}")); } stream.Write(Encoding.UTF8.GetBytes("]")); stream.Position = 0; var result = await JsonSerializer.DeserializeAsync(stream); Assert.Equal(count, result!.Length); for (int i = 0; i < count; i++) { Assert.Equal("value", result[i].Inner?.Value); } } } ```

Expected behavior

Should successfully deserialize as it does with smaller input JSON or when there are no excessive properties in JSON.

Actual behavior

For .NET 6:

  Message: 
System.Text.Json.JsonException : The JSON value could not be converted to NestedConverterTest+MyClass. Path: $[0] | LineNumber: 0 | BytePositionInLine: 12.
---- System.InvalidOperationException : Cannot skip tokens on partial JSON. Either get the whole payload and create a Utf8JsonReader instance where isFinalBlock is true or call TrySkip.

  Stack Trace: 
ThrowHelper.ReThrowWithPath(ReadStack& state, Utf8JsonReader& reader, Exception ex)
JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
JsonSerializer.ReadCore[TValue](JsonConverter jsonConverter, Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
JsonSerializer.ReadCore[TValue](JsonReaderState& readerState, Boolean isFinalBlock, ReadOnlySpan`1 buffer, JsonSerializerOptions options, ReadStack& state, JsonConverter converterBase)
JsonSerializer.ContinueDeserialize[TValue](ReadBufferState& bufferState, JsonReaderState& jsonReaderState, ReadStack& readStack, JsonConverter converter, JsonSerializerOptions options)
JsonSerializer.ReadAllAsync[TValue](Stream utf8Json, JsonTypeInfo jsonTypeInfo, CancellationToken cancellationToken)
NestedConverterTest.ReadBigStreamWithExcessProps() line 63
--- End of stack trace from previous location ---
----- Inner Stack Trace -----
ThrowHelper.ThrowInvalidOperationException_CannotSkipOnPartial()
ObjectDefaultConverter`1.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
JsonResumableConverter`1.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
Converter.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options) line 32
JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
JsonCollectionConverter`2.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, TCollection& value)
JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)

Regression?

No response

Known Workarounds

The workaround is to use JsonSerializer.Deserialize instead of acquiring a converter from JsonSerializerOptions and calling JsonConverter.Read, but this workaround may be significantly slower (see the benchmark below). Workaround may be applied conditionally when Utf8JsonReader.IsFinalBlock is false.

Configuration

Reproduces from .NET 5 to .NET 7 preview 7. Older versions don't work either, but for other reasons.

Other information

Custom converters are always called with full current JSON entity buffered (see https://github.com/dotnet/runtime/issues/39795#issuecomment-663082176), and looks like ObjectDefaultConverter is aware of that and chooses "fast path" (if (!state.SupportContinuation && !state.Current.CanContainMetadata)), but Utf8JsonReader.Skip method fails anyway because it checks for _isFinalBlock which is false.

I think that this use case (to deserialize a part of JSON entity with default converter, inside another custom converter) is quite usual (for example I use it in my custom "known types" converter, tuple converter and others). The question is why not to use JsonSerializer.Deserialize. The answer is in the benchmark below. ReadCvtRead (with nested converter call) is about 1.5 times faster then ReadDeserialize (with JsonSerializer.Deserialize call).

Benchmark code ```cs using System; using System.Text.Json.Serialization; using System.Text.Json; using BenchmarkDotNet.Attributes; public class NestedConverterBenchmark { [Benchmark] public string WriteCvtWrite() { return JsonSerializer.Serialize(_model, _optionsCvt); } [Benchmark] public string WriteSerialize() { return JsonSerializer.Serialize(_model, _optionsSerializer); } [Benchmark] public MyClass ReadCvtRead() { return JsonSerializer.Deserialize>(_json, _optionsCvt)!; } [Benchmark] public MyClass ReadDeserialize() { return JsonSerializer.Deserialize>(_json, _optionsSerializer)!; } private static readonly JsonSerializerOptions _optionsCvt = new JsonSerializerOptions { Converters = { new CvtConverter() } }; private static readonly JsonSerializerOptions _optionsSerializer = new JsonSerializerOptions { Converters = { new SerializeConverter() } }; private static readonly MyClass _model = new MyClass { Value = new InnerClass { Prop = "prop-value" } }; private static readonly string _json = @"{""Prop"":""prop-value""}"; public class MyClass { public T? Value { get; set; } } public class InnerClass { public string? Prop { get; set; } } private class CvtConverter : JsonConverterFactory { public override JsonConverter? CreateConverter(Type typeToConvert, JsonSerializerOptions options) { var innerType = typeToConvert.GetGenericArguments()[0]; var innerConverter = options.GetConverter(innerType); return (JsonConverter)Activator.CreateInstance( typeof(Impl<>).MakeGenericType(innerType), innerConverter )!; } public override bool CanConvert(Type typeToConvert) { return typeToConvert.IsConstructedGenericType && typeToConvert.GetGenericTypeDefinition() == typeof(MyClass<>); } private class Impl : JsonConverter> { private readonly JsonConverter _innerConverter; public Impl(JsonConverter innerConverter) { _innerConverter = innerConverter; } public override MyClass? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) { return new MyClass { Value = _innerConverter.Read(ref reader, typeof(T), options)! }; } public override void Write(Utf8JsonWriter writer, MyClass value, JsonSerializerOptions options) { _innerConverter.Write(writer, value.Value!, options); } } } private class SerializeConverter : JsonConverter> { public override MyClass? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) { return new MyClass { Value = JsonSerializer.Deserialize(ref reader, options)! }; } public override void Write(Utf8JsonWriter writer, MyClass value, JsonSerializerOptions options) { JsonSerializer.Serialize(writer, value.Value, options); } } } ```
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
12th Gen Intel Core i5-12600K, 1 CPU, 16 logical and 10 physical cores
.NET SDK=6.0.400
  [Host]     : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT
  DefaultJob : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT

|          Method |     Mean |   Error |  StdDev |
|---------------- |---------:|--------:|--------:|
|   WriteCvtWrite | 183.3 ns | 2.39 ns | 2.23 ns |
|  WriteSerialize | 183.9 ns | 2.42 ns | 2.26 ns |
|     ReadCvtRead | 230.8 ns | 2.44 ns | 2.28 ns |
| ReadDeserialize | 364.1 ns | 4.15 ns | 3.88 ns |
ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis See info in area-owners.md if you want to be subscribed.

Issue Details
### Description Key points: - we have a custom `JsonConverter` - inside its `Read` method we call `Read` of another converter acquired with `JsonSerializerOptions.GetConverter(typeof(InnerClass))` - converter for `InnerClass` is an `ObjectDefaultConverter` - we asynchronously deserializing a JSON payload, that is quite large i.e. can not be placed in the single buffer - JSON input contains some objects that are meant to be deserialized into `InnerClass` with properties that are not represented in the InnerClass (excessive properties). ### Reproduction Steps
Minimal repro test ```cs using System; using System.IO; using System.Text; using System.Text.Json; using System.Text.Json.Serialization; using System.Threading.Tasks; using Xunit; public class NestedConverterTest { [JsonConverter(typeof(Converter))] private class MyClass { public InnerClass? Inner { get; set; } } private class InnerClass { public string? Value { get; set; } } private class Converter : JsonConverter { private JsonConverter GetConverter(JsonSerializerOptions options) { return (JsonConverter)options.GetConverter(typeof(InnerClass)); } public override MyClass? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) { var inner = GetConverter(options).Read(ref reader, typeof(InnerClass), options); return new MyClass { Inner = inner }; } public override void Write(Utf8JsonWriter writer, MyClass value, JsonSerializerOptions options) { GetConverter(options).Write(writer, value.Inner!, options); } } [Fact] public async Task ReadBigStreamWithExcessProps() { const int count = 1000; var stream = new MemoryStream(); stream.Write(Encoding.UTF8.GetBytes("[")); for (int i = 0; i < count; i++) { if (i != 0) { stream.Write(Encoding.UTF8.GetBytes(",")); } stream.Write(Encoding.UTF8.GetBytes(@"{""Missing"":""missing-value"",""Value"":""value""}")); } stream.Write(Encoding.UTF8.GetBytes("]")); stream.Position = 0; var result = await JsonSerializer.DeserializeAsync(stream); Assert.Equal(count, result!.Length); for (int i = 0; i < count; i++) { Assert.Equal("value", result[i].Inner?.Value); } } } ```
### Expected behavior Should successfully deserialize as it does with smaller input JSON or when there are no excessive properties in JSON. ### Actual behavior For .NET 6: ```text Message:  System.Text.Json.JsonException : The JSON value could not be converted to NestedConverterTest+MyClass. Path: $[0] | LineNumber: 0 | BytePositionInLine: 12. ---- System.InvalidOperationException : Cannot skip tokens on partial JSON. Either get the whole payload and create a Utf8JsonReader instance where isFinalBlock is true or call TrySkip. Stack Trace:  ThrowHelper.ReThrowWithPath(ReadStack& state, Utf8JsonReader& reader, Exception ex) JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state) JsonSerializer.ReadCore[TValue](JsonConverter jsonConverter, Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state) JsonSerializer.ReadCore[TValue](JsonReaderState& readerState, Boolean isFinalBlock, ReadOnlySpan`1 buffer, JsonSerializerOptions options, ReadStack& state, JsonConverter converterBase) JsonSerializer.ContinueDeserialize[TValue](ReadBufferState& bufferState, JsonReaderState& jsonReaderState, ReadStack& readStack, JsonConverter converter, JsonSerializerOptions options) JsonSerializer.ReadAllAsync[TValue](Stream utf8Json, JsonTypeInfo jsonTypeInfo, CancellationToken cancellationToken) NestedConverterTest.ReadBigStreamWithExcessProps() line 63 --- End of stack trace from previous location --- ----- Inner Stack Trace ----- ThrowHelper.ThrowInvalidOperationException_CannotSkipOnPartial() ObjectDefaultConverter`1.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value) JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value) JsonResumableConverter`1.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options) Converter.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options) line 32 JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value) JsonCollectionConverter`2.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, TCollection& value) JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value) JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state) ``` ### Regression? _No response_ ### Known Workarounds The workaround is to use `JsonSerializer.Deserialize` instead of acquiring a converter from `JsonSerializerOptions` and calling `JsonConverter.Read`, but this workaround may be significantly slower (see the benchmark below). ### Configuration Reproduces from .NET 5 to .NET 7 preview 7. Older versions don't work either, but for other reasons. ### Other information Custom converters are always called with full current JSON entity buffered (see https://github.com/dotnet/runtime/issues/39795#issuecomment-663082176), and looks like `ObjectDefaultConverter` is aware of that and [chooses "fast path"](https://github.com/dotnet/runtime/blob/f142128e89b63577a9bbba7e2b760ec82102a7a9/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Object/ObjectDefaultConverter.cs#L26) (`if (!state.SupportContinuation && !state.Current.CanContainMetadata)`), but `Utf8JsonReader.Skip` method fails anyway because [it checks](https://github.com/dotnet/runtime/blob/f142128e89b63577a9bbba7e2b760ec82102a7a9/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs#L312) for `_isFinalBlock` which is `false`. I think that this use case (to deserialize a part of JSON entity with default converter, inside another custom converter) is quite usual (for example I use it in my custom "known types" converter, tuple converter and others). The question is why not to use `JsonSerializer.Deserialize`. The answer is in the benchmark below. `ReadCvtRead` (with nested converter call) is about 1.5 times faster then `ReadDeserialize` (with `JsonSerializer.Deserialize` call).
Benchmark code ```cs using System; using System.Text.Json.Serialization; using System.Text.Json; using BenchmarkDotNet.Attributes; public class NestedConverterBenchmark { [Benchmark] public string WriteCvtWrite() { return JsonSerializer.Serialize(_model, _optionsCvt); } [Benchmark] public string WriteSerialize() { return JsonSerializer.Serialize(_model, _optionsSerializer); } [Benchmark] public MyClass ReadCvtRead() { return JsonSerializer.Deserialize>(_json, _optionsCvt)!; } [Benchmark] public MyClass ReadDeserialize() { return JsonSerializer.Deserialize>(_json, _optionsSerializer)!; } private static readonly JsonSerializerOptions _optionsCvt = new JsonSerializerOptions { Converters = { new CvtConverter() } }; private static readonly JsonSerializerOptions _optionsSerializer = new JsonSerializerOptions { Converters = { new SerializeConverter() } }; private static readonly MyClass _model = new MyClass { Value = new InnerClass { Prop = "prop-value" } }; private static readonly string _json = @"{""Prop"":""prop-value""}"; public class MyClass { public T? Value { get; set; } } public class InnerClass { public string? Prop { get; set; } } private class CvtConverter : JsonConverterFactory { public override JsonConverter? CreateConverter(Type typeToConvert, JsonSerializerOptions options) { var innerType = typeToConvert.GetGenericArguments()[0]; var innerConverter = options.GetConverter(innerType); return (JsonConverter)Activator.CreateInstance( typeof(Impl<>).MakeGenericType(innerType), innerConverter )!; } public override bool CanConvert(Type typeToConvert) { return typeToConvert.IsConstructedGenericType && typeToConvert.GetGenericTypeDefinition() == typeof(MyClass<>); } private class Impl : JsonConverter> { private readonly JsonConverter _innerConverter; public Impl(JsonConverter innerConverter) { _innerConverter = innerConverter; } public override MyClass? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) { return new MyClass { Value = _innerConverter.Read(ref reader, typeof(T), options)! }; } public override void Write(Utf8JsonWriter writer, MyClass value, JsonSerializerOptions options) { _innerConverter.Write(writer, value.Value!, options); } } } private class SerializeConverter : JsonConverter> { public override MyClass? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) { return new MyClass { Value = JsonSerializer.Deserialize(ref reader, options)! }; } public override void Write(Utf8JsonWriter writer, MyClass value, JsonSerializerOptions options) { JsonSerializer.Serialize(writer, value.Value, options); } } } ```
```text BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000 12th Gen Intel Core i5-12600K, 1 CPU, 16 logical and 10 physical cores .NET SDK=6.0.400 [Host] : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT DefaultJob : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT | Method | Mean | Error | StdDev | |---------------- |---------:|--------:|--------:| | WriteCvtWrite | 183.3 ns | 2.39 ns | 2.23 ns | | WriteSerialize | 183.9 ns | 2.42 ns | 2.26 ns | | ReadCvtRead | 230.8 ns | 2.44 ns | 2.28 ns | | ReadDeserialize | 364.1 ns | 4.15 ns | 3.88 ns | ```
Author: dlyz
Assignees: -
Labels: `area-System.Text.Json`, `untriaged`
Milestone: -
layomia commented 2 years ago

@dlyz do you have a repro project including a sample JSON payload that's failing, type graph, and the custom converters?

Per notes above, this doesn't seem like a regression (same behavior since .NET 5.0) so I'm marking 8.0 for now. FWIW we are considering a feature to provide custom async converters which could help with perf - https://github.com/dotnet/runtime/issues/63795.

ghost commented 2 years ago

This issue has been marked needs-author-action and may be missing some important information.

dlyz commented 2 years ago

@dlyz do you have a repro project including a sample JSON payload that's failing, type graph, and the custom converters?

Yes, I have it in the collapsed area under Reproduction Steps in the original post. Tried not to bloat the post, but looks like I overdid it =)

layomia commented 1 year ago

Not a regression but needs a look in .NET 8.

mansellan commented 1 year ago

Just wanted to say thanks to @dlyz - this issue and the benchmark code saved me a huge headache!

eiriktsarpalis commented 1 year ago

It seems that the serializer is failing to read ahead the entire value before passing to the custom converter in the context of collection. We should try to fix this.

eiriktsarpalis commented 1 year ago

I've pushed #89637 that should fix this for .NET 8. In the meantime you could work around the issue by using the JsonSerializer methods instead in your custom converter:

class Converter : JsonConverter<MyClass>
{
    public override MyClass? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        var inner = JsonSerializer.Deserialize<InnerClass>(ref reader, options);
        return new MyClass { Inner = inner };
    }

    public override void Write(Utf8JsonWriter writer, MyClass value, JsonSerializerOptions options)
        => JsonSerializer.Serialize(writer, value, options);
}

The above will ensure that the right initializations are performed for the Utf8JsonReader that is being passed to the serializer.