dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.25k stars 4.73k forks source link

Allow custom converters for dictionary keys #46520

Closed NN--- closed 3 years ago

NN--- commented 3 years ago

Edited by @layomia.

Original post by @NN--- (click to view) ### Description JsonConverterAttribute is considered when used in a collection or as Dictionary value. However, it is not considered when it is used as Dictionary key. ```cs using System; using System.Collections.Generic; using System.Text.Json; using System.Text.Json.Serialization; public class OptionConverter : JsonConverter

Today custom converters provided for types that appear in input graphs as dictionary keys (mostly primitives like string, numeric types, guids etc) cannot be used to handle dictionary keys. This manifests as NotSupportedException being thrown by the serializer when a custom converter is used for these primitive types & and the types are serialized as dictionary keys. We should provide API to override the NSE-throwing behavior and allow custom converters to handle dictionary keys.

In .NET 5, the (de)serialization of dictionary keys was handled entirely by the serializer even when custom converters were provided for the dictionary key types. This behavior was broken earlier in the .NET 6 timeline as a known side effect of internal STJ infrastructure changes. Now, custom converters are also invoked for dictionary keys, but the mechanism to support them is internal and defaults to throwing NotSupportedException for all converters that are not internal in STJ & are supported as dictionary keys.

API Proposal

This involves exposing the following internal virtual methods in the following form:

public abstract class JsonConverter<T> : JsonConverter
{
  public virtual bool HandleNull => throw null;

  public abstract T? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options);

  public abstract void Write(Utf8JsonWriter writer, T value, JsonSerializerOptions options);

+ protected virtual T? ReadFromPropertyName(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) => throw null;

+ protected virtual void WriteToPropertyName(Utf8JsonWriter writer, T value, JsonSerializerOptions options) { }
}

Notes

API usage

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace Test
{
    class Program
    {
        internal sealed class CustomStringConverter : JsonConverter<string>
        {
            public override string Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
            {   
            Debug.Assert(reader.TokenType == JsonTokenType.String);
                return reader.GetString();
        }

            public override void Write(Utf8JsonWriter writer, string value, JsonSerializerOptions options)
                => writer.WriteStringValue(value);

            public override string ReadFromPropertyName(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
            {
                Debug.Assert(reader.TokenType == JsonTokenType.PropertyName);
                return reader.GetString();
            }

            protected override void WriteToPropertyName(Utf8JsonWriter writer, string value, JsonSerializerOptions options)
                => writer.WritePropertyName(value);
        }

        static void Main()
        {
            JsonSerializerOptions  options = new() { Converters = { new CustomStringConverter() } };
            Dictionary<string, string> value = new() { ["key"] = "value" };

            string json = JsonSerializer.Serialize(value, options);
            Console.WriteLine(json); // {"key":"value"}
        }
    }
}
layomia commented 3 years ago

The larger feature here is allowing custom converters for dictionary keys. This was discussed as part of the design for allowing non-string dictionary keys in .NET 5 - https://github.com/dotnet/runtime/pull/32676. The most likely route to achieving this is exposing these internal methods on JsonConverter<T>:

https://github.com/dotnet/runtime/blob/3116c4d1d52d59e402a0b420853b3b26ea6405eb/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonConverterOfT.cs#L542-L546

cc @Jozkee

layomia commented 3 years ago

Moving this to future for now, but I'm ready to react if anyone needs this as a result of https://github.com/dotnet/runtime/pull/50074.

orcun commented 3 years ago

Exposing these virtual functions is what I exactly need. My converter is actually doing just that already and it was frustrating to find this out. Converting a type to string is pretty useful and I assume very common. Would love to see that on 6.0.0 🙏

layomia commented 3 years ago

From @martincostello in https://github.com/dotnet/runtime/issues/53241. This feature should provide a reasonable mitigation for the breaking change described in that issue.

Description

In a line-of-business application we have the following custom JsonConverter for strings used to remove "bad" Unicode from strings in serialized JSON.

internal sealed class BadUnicodeRemovingJsonConverter : JsonConverter<string>
{
    public override string Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => reader.GetString();

    public override void Write(Utf8JsonWriter writer, string value, JsonSerializerOptions options)
    {
        static string RemoveBadUnicode(string input, string pattern)
        {
            return Regex.Replace(input, pattern, string.Empty, RegexOptions.IgnoreCase);
        }

        string sanitized = RemoveBadUnicode(value, "[\u0000-\u0009]");
        sanitized = RemoveBadUnicode(sanitized, "[\u000b-\u000c]");
        sanitized = RemoveBadUnicode(sanitized, "[\u000e-\u001f]");

        writer.WriteStringValue(sanitized);
    }
}

After updating the application to .NET 6 preview 4, a NotSupportedException is thrown with a message similar to:

System.NotSupportedException: The type 'System.String' is not a supported dictionary key using converter of type 'MyApp.BadUnicodeRemovingJsonConverter'. Path: $.Foo.Bar.

For the specific type being serialized, $.Foo.Bar is an IDictionary<string, string> property Bar on a custom type Foo, which is in turn a property of another custom type at the root.

public class RootObject
{
    public ChildObject Foo { get; set; }
}

public class ChildObject
{
    public IDictionary<string, string> Bar { get; set; }
}

The object is serialized with the following settings:

var options = new JsonSerializerOptions
{
    WriteIndented = false,
};

options.Converters.Add(new BadUnicodeRemovingJsonConverter());

string json = JsonSerializer.Serialize(rootObject, options);

Configuration

.NET 6 preview 4 (SDK version 6.0.100-preview.4.21255.9)

Regression?

Yes, this code works today with .NET 5.0.6. The issue was uncovered by the application's tests when updated to .NET 6.0 preview 4. The issue was also not present in previous previews of .NET 6.

Other information

    System.NotSupportedException: The type 'System.String' is not a supported dictionary key using converter of type 'MyApp.BadUnicodeRemovingJsonConverter'. Path: $.Foo.Bar.
     ---> System.NotSupportedException: The type 'System.String' is not a supported dictionary key using converter of type 'MyApp.BadUnicodeRemovingJsonConverter'.
       at System.Text.Json.ThrowHelper.ThrowNotSupportedException_DictionaryKeyTypeNotSupported(Type keyType, JsonConverter converter) in System.Text.Json.dll:token 0x60000f2+0x16
       at System.Text.Json.Serialization.JsonConverter`1.WriteWithQuotes(Utf8JsonWriter writer, T value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x60006fd+0x0
       at System.Text.Json.Serialization.Converters.IDictionaryOfTKeyTValueConverter`3.OnWriteResume(Utf8JsonWriter writer, TCollection value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x6000868+0xc3
       at System.Text.Json.Serialization.Converters.DictionaryDefaultConverter`3.OnTryWrite(Utf8JsonWriter writer, TCollection dictionary, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x6000840+0x70
       at System.Text.Json.Serialization.JsonConverter`1.TryWrite(Utf8JsonWriter writer, T& value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x60006f6+0x206
       at System.Text.Json.Serialization.Metadata.JsonPropertyInfo`1.GetMemberAndWriteJson(Object obj, WriteStack& state, Utf8JsonWriter writer) in System.Text.Json.dll:token 0x60007b7+0x130
       at System.Text.Json.Serialization.Converters.ObjectDefaultConverter`1.OnTryWrite(Utf8JsonWriter writer, T value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x60008d8+0xa7
       at System.Text.Json.Serialization.JsonConverter`1.TryWrite(Utf8JsonWriter writer, T& value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x60006f6+0x206
       at System.Text.Json.Serialization.Metadata.JsonPropertyInfo`1.GetMemberAndWriteJson(Object obj, WriteStack& state, Utf8JsonWriter writer) in System.Text.Json.dll:token 0x60007b7+0x130
       at System.Text.Json.Serialization.Converters.ObjectDefaultConverter`1.OnTryWrite(Utf8JsonWriter writer, T value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x60008d8+0xa7
       at System.Text.Json.Serialization.JsonConverter`1.TryWrite(Utf8JsonWriter writer, T& value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x60006f6+0x206
       at System.Text.Json.Serialization.JsonConverter`1.WriteCore(Utf8JsonWriter writer, T& value, JsonSerializerOptions options, WriteStack& state) in System.Text.Json.dll:token 0x60006e1+0x0
       --- End of inner exception stack trace ---
eiriktsarpalis commented 3 years ago

I would propose renaming the methods to something that better conveys their purpose, e.g. WriteAsDictionaryKey/ReadAsDictionaryKey.

Fydar commented 3 years ago

I would propose renaming the methods to something that better conveys their purpose, e.g. WriteAsDictionaryKey/ReadAsDictionaryKey.

I would also argue that ReadAsDictionaryKey would be imply that it's reading the current token as a dictionary key type; rather than reading a dictionary key as an object. This would be simply fixed by using ReadFromDictionaryKey instead.

I would agree though, there is potential for a renaming. I would suggest any from the following:

layomia commented 3 years ago

@eiriktsarpalis @Fydar thanks. Read/WriteAsDictionary or similar better convey the intent of those APIs. Will update.

terrajobst commented 3 years ago

Video

namespace System.Text.Json.Serialization
{
    public partial class JsonConverter<T>
    {
        protected virtual T ReadAsDictionaryKey(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options);
        protected virtual void WriteAsDictionaryKey(Utf8JsonWriter writer, [DisallowNull] T value, JsonSerializerOptions options);
    }
}
eiriktsarpalis commented 3 years ago

We cannot have null values as dictionary keys, so the HandleNull property is not consulted when invoking the Read/WriteAsDictionaryKey methods. For the same reason, wrt to nullability, we have a signature of T ReadAsDictionaryKey(...) and not T? ReadAsDictionaryKey.... Similarly we [DisallowNull] for the input T in the corresponding write method.

It might be worth pointing out that not all IDictionary implementations guarantee that keys cannot be null. While all System.Collections implementations I know of do enforce this invariant, it is possible to have a dictionary implementation that does support null keys (for instance it could be relying on EqualityComparer<TKey>.Default which supports null). This is precisely what F# maps are doing and I wouldn't be too surprised if other third party implementations did the same thing. If anything, this would suggest to me that this feature would also cater to users that need a way to serialize null keys:

public class MyStringConverter : JsonConverter<string>
{
    public override void WriteToPropertyName(Utf8JsonWriter writer, string value, JsonSerializerOptions options)
    {
        writer.WritePropertyName(value ?? "LOL my dictionary has a null key");
    }
}

We should consider a generic "to string" conversion so that the serializer can do The Right Thing

On closer inspection I tend to agree with @layomia's remarks that materializing the string might have important performance ramifications. I temporarily considered the possibility of an API that converts to ReadOnlySpan<byte>, but ultimately I think this pushes too much complexity on implementers without necessarily being better than the original API proposal. Therefore I think we should just stick to the original proposal, modulo naming:

public class JsonConverter<T>
{
+    protected virtual T? ReadFromPropertyName(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options);
+    protected virtual void WriteToPropertyName(Utf8JsonWriter writer, T value, JsonSerializerOptions options);
}
steveharter commented 3 years ago

Per offline discussion, if we can fix the regression for all key types (not just string) with minimal impact to the trimmed STJ.dll then that should be sufficient for V6 especially given the release schedule and priority.

The new APIs to read\write dictionary keys as strings have low priority since there is no known feedback where these new APIs are needed and\or blocking, although there are valid scenarios for them.

kirsanium commented 3 years ago

@steveharter I am following this issue here because it is really blocking for my company - we have some business logic rely on that. It messes with our code and there is no workaround atm.

eiriktsarpalis commented 3 years ago

Hi @kirsanium, would your company be unblocked if #53241 were addressed, or does this specifically concern adding custom key converter support?

kirsanium commented 3 years ago

@eiriktsarpalis, it's just about JsonConverter<string>

bartonjs commented 3 years ago
namespace System.Text.Json.Serialization
{
    partial class JsonConverter<T>
    {
        protected virtual T? ReadAsPropertyName(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) => throw null;
        protected virtual void WriteAsPropertyName(Utf8JsonWriter writer, T value, JsonSerializerOptions options) { }
    }
}