dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.18k stars 4.72k forks source link

System.Text.Json : Consider supporting F# discriminated unions #55744

Open eiriktsarpalis opened 3 years ago

eiriktsarpalis commented 3 years ago

Concerning support for discriminated unions, it might be worth pointing out that there is no one canonical way to encode DUs in JSON. DUs are used in many different ways in F# code, including the following:

In light of the above, I'm increasingly starting to think that System.Text.Json should not be providing a default mechanism for serializing DUs. Users would still be able to use available custom converters that provide the union encoding that suits their use case or perhaps use unions in their data contracts in a way that bypasses the serialization layer altogether.

Originally posted by @eiriktsarpalis in https://github.com/dotnet/runtime/issues/29812#issuecomment-878303600

If we do decide to support F# DUs in the future, it would likely be in the form of a publicly available converter factory à la JsonStringEnumConverter that configures one or more of the above alternative serialization formats.

ghost commented 3 years ago

Tagging subscribers to this area: @eiriktsarpalis, @layomia See info in area-owners.md if you want to be subscribed.

Issue Details
Concerning support for discriminated unions, it might be worth pointing out that there is no one canonical way to encode DUs in JSON. DUs are used in many different ways in F# code, including the following: - Single case DUs, e.g. `type Email = Email of string`, used to provide a type-safe wrapper for common values. A user might expect that `Email("email")` should serialize as its payload, `"email"`. - Type-safe enums, e.g. `type Suit = Heart | Spade | Diamond | Club`. Users might reasonably expect that `Spade` should serialize as the union case identifier, `"Spade"`. - Unions with arbitrary combinations of union cases and arities, e.g. `type Shape = Point | Circle of radius:float | Rectangle of width:float * length:float`. A value like `Circle(42.)` would require an encoding similar to `{ "shape" : "circle", "radius" : 42 }`. Users should be able to specify the case discriminator property name (`"shape"`) as well as the identifier for the union case (`Circle` mapping to `"circle"`). In light of the above, I'm increasingly starting to think that System.Text.Json should not be providing a default mechanism for serializing DUs. Users would still be able to use [available custom converters](https://github.com/Tarmil/FSharp.SystemTextJson/) that provide the union encoding that suits their use case or perhaps use unions in their data contracts in a way that [bypasses the serialization layer altogether](https://eiriktsarpalis.wordpress.com/2018/10/30/a-contract-pattern-for-schemaless-datastores/). _Originally posted by @eiriktsarpalis in https://github.com/dotnet/runtime/issues/29812#issuecomment-878303600_ If we do decide to support F# DUs in the future, it would likely be in the form of a publicly available converter factory à la [JsonStringEnumConverter](https://docs.microsoft.com/en-us/dotnet/api/system.text.json.serialization.jsonstringenumconverter?view=net-5.0) that configures one or more of the above alternative serialization formats.
Author: eiriktsarpalis
Assignees: -
Labels: `area-System.Text.Json`, `untriaged`
Milestone: -
eiriktsarpalis commented 3 years ago

Tagging @bartelink @zaid-ajaj @ninofloris who might have opinions on the matter.

Zaid-Ajaj commented 3 years ago

it might be worth pointing out that there is no one canonical way to encode DUs in JSON.

IMO just because there are multiple ways to represent DUs in JSON doesn't imply that a library shouldn't pick a default. I think many F# devs would appreciate a choice that works out of the box with the possibility of customization where required, not having to customize or having to understand library internals to get started with it.

This is the approach I implemented in Fable.Remoting.Json which is a Newtonsoft.Json converter for F# types without loss of information (possible to round trip). Given a Shape type like

type Shape = 
  | Point 
  | Circle of radius:float 
  | Rectangle of width:float * length:float

The following JSON is generated

Point => { "Point": [] } | "Point"
Circle(20.0) => { "Circle": [20.0] }
Rectangle(12.0, 10.0) => { "Rectangle": [12.0, 10.0] }  

of course there could be a setting to choose to write out the property names when provided

Circle(20.0) => { "Circle": { "radius": 20.0 } }
Rectangle(12.0, 10.0) => { "Rectangle": { "width": 12.0, "height": 10.0 } }
bartelink commented 3 years ago

I agree [with the OP] that there cannot and should not be a default canonical implementation; As FSharp.SystemTextJson and the default encoding provided by Newtonsoft.Json >=6 illustrate, there are many choices for how to represent things, and there is no reasonable default IMO.

I tend to take view that individual converters that do easy to describe things are the way to go. Thus I would ideally like to see the following in the box:

  1. TypeSafeEnumConverter: example unoptimized impl
  2. UnionConverter: IME the format you proposed in the OP ({ "discriminator": "value", <named fields from case as for a class/record, same for cases in tupled form> }) is common in the wild, and IME there are established implementations on the JVM, Swift and JS which work with that rendering format). See FsCodec UnionConverter impl. (There's also an OSS equivalent of this converter for Newtonsoft.Json that implements this scheme)
  3. I don't believe special case handling of single-case unions is something that should be supported
bartelink commented 3 years ago

Rectangle(12.0, 10.0) => { "Rectangle": [12.0, 10.0]}

While Newtonsoft.Json does this out of the box, I'd be very much against having this representation be a default. Furthermore, I think its a harmful thing to even put in the box in the first place; let me try to justify this:

In order to guard against things that IME happen in the wild, I'd propose the following rules:

Point => { "Point": [] } | "Point"

Again, IME, maintaining symmetry is valuable for nullary cases too. The main reason for this being that any consumer will be able to adapt to me adding a field to the case payload if it's rendered as { "shape": "point"}; one is simply adding a field to an object, versus transitioning from a string to an object.

I'd also mention another reason a canonical default handing is likely to run aground even if it was to be defined:

jhf commented 3 years ago

I agree with @bartelink that the compatibility of json encoded data can be a minefield, that requires attention when writing migrations, and when making API changes.

I still think that having a default canonical implementation provides a good user experience. I think it is too early to force the programmer understand and choose between serialisation strategies just because they wish to turn information into JSON.

Rather, when somebody has durable data, or needs to concern themselves with wire compatibility, then the way things are serialised may get important. In my case, when I upgrade durable data migrations, I transform the data, using the default canonical implementation, but I map from one type to another type, and write again using the default canonical implementation. For exposed API REST/json endpoints I either make sure it is a wire compatible change, or I make a new endpoint (probably with a version number).

If there is a large number of types involved, then the burden of having to specify a per type serialisation strategy, as a lot of extra work. Surely, yes, it will be more optimised, and possible upgrade proof, but since I'm writing migrations and making new endpoints, that point is moot, for me. Thus I'm forced to do more work, without any apparent benefit.

By having a default canonical implementation, and allowing overrides, it is possible to use json with a minimum amount of effort, and allow freedom when it has benefit.

By the way, what would be the developer experience when using Type Providers, such as https://fsprojects.github.io/SQLProvider/ when there is no default canonical implementation?

From my perspective, having a default canonical implementation, allows usage without con @bartelink

bartelink commented 2 years ago

@jhf @Zaid-Ajaj While I protested a lot, being able to have Unions convert without having to litter the code with Attributes is definitely something that's hard to give up once you've experienced it ;)

I implemented opt-in selection of such a policy https://github.com/jet/FsCodec/pull/69 - would appreciate any thoughts you might have and/or whether the fact that one can define an explicit encoding policy in ~200 LOC without the risk of forcing a necessarily opinionated encoding without people being aware

deyanp commented 2 years ago

What is the best (performance-wise) way currently (Jan 2022) to serialize/deserialize F# DUs which are "Type-safe enums" as per the initial post (not Single case DUs, no fields attached to the cases)? We use currently .NET Enums in our Api Dto layer because we had severe performance issues with DU serialization/deserialization performance in the past ... however we are missing the exhaustive match (at compilation) of the DUs ...

bartelink commented 2 years ago

@deyanp I've never benchmarked, but putting [<JsonConverter(TypeSafeEnumConverter<Type>) shoudl work and work well. The key lookup of the F# type info is cached, and I'd expect the determination of whether to apply the converter to be managed efficiently in STJ, i.e. only call CanConvert once

So I'd say its worth a benchmark - I'd be surprised if its in the same realm of perf as enums but can't imagine it being a disaster.

(would accept the benchmark as a PR if you see fit)

deyanp commented 2 years ago

@bartelink , hmm, I think I am doing something wrong, because the benchmark Enum <> DU brings very similar results (even though the code I saw in FsCodec.SystemTextJson is full of reflection and stuff):

Serialization:

// * Summary *

BenchmarkDotNet=v0.13.1, OS=ubuntu 21.10
Intel Core i9-10885H CPU 2.40GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.101
  [Host]   : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT DEBUG
  .NET 6.0 : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT

Job=.NET 6.0  Runtime=.NET 6.0  

| Method | size |     Mean |   Error |  StdDev |
|------- |----- |---------:|--------:|--------:|
|  Enum1 | 1000 | 261.0 ns | 1.03 ns | 0.91 ns |
|    DU1 | 1000 | 267.7 ns | 1.11 ns | 1.03 ns |
|  Enum5 | 1000 | 446.6 ns | 6.80 ns | 6.36 ns |
|    DU5 | 1000 | 442.6 ns | 1.84 ns | 1.72 ns |
| Enum10 | 1000 | 439.9 ns | 1.40 ns | 1.24 ns |
|   DU10 | 1000 | 462.6 ns | 1.35 ns | 1.13 ns |

// * Hints *
Outliers
  SerBenchmarks.Enum1: .NET 6.0  -> 1 outlier  was  removed (270.00 ns)
  SerBenchmarks.Enum10: .NET 6.0 -> 1 outlier  was  removed (444.64 ns)
  SerBenchmarks.DU10: .NET 6.0   -> 2 outliers were removed (466.17 ns, 466.62 ns)

// * Legends *
  size   : Value of the 'size' parameter
  Mean   : Arithmetic mean of all measurements
  Error  : Half of 99.9% confidence interval
  StdDev : Standard deviation of all measurements
  1 ns   : 1 Nanosecond (0.000000001 sec)

Deserialization:

// * Summary *

BenchmarkDotNet=v0.13.1, OS=ubuntu 21.10
Intel Core i9-10885H CPU 2.40GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.101
  [Host]   : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT DEBUG
  .NET 6.0 : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT

Job=.NET 6.0  Runtime=.NET 6.0  

| Method | size |       Mean |   Error |  StdDev |
|------- |----- |-----------:|--------:|--------:|
|  Enum1 | 1000 |   471.6 ns | 2.31 ns | 2.05 ns |
|    DU1 | 1000 |   477.6 ns | 1.56 ns | 1.38 ns |
|  Enum5 | 1000 |   922.0 ns | 3.76 ns | 3.33 ns |
|    DU5 | 1000 |   953.3 ns | 6.58 ns | 5.83 ns |
| Enum10 | 1000 | 1,432.8 ns | 5.06 ns | 4.74 ns |
|   DU10 | 1000 | 1,429.0 ns | 7.43 ns | 6.95 ns |

// * Hints *
Outliers
  DeserBenchmarks.Enum1: .NET 6.0 -> 1 outlier  was  removed (488.89 ns)
  DeserBenchmarks.DU1: .NET 6.0   -> 1 outlier  was  removed (484.40 ns)
  DeserBenchmarks.Enum5: .NET 6.0 -> 1 outlier  was  removed (929.99 ns)
  DeserBenchmarks.DU5: .NET 6.0   -> 1 outlier  was  removed (995.93 ns)

// * Legends *
  size   : Value of the 'size' parameter
  Mean   : Arithmetic mean of all measurements
  Error  : Half of 99.9% confidence interval
  StdDev : Standard deviation of all measurements
  1 ns   : 1 Nanosecond (0.000000001 sec

Code (as F# Console app, as FSX give error [^1]

// #r "nuget: FsCodec.SystemTextJson, 2.3.0"
// #r "nuget: BenchmarkDotNet, 0.13.1"

open System
open System.Globalization
open System.Text.Json
open System.Text.Json.Serialization
open FsCodec.SystemTextJson
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open BenchmarkDotNet.Jobs

type AccountingEntryTypeEnum = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum2 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum3 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum4 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum5 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum6 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum7 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum8 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum9 = 
| Debit = 0
| Credit = 1
type AccountingEntryTypeEnum10 = 
| Debit = 0
| Credit = 1

type AccountingEntryEnum1 = {
    Id : string
    Type : AccountingEntryTypeEnum
}

type AccountingEntryEnum5 = {
    Id : string
    Type : AccountingEntryTypeEnum
    Type2 : AccountingEntryTypeEnum2
    Type3 : AccountingEntryTypeEnum3
    Type4 : AccountingEntryTypeEnum4
    Type5 : AccountingEntryTypeEnum5
}

type AccountingEntryEnum10 = {
    Id : string
    Type : AccountingEntryTypeEnum
    Type2 : AccountingEntryTypeEnum2
    Type3 : AccountingEntryTypeEnum3
    Type4 : AccountingEntryTypeEnum4
    Type5 : AccountingEntryTypeEnum5
    Type6 : AccountingEntryTypeEnum6
    Type7 : AccountingEntryTypeEnum7
    Type8 : AccountingEntryTypeEnum8
    Type9 : AccountingEntryTypeEnum9
    Type10 : AccountingEntryTypeEnum10
}

[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU>>)>]
type AccountingEntryTypeDU = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU2>>)>]
type AccountingEntryTypeDU2 = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU3>>)>]
type AccountingEntryTypeDU3 = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU4>>)>]
type AccountingEntryTypeDU4 = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU5>>)>]
type AccountingEntryTypeDU5 = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU6>>)>]
type AccountingEntryTypeDU6= 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU7>>)>]
type AccountingEntryTypeDU7 = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU8>>)>]
type AccountingEntryTypeDU8 = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU9>>)>]
type AccountingEntryTypeDU9 = 
| Debit
| Credit
[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU10>>)>]
type AccountingEntryTypeDU10 = 
| Debit
| Credit

type AccountingEntryDU1 = {
    Id : string
    Type : AccountingEntryTypeDU
}

type AccountingEntryDU5 = {
    Id : string
    Type : AccountingEntryTypeDU
    Type2 : AccountingEntryTypeDU2
    Type3 : AccountingEntryTypeDU3
    Type4 : AccountingEntryTypeDU4
    Type5 : AccountingEntryTypeDU5
}

type AccountingEntryDU10 = {
    Id : string
    Type : AccountingEntryTypeDU
    Type2 : AccountingEntryTypeDU2
    Type3 : AccountingEntryTypeDU3
    Type4 : AccountingEntryTypeDU4
    Type5 : AccountingEntryTypeDU5
    Type6 : AccountingEntryTypeDU6
    Type7 : AccountingEntryTypeDU7
    Type8 : AccountingEntryTypeDU8
    Type9 : AccountingEntryTypeDU9
    Type10 : AccountingEntryTypeDU10
}

let serOptions = JsonSerializerOptions()
serOptions.DefaultIgnoreCondition <- JsonIgnoreCondition.WhenWritingNull
serOptions.PropertyNamingPolicy <- JsonNamingPolicy.CamelCase
serOptions.WriteIndented <- true // not for prod maybe!?
serOptions.Converters.Add(JsonStringEnumConverter())

let deserOptions = JsonSerializerOptions()
deserOptions.PropertyNamingPolicy <- JsonNamingPolicy.CamelCase
deserOptions.Converters.Add(JsonStringEnumConverter())

[<SimpleJob (RuntimeMoniker.Net60)>]
type SerBenchmarks() =
    // [<Params(100, 1000, 10000, 100000, 1000000)>]
    [<Params(1000)>]
    member val size = 0 with get, set
    member val x1: AccountingEntryEnum1 = { 
            Id = "aa"
            Type = AccountingEntryTypeEnum.Credit 
        }    
    member val x5: AccountingEntryEnum5 = { 
            Id = "aa"
            Type = AccountingEntryTypeEnum.Credit 
            Type2 = AccountingEntryTypeEnum2.Credit 
            Type3 = AccountingEntryTypeEnum3.Credit 
            Type4 = AccountingEntryTypeEnum4.Credit 
            Type5 = AccountingEntryTypeEnum5.Credit 
        }    
    member val x10 : AccountingEntryEnum10 = { 
            Id = "aa"
            Type = AccountingEntryTypeEnum.Credit 
            Type2 = AccountingEntryTypeEnum2.Credit 
            Type3 = AccountingEntryTypeEnum3.Credit 
            Type4 = AccountingEntryTypeEnum4.Credit 
            Type5 = AccountingEntryTypeEnum5.Credit 
            Type6 = AccountingEntryTypeEnum6.Credit 
            Type7 = AccountingEntryTypeEnum7.Credit 
            Type8 = AccountingEntryTypeEnum8.Credit 
            Type9 = AccountingEntryTypeEnum9.Credit 
            Type10 = AccountingEntryTypeEnum10.Credit 
        }   

    [<Benchmark>]
    member this.Enum1 () = 
        JsonSerializer.Serialize(this.x1, serOptions)

    [<Benchmark>]
    member this.DU1 () = 
        JsonSerializer.Serialize(this.x1, serOptions)

    [<Benchmark>]
    member this.Enum5 () = 
        JsonSerializer.Serialize(this.x5, serOptions)

    [<Benchmark>]
    member this.DU5 () = 
        JsonSerializer.Serialize(this.x5, serOptions)

    [<Benchmark>]
    member this.Enum10 () = 
        JsonSerializer.Serialize(this.x5, serOptions)

    [<Benchmark>]
    member this.DU10 () = 
        JsonSerializer.Serialize(this.x5, serOptions)

BenchmarkRunner.Run<SerBenchmarks>() |> ignore

[<SimpleJob (RuntimeMoniker.Net60)>]
type DeserBenchmarks() =
    // [<Params(100, 1000, 10000, 100000, 1000000)>]
    [<Params(1000)>]
    member val size = 0 with get, set

    member val xs1 : string = """{
    "id": "aa",
    "type": "Credit"
}"""
    member val xs5 : string = """{
    "id": "aa",
    "type": "Credit",
    "type2": "Credit",
    "type3": "Credit",
    "type4": "Credit",
    "type5": "Credit"
}"""
    member val xs10 : string = """{
    "id": "aa",
    "type": "Credit",
    "type2": "Credit",
    "type3": "Credit",
    "type4": "Credit",
    "type5": "Credit",
    "type6": "Credit",
    "type7": "Credit",
    "type8": "Credit",
    "type9": "Credit",
    "type10": "Credit"
}"""

    [<Benchmark>]
    member this.Enum1 () = 
        JsonSerializer.Deserialize<AccountingEntryEnum1>(this.xs1, deserOptions)

    [<Benchmark>]
    member this.DU1 () = 
        JsonSerializer.Deserialize<AccountingEntryDU1>(this.xs1, deserOptions)

    [<Benchmark>]
    member this.Enum5 () = 
        JsonSerializer.Deserialize<AccountingEntryEnum5>(this.xs5, deserOptions)

    [<Benchmark>]
    member this.DU5 () = 
        JsonSerializer.Deserialize<AccountingEntryDU5>(this.xs5, deserOptions)

    [<Benchmark>]
    member this.Enum10 () = 
        JsonSerializer.Deserialize<AccountingEntryEnum10>(this.xs10, deserOptions)

    [<Benchmark>]
    member this.DU10 () = 
        JsonSerializer.Deserialize<AccountingEntryDU10>(this.xs10, deserOptions)

BenchmarkRunner.Run<DeserBenchmarks>() |> ignore

[^1] Error

System.IO.FileLoadException: Could not load file or assembly 'FsCodec.SystemTextJson, Version=2.0.0.0, Culture=neutral, PublicKeyToken=null'. Operation is not supported. (0x80131515)
File name: 'FsCodec.SystemTextJson, Version=2.0.0.0, Culture=neutral, PublicKeyToken=null'
 ---> System.NotSupportedException: The invoked member is not supported in a dynamic assembly.
   at System.Reflection.Emit.InternalAssemblyBuilder.get_Location()
   at System.Reflection.Assembly.LoadFromResolveHandler(Object sender, ResolveEventArgs args)
   at System.Runtime.Loader.AssemblyLoadContext.InvokeResolveEvent(ResolveEventHandler eventHandler, RuntimeAssembly assembly, String name)
   at System.Runtime.Loader.AssemblyLoadContext.OnAssemblyResolve(RuntimeAssembly assembly, String assemblyFullName)
   at System.Reflection.CustomAttribute._CreateCaObject(RuntimeModule pModule, RuntimeType type, IRuntimeMethodInfo pCtor, Byte** ppBlob, Byte* pEndBlob, Int32* pcNamedArgs)
   at System.Reflection.CustomAttribute.CreateCaObject(RuntimeModule module, RuntimeType type, IRuntimeMethodInfo ctor, IntPtr& blob, IntPtr blobEnd, Int32& namedArgs)
   at System.Reflection.CustomAttribute.AddCustomAttributes(ListBuilder`1& attributes, RuntimeModule decoratedModule, Int32 decoratedMetadataToken, RuntimeType attributeFilterType, Boolean mustBeInheritable, ListBuilder`1 derivedAttributes)
   at System.Reflection.CustomAttribute.GetCustomAttributes(RuntimeModule decoratedModule, Int32 decoratedMetadataToken, Int32 pcaCount, RuntimeType attributeFilterType)
   at System.Reflection.CustomAttribute.GetCustomAttributes(RuntimeType type, RuntimeType caType, Boolean inherit)
   at System.RuntimeType.GetCustomAttributes(Type attributeType, Boolean inherit)
   at System.Text.Json.JsonSerializerOptions.GetConverterInternal(Type typeToConvert)
   at System.Text.Json.JsonSerializerOptions.DetermineConverter(Type parentClassType, Type runtimePropertyType, MemberInfo memberInfo)
   at System.Text.Json.Serialization.Metadata.JsonTypeInfo.GetConverter(Type type, Type parentClassType, MemberInfo memberInfo, Type& runtimeType, JsonSerializerOptions options)
   at System.Text.Json.Serialization.Metadata.JsonTypeInfo.AddProperty(MemberInfo memberInfo, Type memberType, Type parentClassType, Boolean isVirtual, Nullable`1 parentTypeNumberHandling, JsonSerializerOptions options)
   at System.Text.Json.Serialization.Metadata.JsonTypeInfo.CacheMember(Type declaringType, Type memberType, MemberInfo memberInfo, Boolean isVirtual, Nullable`1 typeNumberHandling, Boolean& propertyOrderSpecified, Dictionary`2& ignoredMembers)
   at System.Text.Json.Serialization.Metadata.JsonTypeInfo..ctor(Type type, JsonConverter converter, Type runtimeType, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializerOptions.<InitializeForReflectionSerializer>g__CreateJsonTypeInfo|112_0(Type type, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializerOptions.GetClassFromContextOrCreate(Type type)
   at System.Text.Json.JsonSerializerOptions.GetOrAddClass(Type type)
   at System.Text.Json.JsonSerializer.GetTypeInfo(JsonSerializerOptions options, Type runtimeType)
   at System.Text.Json.JsonSerializer.Serialize[TValue](TValue value, JsonSerializerOptions options)
   at <StartupCode$FSI_0003>.$FSI_0003.main@()
Stopped due to error
bartelink commented 2 years ago

I can't see anything blatantly wrong; note the memoize in the code I linked does mean that the reflection will only happen once per type.

One thing though: doing -c Release and/or otherwise turning on all optimizations is important for microbenchmarking

One other thing to look at might be [<Struct>] DUs - that'll remove some allocations as it will allow the values to be embedded in the object as enum values would be

The other thing is that for the DU cases, you could use Options without the JsonStringEnumConverter in the mix (but that won't win much as these sorts of things are cached in STJ)

deyanp commented 2 years ago

@bartelink, I did run with the stuff with sudo dotnet run -c Release (without sudo BencharkDotNet was complaing about sth).

I will re-run with [<Struct>] slapped on top of all DUs, thanks!

One question - I assume there is no other (centralized) way but having this attribute on all DUs, right?

[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU10>>)>]

Even the type must be specified (it seems that is in contrast to NewtonSoft.Json) ... I was hoping for something like this and without all these attributes:

serOptions.Converters.Add(JsonConverter(typeof<TypeSafeEnumConverter<_>>))

but I guess that is not possible?

bartelink commented 2 years ago

Re Debug stuff, I see

[Host] : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT DEBUG

One question - I assume there is no other (centralized) way but having this attribute on all DUs, right?

There's a factory in FsCodec that enables this via the FsCodec.SystemTextJson.Options ctor: https://github.com/jet/FsCodec/blob/master/src/FsCodec.SystemTextJson/Options.fs#L53-L60

Note that this also switches on the automatic application of UnionConverter too, which may not be to your taste - would accept a PR to have separate autoUnion and autoTypeSafeEnum options, with the latter only opting into Tye Safe Enums with no values

deyanp commented 2 years ago

@bartelink , there was actually an error in my benchmarks, I was comparing Enum Serilization to Enum Serialization ;)

"Correct" results (until proven otherwise):

Serialization - I see 10-50% overhead of DUs compared to Enums ... image

Deserialization - I see 5-10% overhead of DUs compared to Enums ... image

This [Host] : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT DEBUG I do not know how to eliminate, I am running sudo dotnet run -c Release ... I even put <Optimize>true</Optimize> tag in fsproj, but I don't know of anything else causing this Debug Host ...

For autoTypeSafeEnum I would need to split/duplicate the UnionOrTypeSafeEnumConverterFactory as well though - is this what you suggested?

bartelink commented 2 years ago

Ah; cool you found the discrepancy ;) I'm sure the perf can be improved, but I guess its not bad as it is.

or autoTypeSafeEnum I would need to split/duplicate the UnionOrTypeSafeEnumConverterFactory as well though - is this what you suggested?

See the issue I wrote about this (I originally assumed this issue was in the FsCodec repo when I responded to your first question; best to take this off to the side!)

(ASIDE: I'd love to know if there are any outline plans/designs for how any prospective support in .NET 7 might work out - I'd like to align with that if at all possible)

eiriktsarpalis commented 2 years ago

I'd love to know if there are any outline plans/designs for how any prospective support in .NET 7 might work out

No concrete plans for the moment (it's not clear we'll have the bandwidth to pull it off in time for 7). Ideally though it should be possible to build it on top of the infrastructure to be introduced by #63747.

eiriktsarpalis commented 2 years ago

I had initially hoped that it would be possible to implement support for unions using the infrastructure from polymorphism (#63747), however on closer inspection of F# union codegen, it turns out that this is not as simple as I had originally thought for a number of reasons:

Based on the above, there appear to be two possible approaches we could follow if we decide to implement support for unions in the future:

  1. Implement union serialization using a custom converter. This is how Json.NET does it, but notably this is expensive to implement in STJ if we want to support all STJ features like async serialization (a functionally correct implementation would need to replicate the implementation of the internal object converter). I don't think we should follow this approach, since it wouldn't be feasible to keep features in sync and fully tested.
  2. Use the contract customization model (proposed in #63686). It might be possible support union deserialization using a customized parameterized constructor delegate that instantiates the correct union case based on tag metadata. Prototyping is necessary to evaluate feasibility.
KenBonny commented 8 months ago

Is this still being investigated and worked on? As I'm quite interested in seeing STJ support F# DU's.

eiriktsarpalis commented 8 months ago

It is not being worked on currently. We will update this issue as soon as something changes.

YkTru commented 6 months ago

@eiriktsarpalis

1- Do you think it will ever be possible? I mean, should we keep hope or do all by hand?

2- While waiting: Would you recommend Wlaschin DTO approach?

2a- Else what would be your recommend, preferred approach in most cases?

Thank you

bartelink commented 6 months ago

@YkTru why do it by hand when there are two perfectly viable answers to 2a:

eiriktsarpalis commented 6 months ago

Using a third-party option is perfectly fine. One thing to note about any custom converter is that it necessarily loses the ability to do streaming serialization (only because streaming converters are internal for now).

In other words, attempting to serialize something large like

type MyUnion = | Values of int []

JsonSerializer.SerializeAsync(stream, Values [1 .. 1_000_000])

would necessarily result in the entire payload being buffered by the serializer. That shouldn't matter much as long as you're restricted to small-ish values.