Tarmil / FSharp.SystemTextJson

System.Text.Json extensions for F# types
MIT License
325 stars 44 forks source link

Allow distinguishing JsonUnionEncoding.Untagged cases with some (but not total) overlap in field names #65

Closed rmunn closed 3 years ago

rmunn commented 3 years ago

Use case: I want to be able to serialize and deserialize the following union case:

[<JsonUnionEncoding.Untagged>]
type ListOperation =
    | AddOp of list: string * add: string list
    | RemoveOp of list: string * remove: string list

into JSON that looks like this:

{ list: "numbers", add: ["one", "two"] }
-or-
{ list: "numbers", remove: ["one"] }

Serialization looks like I'd expect it to look, but deserializing produces the error "Union ListOperation can't be deserialized as Untagged because it has duplicate field names across unions". The list name is shared between union cases, but the names add and remove are distinct, and should (I thought) be enough to distinguish between those two cases.

Is this a known limitation, or is what I'm trying to do something that should be possible?

P.S. That's a simplified representation of what I'm trying to do; my actual data structure is more complex, involving several other fields, but this is the minimal reproducible example.

rmunn commented 3 years ago

Note that it's the shape of the JSON that I'm most interested in. I've tried other ways of representing that JSON in F#, such as:

type ListOperation = {
    list : string
    op : ListOperationUnion
}

where ListOperationUnion is | add of string list | remove of string list. But this produces JSON that looks like:

{ list: "numbers", op: { add: ["one", "two"] } }

And I'd really like to not have the extra op field in my API's shape.

Tarmil commented 3 years ago

It is a known limitation. What happens is that System.Text.Json deserialization doesn't work on an AST but in a single pass on the input stream. So to be able to determine which case it is, the JSON being parsed must have a non-overlapping field as its first field. To be able to round-trip, we would need to make the serializer write fields out of order to ensure that the first one is non-overlapping. I decided against this added complexity and instead went for the simpler rule of "all fields must be have different names".

That being said, it could be amended to "the first field of each case must have different names" while keeping serialization simple. It would enable your use case, provided that you swap the order of fields in your union cases so that "add" and "remove" come first.

rmunn commented 3 years ago

That being said, it could be amended to "the first field of each case must have different names" while keeping serialization simple. It would enable your use case, provided that you swap the order of fields in your union cases so that "add" and "remove" come first.

It would be quite easy for me to swap the order of the union case fields, but if you're having to determine the union case based on the first field of the JSON, it's entirely possible that this would lead to a surprising failure case where { add: ["one", "two"], list: "numbers" } parsed successfully but { list: "numbers", add: ["one", "two"] } failed because the first field of the JSON was not enough to unambiguously resolve the case. The JSON spec says that "[a]n object is an unordered set of name/value pairs" (emphasis mine), so the fact that swapping the order of the fields causes it to fail would be a violation of the JSON spec. You can't count on the incoming JSON having been produced by the FSharp.SystemTextJson library, so although simply writing the fields in the correct order for deserialization would help round-tripping, it wouldn't be enough to guarantee that all valid JSON could be correctly deserialized.

So you'll have to stick with the "all fields must have different names" rule in order to guarantee that the first field you encounter will be sufficient to resolve the union case, and I might as well close this issue as there's no way to implement the feature I asked for given the single-pass nature of System.Text.Json. My fallback solution works rather well, anyway: I simply create a set of record types that each match a single possible API shape, and try deserializing each of those record types in turn, moving on to the next shape if deserializing throws a JSON encoding exception. It's slightly more work, but it works and gives me the API shape that I want. Thanks for your help, even if this feature turned out impossible to implement safely.