serde-ml / serde

Serialization framework for OCaml
MIT License

Support ser/de of records with unordered fields #10

Closed: leostera closed 8 months ago

leostera commented 8 months ago

At the moment, when we try to deserialize a record, we expect its fields to appear in a specific order.

So if your record is:

type user = { id: int; name: string; }
[@@deriving serialize,deserialize]

Then the deserialize_user function (handwritten or derived) will only manage to deserialize data that is in this exact order:

  1. first the id
  2. then the name
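
To make the limitation concrete, here is a minimal sketch that does not use serde at all. The `value` type and `deserialize_user_ordered` are hypothetical stand-ins for parsed input and for an order-dependent deserializer; they are not part of the library.

```ocaml
(* Hypothetical stand-in for parsed input: a sequence of (key, value) pairs. *)
type value = Int of int | String of string

type user = { id : int; name : string }

(* Order-dependent decoder: only accepts "id" first, then "name". *)
let deserialize_user_ordered (input : (string * value) list) :
    (user, string) result =
  match input with
  | [ ("id", Int id); ("name", String name) ] -> Ok { id; name }
  | _ -> Error "fields missing or out of order"

(* Same data, two different key orders. *)
let in_order = deserialize_user_ordered [ ("id", Int 1); ("name", String "ada") ]
let swapped = deserialize_user_ordered [ ("name", String "ada"); ("id", Int 1) ]
```

Here `in_order` is `Ok { id = 1; name = "ada" }`, while `swapped` is an `Error` even though it carries the same data.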

This is perfectly fine for many formats, so we can't force every format to accept record keys in arbitrary order, but we do want to support serializing and deserializing maps.

Maps are different from records in that they would allow us to do this. Note that "map" here does not refer to OCaml's Map module; it is just the name we are using for an unordered collection of key-value pairs.
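
As a rough illustration of treating the input as a map, the sketch below reads key-value pairs in whatever order they arrive and only checks for missing fields at the end. It is plain OCaml with no serde calls; `value` and `deserialize_user_map` are hypothetical names chosen for this example.

```ocaml
(* Hypothetical stand-in for parsed input: a sequence of (key, value) pairs. *)
type value = Int of int | String of string

type user = { id : int; name : string }

(* Order-independent decoder: collects each field as it shows up. *)
let deserialize_user_map (pairs : (string * value) list) :
    (user, string) result =
  let id = ref None in
  let name = ref None in
  let rec read_fields = function
    | [] -> Ok () (* no more pairs: stop reading *)
    | ("id", Int v) :: rest ->
        id := Some v;
        read_fields rest
    | ("name", String v) :: rest ->
        name := Some v;
        read_fields rest
    | (key, _) :: _ -> Error ("unexpected or ill-typed field: " ^ key)
  in
  match read_fields pairs with
  | Error e -> Error e
  | Ok () -> (
      (* Only now do we check that every required field was seen. *)
      match (!id, !name) with
      | Some id, Some name -> Ok { id; name }
      | None, _ -> Error "missing field 'id'"
      | _, None -> Error "missing field 'name'")
```

With this shape, `[ ("name", String "ada"); ("id", Int 1) ]` and `[ ("id", Int 1); ("name", String "ada") ]` decode to the same record.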

Implementation Notes

I'd like to see a serialize_map and deserialize_map pair of functions added to the Serializer and Deserializer modules, with their corresponding helper functions for handwriting de/serializers and an API that is as close as possible to records.

This would allow the formats to decide whether they want to serialize fields in an ordered or an unordered way.
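
One way to picture "the format decides" is a tiny interface that each format implements differently. This is only a sketch: the `FORMAT` signature and the `Ordered` and `Sorted` modules are made up for illustration and are not the actual Serializer API.

```ocaml
(* Hypothetical interface: each format chooses how to lay out a map. *)
module type FORMAT = sig
  val serialize_map : (string * string) list -> string
end

(* Keeps fields in exactly the order the serializer emitted them. *)
module Ordered : FORMAT = struct
  let serialize_map fields =
    fields |> List.map (fun (k, v) -> k ^ "=" ^ v) |> String.concat ";"
end

(* Treats fields as unordered, so it is free to pick its own layout;
   here it sorts by key. *)
module Sorted : FORMAT = struct
  let serialize_map fields =
    fields
    |> List.sort (fun (a, _) (b, _) -> String.compare a b)
    |> List.map (fun (k, v) -> k ^ "=" ^ v)
    |> String.concat ";"
end

let fields = [ ("name", "ada"); ("id", "1") ]
let ordered = Ordered.serialize_map fields (* "name=ada;id=1" *)
let sorted = Sorted.serialize_map fields (* "id=1;name=ada" *)
```

The caller hands over the same field list in both cases; only the format's own `serialize_map` determines the final order.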

For serialization we can probably get something that is the exact same:

map ctx "user" keys @@ fun t ctx -> 
  let* () = field "id" (int32 t.id) in
  let* () = field "name" (string t.name) in
  ...

For deserialization we have to do some work, since the keys may come in any order. Here's how I'd expect the API to look:

let field_visitor = Visitor.make
  ~visit_string:(fun _ctx str ->
  match str with
  | "id" -> Ok `Id
  | "name" -> Ok `Name
  | _ -> Error `invalid_tag)
in

map ctx @@ fun ctx ->
  let id = ref None in
  let name = ref None in
  let rec read_fields () =
    (* Bind the decoded tag as [key] so it does not shadow the [field] helper. *)
    let* key = identifier ctx field_visitor in
    match key with
    | `Id ->
        let* v = field "id" int32 in
        id := Some v;
        read_fields ()
    | `Name ->
        let* v = field "name" string in
        name := Some v;
        read_fields ()
  in
  let* () = read_fields () in
  (* [id] and [name] are refs, so dereference them; [~none] takes the bare error value. *)
  let* id = !id |> Option.to_result ~none:(`Msg "missing field 'id'") in
  let* name = !name |> Option.to_result ~none:(`Msg "missing field 'name'") in
  Ok { id; name }

tjdevries commented 8 months ago

I think unordered should be the default, and forcing ordered should be the one with an override in the ppx. Do you think it should be the other way around?

I can probably work on this if you want, btw. I literally cannot proceed without this; it will be too painful to do Twitch things without it, haha.

leostera commented 8 months ago

@tjdevries I've implemented some support for this; let me know if it solves your use case.