mbraceproject / FsPickler

A fast multi-format message serializer for .NET
http://mbraceproject.github.io/FsPickler/
MIT License

Introduce CustomPickler imperatively? #39

Closed caindy closed 9 years ago

caindy commented 9 years ago

Introduction

I have learned a lot reading the codebase and a few relevant issues. In particular, the admonishment that POCOs are unsupported by design (edit: custom pickling semantics could only be introduced by customizing the target type directly) and the assumed homogeneity of producers/consumers imply a rigidity to the serialization that makes it unsuitable as a vehicle for data persistence or heterogeneous interchange.

Nevertheless, there are two particular strengths of FsPickler that compel me to see if it might be extended to slightly different ends:

There are two situations that arise frequently:

type Customer = {
  Name : CustomerName
  Email : EmailAddress
}
and CustomerName = CustomerName of string
and EmailAddress = EmailAddress of string

Ideally these would make use of phantom types, but lacking that facility single-case discriminated unions work well enough.

type Widget = { Kind: Foo }
and Foo = Bar | Baz of Baz
and Baz = Zap | Zazz

Basically, this is modelling a hierarchical taxonomy, treating the DUs as true tags in pattern matching business rules. I also think the fact that units of measure (UoM) are type erased is relevant. I think integer coding would be a relevant use case as well.

I would like to be able to arbitrarily define "bottom" for a type. So e.g. in JSON: {"Name":"Don","Email":"dsyme@..."} and {"Kind": "Baz Zap"}
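To make the second encoding concrete, here is a minimal sketch in plain F# of what treating Foo as "bottom" could look like: the whole union is flattened to a single string instead of a nested object. The names encodeFoo and decodeFoo are hypothetical helpers for illustration only; they are not part of FsPickler.

```fsharp
type Baz = Zap | Zazz
type Foo = Bar | Baz of Baz
type Widget = { Kind : Foo }

// Hypothetical helper: flatten the taxonomy to one stable, private string form.
let encodeFoo (foo : Foo) : string =
    match foo with
    | Bar -> "Bar"
    | Baz Zap -> "Baz Zap"
    | Baz Zazz -> "Baz Zazz"

// Hypothetical helper: the inverse mapping; unknown strings yield None
// rather than an exception.
let decodeFoo (s : string) : Foo option =
    match s with
    | "Bar" -> Some Bar
    | "Baz Zap" -> Some (Baz Zap)
    | "Baz Zazz" -> Some (Baz Zazz)
    | _ -> None
```

A custom pickler for Foo would then only need to read and write that one string, which is what keeps the stored form independent of how the union is laid out in memory.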

Put another way, I need the option to encode some portion of the object graph using stable, private semantics. Moreover, I would like to opt-in to that regime only when creating a pickler for e.g. persistence or interchange, leaving the default global cache unchanged for node to node communication.

Discussion

FsPickler is a very sophisticated library, and I'm reticent to suggest any concrete changes, but please indulge this code example as a sketch:

let eP : Pickler<EmailAddress> = //Pickler.alt implementation?
let cP : Pickler<CustomerName> = //...
let storagePickler = FsPickler.generatePickler<Customer> [ eP; cP ]

I think #29 is very relevant, though with slightly different motivations.

In summary, I am under the impression that FsPickler is nearly ideal for implementing the backbone of a multi-targeted serialization strategy. I recognize this use is not its raison d'être, but its power and compositionality leave me optimistic that it could be put effectively to this use with relatively small changes.

eiriktsarpalis commented 9 years ago

Hi Christopher,

Let me just begin by clarifying that the assertion that POCOs are unsupported by design is inaccurate; perhaps I got my phrasing wrong in that other discussion. In that context, the correct statement is that inserting custom pickling semantics does in fact rule out POCOs. The most common compromise in such cases is to implement types using either ISerializable or OnSerialized/OnDeserialized attributes. Both live in mscorlib (thus types using them could qualify as POCOs under certain definitions) and both allow user-specified logic to be inserted.

If I understand this correctly, you need to define a custom, local instance of a pickler for one of your types in a way that version tolerance is achieved?

In general, I would recommend using ISerializable for such cases: it is a pattern naturally suited for version tolerance. However, I recognise that this can only be used with classes and not algebraic types. So here's a proposal: it should be easy to implement the combinator

Pickler.fromSerializationInfo : (SerializationInfo -> 'T) -> (SerializationInfo -> 'T -> unit) -> Pickler<'T>

Then your example could be rendered like so:

let storagePickler =
    Pickler.fromSerializationInfo
        (fun si -> { Name = si.GetValue<CustomerName> "Name" ; Email = si.GetValue<EmailAddress> "Email" })
        (fun si customer -> si.AddValue("Name", customer.Name) ; si.AddValue("Email", customer.Email))

The constructor/projector lambdas could be augmented with custom logic indicating what should happen in the event of missing fields. SerializationInfo comes with the added benefit that ordering of serialised fields is not important, a property not generally satisfied by most other picklers. It goes without saying that it comes with an added performance penalty, as this will box all value-type fields under the hood.
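The two properties mentioned here (tolerance of missing fields, and indifference to field order) can be sketched with nothing but the .NET SerializationInfo type itself. This does not use FsPickler; tryGet, write and read are hypothetical helpers written for illustration, and recent .NET versions may flag parts of this API as obsolete.

```fsharp
open System.Runtime.Serialization

type CustomerName = CustomerName of string
type EmailAddress = EmailAddress of string
type Customer = { Name : CustomerName; Email : EmailAddress }

// Hypothetical helper: look a field up by name, returning None
// instead of throwing when the field is absent.
let tryGet<'T> (si : SerializationInfo) (name : string) : 'T option =
    let e = si.GetEnumerator()
    let mutable result : 'T option = None
    while Option.isNone result && e.MoveNext() do
        if e.Name = name then result <- Some (unbox<'T> e.Value)
    result

// Projector: fields are deliberately added in a different order than
// they are declared; lookups are by name, so order does not matter.
let write (si : SerializationInfo) (c : Customer) : unit =
    si.AddValue("Email", box c.Email)
    si.AddValue("Name", box c.Name)

// Constructor: missing fields fall back to a default value.
let read (si : SerializationInfo) : Customer =
    { Name = defaultArg (tryGet si "Name") (CustomerName "unspecified")
      Email = defaultArg (tryGet si "Email") (EmailAddress "unspecified") }
```

Calling read on a SerializationInfo that only contains "Email" yields a Customer whose Name is the "unspecified" default, which is the kind of version tolerance discussed above. The box calls also make the performance cost visible: every field goes through obj.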

Keep in mind that the global cache can always leak and override serialisations even in cases of local picklers. Take this pathological example:

type Customer = {
  Name : CustomerName
  Email : EmailAddress
  Foo : obj
}

let c1 = { Name = .. ; Email = .. ; Foo = null }
let c2 = { Name = .. ; Email = .. ; Foo = box c1 }

Serializing c2 with a local pickler will still result in c1 being resolved from the global cache.

eiriktsarpalis commented 9 years ago

As of 44966ba16a0da8d8387011e9fc4a5fe47953a553 you can define the following

type Customer = {
  Name : CustomerName
  Email : EmailAddress
}
and CustomerName = CustomerName of string
and EmailAddress = EmailAddress of string

// Pickler.fromSerializationInfo : (SerializationInfo -> 'T) -> (SerializationInfo -> 'T -> unit) -> Pickler<'T>

let cP : Pickler<Customer> = 
    Pickler.fromSerializationInfo
        (fun sI -> 
            { 
                Name = defaultArg (sI.TryGetValue "Name") (CustomerName "unspecified") 
                Email = defaultArg (sI.TryGetValue "Email") (EmailAddress "unspecified")
            })
        (fun sI c -> sI.AddValue("Name", c.Name) ; sI.AddValue("Email", c.Email))

caindy commented 9 years ago

Eirik, please accept my apologies for unintentionally mischaracterizing the point about POCOs. I updated the original comment for posterity and the TL;DR crowd. You were clear in the linked comment; it was my wording that was misleading (and just plain wrong without the context of custom pickling semantics). Your point about what really constitutes a POCO is well taken, but for various reasons I greatly prefer to keep the pickling/serialization concern divorced from the type declarations on my current project.

With that response out of the way--wow--thank you for this enhancement and for pushing the NuGet package out so quickly. So far this is working very well. I have a couple more cases to work through yet, but I will re-open or open a new issue as appropriate should the need arise. Thanks again!

eiriktsarpalis commented 9 years ago

No apologies needed, thanks for taking the time to write all this feedback :-)