microsoft / bond

Bond is a cross-platform framework for working with schematized data. It supports cross-language de/serialization and powerful generic mechanisms for efficiently manipulating data. Bond is broadly used at Microsoft in high scale services.

[C#] Make SerializerTransform, DeserializerTransform and SerializerGeneratorFactory public #970

Open LeroyK opened 5 years ago

LeroyK commented 5 years ago

I have created a custom polymorphic bond serializer and deserializer that uses SerializerTransform, DeserializerTransform and SerializerGeneratorFactory through reflection. This allows it to automatically use my custom serializer for all nested types, which would otherwise not be possible.

Even though this works fine via reflection, I would prefer these types to be available at compile time. Could we make these types public? I'd be happy to submit a PR for it.

chwarr commented 5 years ago

I don’t think these types should be made public as-is. These types weren’t designed to be part of a public, stable interface. Their API isn't as intentionally designed as public Bond APIs. They are tightly coupled to each other & other components. Since these are so fundamental to the nuts & bolts implementation of serialization/deserialization, I'd like to leave the flexibility to make breaking interface changes in these types as we maintain and improve Bond.

It would be useful to have more information about your custom serializer to understand how you're using the existing APIs and what the gap in them is. Perhaps there is a smaller extension point? Perhaps this is indicative of the need for a new family of APIs? For example, I've been mulling over what a C# API like the C++ Transform/Apply API would look like.

LeroyK commented 5 years ago

By default, Bond bakes its own implementation into the compiled expressions it generates for deferred serialization/deserialization (e.g. to serialize Bond fields). I want to be able to wrap these expressions with my custom logic to support polymorphism.

There are a couple of options I can think of to support this:

  1. The ISerializerGenerator<R, W> interface exists for custom serializers, but there is no way to get Bond's implementation of ISerializerGenerator<R, W>, because SerializerGeneratorFactory is also private. We could just make SerializerGeneratorFactory public. There is however no IDeserializerGenerator interface, but we could of course add it with a corresponding factory.
  2. Add deferred serialize/deserialize expression delegates, which Bond can use to wrap the original deferred serialize/deserialize expressions:
    public delegate Expression<Action<object, W>> DeferredSerialize<W>(Type type, Expression<Action<object, W>> deferredSerialize);
    public delegate Expression<Func<R, object>> DeferredDeserialize<R>(Type type, Expression<Func<R, object>> deferredDeserialize);
  3. Use a custom IParser. However, Bond's internal SerializerHelper & TwoPassSerializerHelper<FPW> take ObjectParser in their constructors (here and here) instead of IParser. This might actually be a bug, and I think the parameter should be of type IParser. I haven't explored this further because this prevented me from ever using a custom parser.
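
To make option 2 more concrete, here is a minimal sketch of how such a hook could wrap a deferred-serialize expression. It uses only System.Linq.Expressions and is not Bond's API: the generic `DeferredSerialize<W>` delegate mirrors the shape proposed above, and `DeferredSerializeSketch.RunBefore` is a hypothetical helper introduced purely for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical hook shape from option 2, generic over the writer type W.
// This delegate does not exist in Bond; it only mirrors the proposal.
public delegate Expression<Action<object, W>> DeferredSerialize<W>(
    Type type, Expression<Action<object, W>> deferredSerialize);

public static class DeferredSerializeSketch
{
    // Hypothetical helper: builds a new expression that first invokes
    // `before` on the value and then invokes the original expression.
    public static Expression<Action<object, W>> RunBefore<W>(
        Expression<Action<object, W>> original, Action<object> before)
    {
        var value = Expression.Parameter(typeof(object), "value");
        var writer = Expression.Parameter(typeof(W), "writer");

        var body = Expression.Block(
            // Custom logic, e.g. stamping a discriminator on the value.
            Expression.Invoke(Expression.Constant(before), value),
            // The original deferred-serialize expression, left unchanged.
            Expression.Invoke(original, value, writer));

        return Expression.Lambda<Action<object, W>>(body, value, writer);
    }
}
```

A hook built this way leaves the original expression untouched, so the library would remain free to change how that expression is produced.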

chwarr commented 5 years ago

Thanks for the details. Can you also talk about the custom logic to support polymorphism? That will help get a fuller picture of what you're thinking about.

LeroyK commented 5 years ago

Sure, currently I basically do the following:

  1. Add a discriminator enum to distinguish subtypes.
  2. Decorate each (sub)type with its corresponding enum value.
  3. Implement Bond.IBonded in each type that needs polymorphism.
  4. Use Bond.Expressions.PayloadBondedFactory to create a custom PolyBondedVoid<R>.
  5. Use Bond.Factory to call my custom deserializer.
  6. To deserialize only the discriminator field and avoid an unnecessary allocation of the base type, I generate a dynamic assembly that contains a DiscriminatorStruct<TDiscriminatorEnum> for each Bond.Id (0..65535) that's being used.
  7. Before serialization, my serializer sets the discriminator property to the value corresponding to that subtype and calls the subtype's serializer.
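
Steps 1, 2 and 7 can be sketched without any Bond types. The code below is an illustration under stated assumptions, not the actual implementation: `PetKind`, `DiscriminatorAttribute` and `DiscriminatorRegistry` are hypothetical names, the Bond attributes are left out so the sketch stays self-contained, and a top-level enum is used instead of a nested one to keep attribute name lookup unambiguous.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Step 1: a discriminator enum (hypothetical top-level version).
public enum PetKind : byte { Pet = 0, Dog = 1, Parrot = 2 }

// Step 2: an attribute that ties each (sub)type to its enum value.
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class DiscriminatorAttribute : Attribute
{
    public DiscriminatorAttribute(PetKind value) { Value = value; }
    public PetKind Value { get; }
}

[Discriminator(PetKind.Pet)]
public class Pet { public PetKind TypeDiscriminator { get; set; } }

[Discriminator(PetKind.Dog)]
public class Dog : Pet { }

[Discriminator(PetKind.Parrot)]
public class Parrot : Pet { }

public static class DiscriminatorRegistry
{
    static readonly Dictionary<PetKind, Type> ByKind = new Dictionary<PetKind, Type>();

    // Reads the attribute declared directly on the given type.
    static PetKind KindOf(Type type)
    {
        var attribute = type.GetCustomAttribute<DiscriminatorAttribute>(false);
        if (attribute == null)
            throw new ArgumentException("Missing [Discriminator] on " + type.Name);
        return attribute.Value;
    }

    // Makes a subtype discoverable by its discriminator value.
    public static void Register(Type type) { ByKind[KindOf(type)] = type; }

    // Deserialization side: map a discriminator back to the concrete type.
    public static Type Resolve(PetKind kind) { return ByKind[kind]; }

    // Step 7: set the discriminator from the runtime type before serializing.
    public static void Stamp(Pet pet) { pet.TypeDiscriminator = KindOf(pet.GetType()); }
}
```

A caller would register each subtype once at startup, call `Stamp` right before handing the object to the subtype's serializer, and call `Resolve` after reading the discriminator field to pick the deserializer.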

I wanted to use Bond.Expressions.ObjectBondedFactory as well, but it's bugged.

To make it as fast as possible, my code uses reflection to compile expression trees.
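
As a general illustration of that technique (not the code from this issue), the sketch below looks a property up by reflection once and compiles an expression tree into a reusable delegate, so later calls avoid the per-call cost of reflection. `CompiledAccessors.CompileGetter` is a hypothetical helper name.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class CompiledAccessors
{
    // Builds and compiles: instance => (object)((T)instance).Property
    public static Func<object, object> CompileGetter(Type type, string propertyName)
    {
        // The reflection lookup happens once, when the delegate is built.
        PropertyInfo property = type.GetProperty(propertyName);
        if (property == null)
            throw new ArgumentException("No public property named " + propertyName);

        var instance = Expression.Parameter(typeof(object), "instance");
        var body = Expression.Convert(
            Expression.Property(Expression.Convert(instance, type), property),
            typeof(object));

        // The compiled delegate can be cached and reused for every call.
        return Expression.Lambda<Func<object, object>>(body, instance).Compile();
    }
}
```

For example, `CompiledAccessors.CompileGetter(typeof(string), "Length")` yields a delegate that returns the boxed length of any string passed to it.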

This small example will give you an idea what my model looks like:

[Bond.Schema]
public class Owner
{
    [Bond.Id(0)]
    [Bond.Type(typeof(List<bonded<Pet>>))]
    public List<Pet> Pets { get; set; }
}

[Bond.Schema]
[Discriminator(Discriminator.Pet)]
public class Pet : Bond.IBonded
{
    public enum Discriminator : byte
    {
        Pet = 0,
        Dog = 1,
        Parrot = 2
    }

    [Bond.Id(0)]
    public Discriminator TypeDiscriminator { get; set; }

    [Bond.Id(1)]
    public long Id { get; set; }

    void Bond.IBonded.Serialize<W>(W writer) => PolyBond.Serialize(this, writer);
    U Bond.IBonded.Deserialize<U>() => default;
    Bond.IBonded<U> Bond.IBonded.Convert<U>() => null;
}

[Bond.Schema]
[Discriminator(Discriminator.Dog)]
public class Dog : Pet
{
    [Bond.Id(0)]
    public bool Barks { get; set; }
}

[Bond.Schema]
[Discriminator(Discriminator.Parrot)]
public class Parrot : Pet
{
    [Bond.Id(0)]
    public bool Speaks { get; set; }
}