Custom / manual serializing API?

aienabled commented 8 years ago

Hello!

I'm currently working on a network system for delta-synchronization of objects from an application server to a client. It packs every property modification into a message and sends it to the client, which then applies it to fully replicate the state of the object on the server side.

Background. Every synchronizable object should inherit from BaseNetObject. Modifications are detected by injecting an OnSyncPropertyChange(propertyName, value) call into each property marked with the [SyncToClient] attribute, so when the property is modified the method is called and the modification data is generated. I'm using Roslyn (.NET Compiler Platform) for injection and compilation. Some synchronized properties may contain reference types (they're required to inherit from BaseNetObject), and these are synchronized the same way. There are also network collection types (List & Dictionary) inherited from BaseNetObject (so they synchronize every modification).

The question. Usually my system needs to send the full object, but sometimes it needs to transmit only its surrogate (because the object is already known to the client, which should use the instance it has already created). For example:

public class State: BaseNetObject
{
  [SyncToClient]
  public SomeDeferredNetObject Prop1 { get; set; }

  [SyncToClient]
  public SomeDeferredNetObject Prop2 { get; set; }
}

public class SomeDeferredNetObject: BaseNetObject
{
  [SyncToClient]
  public SomeDeferredNetObject SomeProperty { get; set; }
}

public void Scenario1(State state)
{
   // this works ok using surrogates
   state.Prop2.SomeProperty = state.Prop1;
}

public void Scenario2(State state)
{
   var newObj = new SomeDeferredNetObject() { SomeProperty = state.Prop1 } ;
   // here the problem goes
   state.Prop2 = newObj;
}

The problem is shown in Scenario2() method:

I cannot use the surrogates approach in that case, because it will serialize the modification of the Prop2 property using a surrogate (so it will only send a NetObject ID, which will be unknown to the client).
I also cannot use the full-object approach, because it will serialize Prop2.SomeProperty without a surrogate, and it will be deserialized on the client side into a new object (when it should be "surrogated" to a NetObject ID so the client can use the already known instance).

So, I want to be able to manually (by context?) determine during serialization when to use a surrogate and when to use full object serialization. How could I accomplish this with AqlaSerializer?

Regards!

AqlaSolutions commented 8 years ago

@aienabled If I understand your issue correctly, you may use *Specified properties to specify what needs to be serialized and, on the opposite side, deserialize onto an existing instance.

*Specified properties can be used also in surrogate so you can dynamically determine their values in surrogate class depending on your logic with attributes, etc without clogging your data classes.

aienabled commented 8 years ago

@AqlaSolutions do you mean ShouldSerialize* or *Specified properties? I know about this feature, but it doesn't help with my problem. Or at least I don't know how it could help...

I have a type SomeDeferredNetObject which can itself contain fields of the same type. I want to serialize an instance of this type with all its fields. But I want to be able to decide at runtime whether to serialize each of its fields as a surrogate or as an actual object. This decision could be made by checking data which I set on the SerializationContext prior to calling the Serialize() method of a type model.

For example, I have two object instances of the same type - A & B. A contains a reference to B in one of its fields. When I'm serializing A, I want AqlaSerializer to serialize it without a surrogate, but I want its field containing B to be serialized using a surrogate.

So I want to be able to have some control over the serialization process. I know it will be totally incompatible with Protobuf protocol, but that's not important for me (this is .NET-only project).

Are there any callbacks/hooks/hacks or maybe I could write a custom stream writer/reader for this case?

Regards!

AqlaSolutions commented 8 years ago

@aienabled it's a complicated question and I'm quite busy now to think it through so don't expect the answer before Monday. Sorry for the delay.

aienabled commented 8 years ago

@AqlaSolutions, ok, thank you very much!

AqlaSolutions commented 8 years ago

@aienabled

1/ Surrogates' sole purpose is to map a "complicated" type to its simplified representation for serialization. They are not expected to be "enabled" or "disabled" at runtime and I'm not going to change that, because they do what they should do by design and no more.

Scenario 2

2/ When you pass an existing instance to Deserialize method no new objects will be created except changed subtypes, immutables and collections or when an old value is null.

In your surrogate do not create SomeDeferredNetObject yourself. Instead make a serializable property of SomeDeferredNetObject type inside surrogate and let serializer handle it. Of course you may have another surrogate registered for SomeDeferredNetObject.

Also do not create a new object inside the surrogate when an existing instance was passed. You may store the reference inside the surrogate instance and let serializable properties access that existing instance. Afterwards, when you convert the value back from your surrogate, just return that stored reference.

3/ You want to serialize some properties on an object based on a condition (only when they are changed), right? So ShouldSerialize* and *Specified are exactly what you need. You may use them either directly on your object or on its surrogate. Either way, you can just return true in *Specified if the property is changed or false if not.

aienabled commented 8 years ago

@AqlaSolutions, thanks for your response.

I'm not asking you to change the surrogates approach, I just want to clarify that the surrogates approach simply won't work in my case and I need a custom/manual serialization API.

I do not pass any entity to the Deserialize() method (I pass null) because I'm transferring only delta-updates, not the full object graph itself, and I deserialize only these small objects. This approach is required to handle complex cases when deep nesting is involved.

For example, state.Prop1.SomeProperty.SomeProperty.(...).SomeProperty was assigned to state.Prop2. In that case my system will generate very small delta-update:

network object ID (global) of modified object (instance of SomeDeferredNetObject);
reflection field ID - in that case it will be index of field SomeDeferredNetObject.SomeProperty;
serialized value of modified field (with AqlaSerializer) - surrogate will be used because state.Prop2 is known to the client.

To apply a delta-update, the application should locate the network object by ID, deserialize the value and apply it via reflection to the required field.
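
A minimal sketch of that apply step, for illustration only. Everything here is an assumption rather than code from the actual system: DeltaUpdate, ClientObjectStore and the name-sorted member index are hypothetical, and the value is taken as already deserialized so that no serializer API is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-ins for the types discussed in this thread.
public abstract class BaseNetObject
{
    public uint Id { get; set; }
}

public class SomeDeferredNetObject : BaseNetObject
{
    public SomeDeferredNetObject SomeProperty { get; set; }
}

// One delta-update: which object changed, which member, and the new value.
public sealed class DeltaUpdate
{
    public uint ObjectId;    // global network ID of the modified object
    public int MemberIndex;  // index into the type's ordered property list
    public object Value;     // value after deserialization (serializer omitted here)
}

public sealed class ClientObjectStore
{
    private readonly Dictionary<uint, BaseNetObject> _known =
        new Dictionary<uint, BaseNetObject>();

    public void Register(BaseNetObject obj) { _known[obj.Id] = obj; }

    public BaseNetObject Find(uint id) { return _known[id]; }

    // Locate the target by ID, then assign the new value via reflection.
    public void Apply(DeltaUpdate delta)
    {
        BaseNetObject target = Find(delta.ObjectId);
        PropertyInfo[] props = GetOrderedProperties(target.GetType());
        props[delta.MemberIndex].SetValue(target, delta.Value, null);
    }

    // Assumption: both sides derive the same index by sorting the
    // properties declared on the type by name.
    private static PropertyInfo[] GetOrderedProperties(Type type)
    {
        PropertyInfo[] props = type.GetProperties(
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly);
        Array.Sort(props, (a, b) => string.CompareOrdinal(a.Name, b.Name));
        return props;
    }
}
```

With two registered objects a (ID 1) and b (ID 2), applying a DeltaUpdate with ObjectId = 2, MemberIndex = 0 and Value = a makes b.SomeProperty point at the existing instance a instead of a copy.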

Currently it works perfectly well (even with very deep nesting), but I have one hard case described above in Scenario2. A new example:

public void Scenario3(State state)
{
   var newObj = new SomeDeferredNetObject() { SomeProperty = state.Prop1 } ;
   // let's assume that state.Prop2.SomeProperty is not null (it holds some SomeDeferredNetObject), so we can assign its SomeProperty
   state.Prop2.SomeProperty.SomeProperty = newObj;
}

In that case I will create a small delta-update package which contains:

network ID of object currently stored at state.Prop2.SomeProperty (assume it's known to client at this point);
reflection field index for SomeDeferredNetObject.SomeProperty;
serialized newObj.

You see, I need to serialize newObj, and it has a non-null property SomeProperty which should be serialized as a surrogate (because it's already known to the client and has an ID). But newObj itself should be serialized without the surrogate because it's a new object which is unknown to the client! The problem is that both newObj and SomeProperty are of the same type SomeDeferredNetObject. So the surrogates approach doesn't work here :-(.

My idea (and the title of this issue) is that I could use some custom/manual serialization API in that case. For example, when serializing SomeDeferredNetObject I would like to have a method public void Serialize(SerializationContext context, SerializationStream stream) which allows me to MANUALLY write any fields I need, with complex if-instructions. For example, if by checking the SerializationContext object the server decides the object is known to the client, it will simply write only a boolean flag isKnownObject and a network object ID. The same goes for deserialization: public static SomeDeferredNetObject Deserialize(SerializationContext context, SerializationStream stream) would manually read the fields. While reading the data it will find out whether the object is known and, if so, find the object instance by ID in the internal database. Or, if the object is not known, it will read all the fields manually.

To clarify, please have a look at this code:

public class SomeDeferredNetObject: BaseNetObject
{
  [SyncToClient]
  public SomeDeferredNetObject SomeProperty { get; set; }

  [AqlaCustomSerializer]
  public void Serialize(SerializationContext context, SerializationStream stream)
  {
      var myContext = (MyContext)context.Context;
      var isKnownObject = myContext.IsObjectKnownToClient(this);
      stream.Write(isKnownObject);
      stream.Write(this.Id);

      if (!isKnownObject)
      {
         // write all fields
         stream.Write(this.SomeProperty);
      }
  }

  [AqlaCustomDeserializer]
  public static SomeDeferredNetObject Deserialize(SerializationContext context, SerializationStream stream)
  {
      var myContext = (MyContext)context.Context;
      var isKnownObject = stream.Read<bool>();
      var objectId = stream.Read<uint>();
      if (isKnownObject)
      {       
          return (SomeDeferredNetObject)myContext.FindNetObject(objectId);
      }

      var result = new SomeDeferredNetObject() { Id = objectId };
      // register it first, so that if SomeProperty refers back to this object, the lookup will find this instance
      myContext.RegisterNetObject(result);

      // read remaining fields
      result.SomeProperty = stream.Read<SomeDeferredNetObject>();

      return result;
  }
}

I also think this might be useful in some other cases as well; it would make AqlaSerializer much more flexible. Of course it will be totally incompatible with protobuf.

Could you point me to the extension points in the source code (so I can create a fork), or maybe you could provide an API like the one I've described in the example above? Or maybe you can propose a better idea?

Regards!

AqlaSolutions commented 8 years ago

@aienabled There are Serialize/Deserialize overloads which accept ProtoWriter/Reader. Before serializing/deserializing you may register your known objects in ProtoWriter/Reader.NetCache (it's internal, and some changes will be required for root object handling, so you need to fork). When writing/reading, they will already be present in the reference-tracked cache, so no real object data will be written to the stream. The wire format will be the same. You need to have reference tracking enabled for your fields for this to work.

aienabled commented 8 years ago

@AqlaSolutions, thanks, this is exactly what I need. However, I cannot find how I could create overloads accepting ProtoWriter/Reader. Could you give me a brief example, please? Also, it would be best if I could separate serialization/deserialization methods and the class code, because the classes are user-defined and should not contain any other methods. Regards!

AqlaSolutions commented 8 years ago

@aienabled method signatures on RuntimeTypeModel:

public void Serialize(ProtoWriter dest, object value)
public object Deserialize(ProtoReader source, object value, System.Type type)

SerializationContext is passed to ProtoReader/Writer so it's already included. See the source code of the normal overloads for Serialize/Deserialize as an example of how to use ProtoReader/Writer. If you need a length-prefixed version there will be a bit more code.

it would be best if I could separate serialization/deserialization methods and the class code, because the classes are user-defined and should not contain any other methods.

As long as you store the known objects list separately from such class code, the class doesn't need to contain anything related to serialization.

aienabled commented 8 years ago

@AqlaSolutions, thanks, I will try it.

aienabled commented 8 years ago

@AqlaSolutions, so you recommend creating an instance of ProtoWriter manually and using public void Serialize(ProtoWriter dest, object value) to write the object? I understand how to use this for custom serialization of the root object. However, I still cannot understand how to use this to custom-serialize some of the objects within the object graph.

What I need is the ability to somehow hook into the serialization/deserialization process for objects of some specified types. It would be best if I could do something like this:

var metaType = this.Model.Add(type, applyDefaultBehaviourIfNew: false);
metaType.Callbacks.CustomSerialization = // assign a static method
metaType.Callbacks.CustomDeserialization = // assign another static method

The methods might have signatures like described above in the example code for SomeDeferredNetObject.

Regards!

AqlaSolutions commented 8 years ago

@aienabled oh, you're really stuck on your approach and don't want to see any other options. I proposed that you delegate known objects tracking to the serializer. It already has a reference-tracking mechanism, so you just need to populate its internal objects cache with your known objects and then they will be treated as already encountered references.

You asked me about delta synchronization for known objects and I think that you don't need a "custom serialization" for this. It's much easier to reuse the already existing mechanism.

Normally the object reference cache is empty at the start, but you need it to be pre-populated with your known objects instead.
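
To make the idea concrete, here is a toy, self-contained sketch of a reference-tracking codec whose cache is seeded before writing and reading. It does not use AqlaSerializer or its NetCache at all: NetNode, ToyCodec and the one-byte tags are hypothetical names invented for illustration, and the real wire format is different.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical type mirroring the thread's example; not part of AqlaSerializer.
public class NetNode
{
    public uint Id;
    public NetNode SomeProperty;
}

public static class ToyCodec
{
    private const byte Null = 0, Reference = 1, Full = 2;

    // knownIds plays the role of the writer-side reference cache.
    public static void Write(BinaryWriter w, NetNode node, HashSet<uint> knownIds)
    {
        if (node == null) { w.Write(Null); return; }
        if (!knownIds.Add(node.Id))
        {
            // Pre-seeded or already written earlier: emit the ID only.
            w.Write(Reference);
            w.Write(node.Id);
            return;
        }
        // First encounter: emit the ID followed by the members.
        w.Write(Full);
        w.Write(node.Id);
        Write(w, node.SomeProperty, knownIds);
    }

    // known plays the role of the reader-side reference cache.
    public static NetNode Read(BinaryReader r, Dictionary<uint, NetNode> known)
    {
        byte tag = r.ReadByte();
        if (tag == Null) return null;
        uint id = r.ReadUInt32();
        if (tag == Reference) return known[id];
        var node = new NetNode { Id = id };
        known.Add(id, node); // register before reading members so cycles resolve
        node.SomeProperty = Read(r, known);
        return node;
    }
}
```

If the writer is seeded with ID 1 and given a new node (ID 2) whose SomeProperty is the known node (ID 1), the new node is written in full while its property is written as a bare reference, even though both are the same type. A reader seeded with the same known instance then rebuilds only the new node and reuses the existing one. That per-instance choice is the behaviour a per-type surrogate cannot express.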

AqlaSolutions commented 8 years ago

@aienabled if you still want your approach you can modify SurrogateSerializer.cs to pass context as an argument for your surrogate converter method.

    [SerializableType]
    class Surrogate
    {
        [SerializableMember(1, ValueFormat.MinimalEnhancement)] // can be null
        public byte[] Data { get; set; }

        [SerializableMember(2)]
        public int ObjectId { get; set; }  // store known object id here

        [SurrogateConverter]
        public static Surrogate Convert(MyClass obj, SerializationContext context)
        {
            var myContext = (MyContext)context.Context;
            int id;
            if (myContext.IsObjectKnownToClient(obj, out id)) return new Surrogate { ObjectId = id };
            // OtherModel is a property you would add to your own MyContext class
            TypeModel model = myContext.OtherModel; // this should be another model without surrogate!
            using (var ms = new MemoryStream())
            {
                model.Serialize(ms, obj, context);
                return new Surrogate() { Data = ms.ToArray() };
            }
        }

        [SurrogateConverter]
        public static MyClass Convert(Surrogate s, SerializationContext context)
        {
            var myContext = (MyContext)context.Context;
            if (s.ObjectId > 0) return (MyClass)myContext.FindNetObject(s.ObjectId);
            TypeModel model = myContext.OtherModel; // this should be another model without surrogate!
            using (var ms = new MemoryStream(s.Data))
            {
                // null: no existing instance to deserialize onto
                return (MyClass)model.Deserialize(ms, null, typeof(MyClass), context);
            }
        }
    }

If you do this without breaking 1-argument converters I could accept it as a pull request.

aienabled commented 8 years ago

@AqlaSolutions

you're really stuck on your approach and don't want to see any other options. I proposed that you delegate known objects tracking to the serializer. It already has a reference-tracking mechanism, so you just need to populate its internal objects cache with your known objects and then they will be treated as already encountered references.

I understand now, thanks for the detailed response! So I can manually add my objects to the known objects store (NetCache of ProtoReader/Writer) before calling the serialization/deserialization methods. That sounds good; however, this way I will need to register all known objects each time I call the serialization/deserialization methods, while only some of these known objects might be needed. So this may not be very efficient in terms of performance, but it is definitely an elegant approach!

if you still want your approach you can modify SurrogateSerializer.cs to pass context as an argument for your surrogate converter method.

Yes, this should also work fine. Thanks for the detailed example!

I will try to implement both approaches and benchmark CPU/memory usage to select the best one. If I am able to properly modify SurrogateSerializer to pass a context into the surrogate converter methods and still keep compatibility with one-argument converters, I will create a pull request.

Regards!

aienabled commented 8 years ago

@AqlaSolutions, the surrogates approach doesn't work for me, as it needs to use another model for serialization of the data (// this should be another model without surrogate!). For example:

var a = new NetTestClass();
a.SomeProperty = root.KnownObject; // this object is also NetTestClass
root.UnknownObject = a;

This way a.SomeProperty should be serialized as a surrogate (it's a known object), but a itself should not be serialized as a surrogate (it's a new object). But a and a.SomeProperty are both instances of the same class NetTestClass. So the surrogates approach won't work in that case. ... Only if I could use the same typeModel and tell it "please serialize this instance now without the surrogate" - and it would make an exception to the rule and serialize the provided instance of NetTestClass without using any surrogates, but continue using surrogates for other NetTestClass instances :-)...

I will try implementing the NetCache solution now.

Regards!

AqlaSolutions commented 8 years ago

@aienabled right, if you need to handle nested objects this way, you can't use the surrogates approach.

btw another (but more complicated) solution would be implementing your own decorator of TypeSerializer with your logic for known objects (the wrapping can be done in MetaType.BuildSerializer).

aienabled commented 8 years ago

@AqlaSolutions, I've implemented the approach of registering my own objects in the NetObjectCache for ProtoWriter/ProtoReader. All my tests are green now! Thank you very much for your help. Regards!

AqlaSolutions / AqlaSerializer

Custom / manual serializing API? #8