ZiggyCreatures / FusionCache

FusionCache is an easy to use, fast and robust hybrid cache with advanced resiliency features.
MIT License
1.9k stars 97 forks source link

[FEATURE] Add Protobuf support #84

Closed jodydonetti closed 2 years ago

jodydonetti commented 2 years ago

Is your feature request related to a problem? Please describe. FusionCache provides a way to add custom serialization format support, by implementing the IFusionCacheSerializer interface.

It already provides 2 implementations, both for the JSON format:

It recently added (but not yet released) a third one: MessagePack.

It would be great to add support for the Protobuf serialization format, too.

Describe the solution you'd like A new package that add supports for the Protobuf format, probably based on protobuf-net by @mgravell which is probably the most used implementation on .NET and the most performant.

Describe alternatives you've considered Everyone that needs it should otherwise implement their own, which is meh 😐

mgravell commented 2 years ago

Probably looks a lot like this:

sealed class ProtobufNetSerializer : IFusionCacheSerializer
{
    private readonly TypeModel _model;
    public ProtobufNetSerializer(TypeModel model = null)
        => _model = model ?? RuntimeTypeModel.Default;

    public byte[] Serialize<T>(T obj)
    {
        using var ms = new MemoryStream();
        _model.Serialize<T>(ms, obj);
        return ms.ToArray();
    }

    public T Deserialize<T>(byte[] data)
        => _model.Deserialize<T>((ReadOnlyMemory<byte>)data);

    ValueTask<T> IFusionCacheSerializer.DeserializeAsync<T>(byte[] data)
        => new(Deserialize<T>(data));

    ValueTask<byte[]> IFusionCacheSerializer.SerializeAsync<T>(T obj) 
        => new(Serialize<T>(obj));
}

There is an API that can avoid having to use the MemoryStream / unknown-length thing, but right now: I'd go with ^^^

jodydonetti commented 2 years ago

Thanks @mgravell !

I was about to push my first impl and ask for your opinion, but you've beaten me to it 😄 From what I can see they are quite similar, at least in principle.

Here it is:

public class FusionCacheProtoBufNetSerializer
    : IFusionCacheSerializer
{
    public FusionCacheProtoBufNetSerializer(RuntimeTypeModel? model = null)
    {
        _model = model ?? RuntimeTypeModel.Default;

        // ENSURE MODEL REGISTRATION FOR FusionCacheEntryMetadata
        if (_model.IsDefined(typeof(FusionCacheEntryMetadata)) == false)
        {
            _model.Add(typeof(FusionCacheEntryMetadata), false)
                .SetSurrogate(typeof(FusionCacheEntryMetadataSurrogate))
            ;
        }
    }

    private readonly RuntimeTypeModel _model;
    private readonly object _modelLock = new object();

    private void EnsureDistributedEntryModelIsRegistered<T>()
    {
        // TODO: OPTIMIZE THIS

        var _t = typeof(T);
        if (_t.IsGenericType == false || _t.GetGenericTypeDefinition() != typeof(FusionCacheDistributedEntry<>))
            return;

        if (_model.IsDefined(_t))
            return;

        lock (_modelLock)
        {
            if (_model.IsDefined(_t))
                return;

            // ENSURE MODEL REGISTRATION FOR FusionCacheDistributedEntry<T>
            _model.Add(typeof(T), false)
                .Add(1, nameof(FusionCacheDistributedEntry<T>.Value))
                .Add(2, nameof(FusionCacheDistributedEntry<T>.Metadata))
            ;
        }
    }

    public byte[] Serialize<T>(T? obj)
    {
        EnsureDistributedEntryModelIsRegistered<T>();
        using (var stream = new MemoryStream())
        {
            _model.Serialize<T?>(stream, obj);
            return stream.ToArray();
        }
    }

    public T? Deserialize<T>(byte[] data)
    {
        if (data.Length == 0)
            return default(T);

        EnsureDistributedEntryModelIsRegistered<T>();
        using (var stream = new MemoryStream(data))
        {
            return _model.Deserialize<T?>(stream);
        }
    }

    public ValueTask<byte[]> SerializeAsync<T>(T? obj)
    {
        return new ValueTask<byte[]>(Serialize(obj));
    }

    public ValueTask<T?> DeserializeAsync<T>(byte[] data)
    {
        return new ValueTask<T?>(Deserialize<T>(data));
    }
}

A couple of notes:

For the last point this is a minimal repro:

string? value1 = null;
string? value2 = null;
byte[] data;

using (var ms = new MemoryStream())
{
    ProtoBuf.Serializer.Serialize(ms, value1);
    data = ms.ToArray();
}
using (var ms = new MemoryStream(data))
{
    value2 = ProtoBuf.Serializer.Deserialize<string>(ms);
}

Console.WriteLine($"VALUE 1 IS NULL: {value1 is null}");
Console.WriteLine($"VALUE 2 IS NULL: {value2 is null}");

And its output is:

VALUE 1 IS NULL: True
VALUE 2 IS NULL: False

Am I missing something?

Thanks!

jodydonetti commented 2 years ago

Mmmh, upon further investigation it seems that the call to IsDefined() is auto-adding the type to be checked, breaking the logic. I assume it may be related to one of the AutoAdd/AutoCreate options in the ModelType.

I'm trying to understand more, maybe using the CanSerialize() method, and will update later.

mgravell commented 2 years ago

Better option might be to hook the events on the model that are invoked when discovering a new type; I'm not at a PC to provide an example, but: Before/After something something!

On Fri, 21 Oct 2022, 16:59 Jody Donetti, @.***> wrote:

Mmmh, upon further investigation it seems that the call to IsDefined() is auto-adding the type to checked, braking the logic. I assume it may be related to one of the AutoAdd/AutoCrate options in the ModelType.

I'm trying to understand more, will update.

— Reply to this email directly, view it on GitHub https://github.com/ZiggyCreatures/FusionCache/issues/84#issuecomment-1287157087, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAEHMDPQ4CXGJKUFCVO2J3WEK4XTANCNFSM6AAAAAARLARKYY . You are receiving this because you were mentioned.Message ID: @.***>

mgravell commented 2 years ago

Also, if it won't upset the other serializers: the config can be expressed via DataContract/DataMember(Order=...)

On Fri, 21 Oct 2022, 19:33 Marc Gravell, @.***> wrote:

Better option might be to hook the events on the model that are invoked when discovering a new type; I'm not at a PC to provide an example, but: Before/After something something!

On Fri, 21 Oct 2022, 16:59 Jody Donetti, @.***> wrote:

Mmmh, upon further investigation it seems that the call to IsDefined() is auto-adding the type to checked, braking the logic. I assume it may be related to one of the AutoAdd/AutoCrate options in the ModelType.

I'm trying to understand more, will update.

— Reply to this email directly, view it on GitHub https://github.com/ZiggyCreatures/FusionCache/issues/84#issuecomment-1287157087, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAEHMDPQ4CXGJKUFCVO2J3WEK4XTANCNFSM6AAAAAARLARKYY . You are receiving this because you were mentioned.Message ID: @.***>

jodydonetti commented 2 years ago

Better option might be to hook the events on the model that are invoked when discovering a new type; I'm not at a PC to provide an example, but: Before/After something something!

Thanks for the suggestion, will definitely look into it!

jodydonetti commented 2 years ago

Also, if it won't upset the other serializers: the config can be expressed via DataContract/DataMember(Order=...)

Will definitely check this, too: If I remember correctly the 2 classes are already using those attributes (maybe with a Name, and not yet with an Order, but I can add that) but as the code suggests I'm also using a surrogate class for the metadata class, since that in turn is using a DateTimeOffset prop (and that is how I ended up on that other issue on the protobuf-net repo).

Anyway will play with the DataContract/DataMember approach too, thanks!

jodydonetti commented 2 years ago

Hi @mgravell , FYI I've been able to create a minimal repro about the null string issue I was observing, so I opened an issue in the protobuf-net repo. Hope this helps.

jodydonetti commented 2 years ago

Better option might be to hook the events on the model that are invoked when discovering a new type; I'm not at a PC to provide an example, but: Before/After something something!

Hi @mgravell , thanks for your help.

As per your suggestion I've looked into the before/after events to hook into registering a type just before serializing, but I've only found BeforeApplyDefaultBehaviour and AfterApplyDefaultBehaviour which don't seem to fire when trying to serialize an unregistered type.

Also as said, calls to both IsDefined() and CanSerialize() to check if a type is already registered seem to auto-add them, defeating the purpose of the check, and trying to add the same type multiple times without having checked throws an exception (which in and on itself makes sense).

The only way I've found is one of these 2: 1) keep an external list of already registered types, to keep track of the ones that still needs to be registered and avoid the exceptions 2) simply call Add() every time, but inside a try/catch block just to suppress the exception

Both of these are clearly not good solutions imho, so I'm still trying to come up with something better.

Initially I also played with the idea of using a small internal static class like internal static MyTypeCache<T> where I tried to use the static ctor (guaranteed to only be called once etc) and use that to be sure to register a type only once: the problem with such approach is that it would be in practice a singleton, and since the specific TypeModel to use is passed in my serializer ctor for better flexibility, I cannot do that, too.

Any idea?

jodydonetti commented 2 years ago

Update: I've pushed the branch, the serializer code can be found here.

jodydonetti commented 2 years ago

Update: I went with a mix of both options.

I added an internal cache (static/singleton) to keep track of each model's registered types, of course only the ones related to FusionCache, to auto-register them. I also added some locks (with the usual double check) to ease hypothetical concurrency issues that may arise and finally, when registering the types, I also included a try/catch block to avoid any issue with double registration (it should not happen, but I don't have exclusive control over what happens to a model that may be passed in the ctor.

All seems to be fine now: there are no more simply try/catch blocks that would throw every time, the locking seems fine (to me at least) and the little overhead with the internal models/types cache should be negligible, so I'm feeling good overall about the approach.

Will release it soon, and will notify it here too, just to keep track.

Thanks for your support, and if you happen to have any suggestion it would still be more than welcome.

jodydonetti commented 2 years ago

I've released v0.16.0 which includes support for Protobuf 🎉

Thanks again @mgravell for your support!