Code-Sharp / WampSharp

A C# implementation of WAMP (The Web Application Messaging Protocol)
http://wampsharp.net

I want to use System.Text.Json and MessagePack-CSharp as Transport, is there a way? #316

Open dimohy opened 4 years ago

dimohy commented 4 years ago

Json.NET (Newtonsoft.Json) was the rising star of JSON in .NET, but it is now passing the baton to System.Text.Json.

And although my numbers may not be exact, MessagePack for C# seems to be overwhelmingly faster than MsgPack-Cli. In particular, even with LZ4 compression, a difference of more than 3x in speed and binary size is very impressive. Working on UTF-8 directly and keeping memory copies to a minimum is probably the key to its speed.

[Image: performance]

I want to use the WAMP protocol to develop factory automation solutions. I am developing in C# on .NET and I want to use WampSharp.

I want to use System.Text.Json and MessagePack for C# (MessagePack-CSharp) in WampSharp.

MessagePack for C# GitHub

If you are interested, have a development plan, or need help, I would be glad to discuss it.

darkl commented 4 years ago

Hi,

This is not the first time someone has suggested that I support a different serialization library. This is not straightforward at all, and I will show some examples.

When we receive a WAMP message in either Json or MessagePack format, we receive an array serialized in that format. The first element of this array is a number describing the message type of the received message. The remaining elements are parameters that give more details about the message.

Let me show an example: this is a fairly simple PUBLISH message:

[16,8,{"acknowledge":true},"com.myapp.topic1",[7],{}]

The router receiving this message needs to decode it. Here's how you can do it using Newtonsoft.Json (and essentially what WampSharp does, the sample is simplified for clarity):

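using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using Newtonsoft.Json.Linq;

[DataContract]
[Serializable]
public class PublishOptions
{
    [DataMember(Name = "acknowledge")]
    public bool? Acknowledge { get; set; }
}

public static class Program
{
    static void Main(string[] args)
    {
        string json = @"[16,8,{""acknowledge"":true},""com.myapp.topic1"",[7],{}]";

        // Load the whole message once into a tree of JTokens.
        JArray message = JArray.Parse(json);

        // The first element is always the message type code.
        int messageCode = message[0].Value<int>();

        if (messageCode == 16)
        {
            // This is a PUBLISH message:
            // WAMP IDs can be as large as 2^53, so read them as long.
            long requestId = message[1].Value<long>();
            PublishOptions options = message[2].ToObject<PublishOptions>();
            string topicUri = message[3].Value<string>();

            // The payload stays as JTokens: its concrete types are not known here.
            JToken[] arguments = message[4].ToObject<JToken[]>();
            IDictionary<string, JToken> argumentKeywords =
                message[5].ToObject<IDictionary<string, JToken>>();
        }
    }
}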

The last two lines are the most problematic, and I don't believe any serialization library except for Newtonsoft.Json knows how to deal with them well or at all. But even the code before that will be challenging enough to implement with the libraries you suggested. Go ahead and try! The starting point for the MessagePack serialization is

string base64 = @"lhAIgathY2tub3dsZWRnZcOwY29tLm15YXBwLnRvcGljMZEHgA==";
byte[] bytes = Convert.FromBase64String(base64);
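
For orientation, the first two bytes of this payload can be inspected without any MessagePack library. The following is only a minimal sketch: the class and method names are hypothetical (not WampSharp or MessagePack-CSharp API), and it assumes the message is a MessagePack fixarray whose first element is a positive fixint, which holds for this sample but not for every valid encoding.

```csharp
using System;

public static class MsgPackPeek
{
    // Returns (elementCount, messageCode) for a WAMP message encoded as a
    // MessagePack fixarray whose first element is a positive fixint.
    public static (int Count, int MessageCode) PeekHeader(byte[] bytes)
    {
        if (bytes.Length < 2 || (bytes[0] & 0xF0) != 0x90)
            throw new FormatException("Expected a fixarray (0x90-0x9f) header.");
        if ((bytes[1] & 0x80) != 0)
            throw new FormatException("Expected a positive fixint message code.");

        // Low nibble of a fixarray header is the element count.
        return (bytes[0] & 0x0F, bytes[1]);
    }

    public static void Main()
    {
        byte[] bytes = Convert.FromBase64String(
            "lhAIgathY2tub3dsZWRnZcOwY29tLm15YXBwLnRvcGljMZEHgA==");

        var (count, messageCode) = PeekHeader(bytes);

        // Prints "elements: 6, message code: 16" (16 = PUBLISH).
        Console.WriteLine($"elements: {count}, message code: {messageCode}");
    }
}
```

Reading the message code is the easy part; the harder question raised in this thread is what to do with elements 4 and 5, whose concrete types are not known at this point.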

Now that is not where the problem ends. I will explain why the last two lines above are the most problematic. The problem is that we also need support for partial dynamic deserialization. Suppose that you receive this message:

[36,226285541257267,429149520174977,{},[70,23],{"c":"Hello","d":{"counter":1,"foo":[1,2,3]}}]

You read 36 and see that it is an EVENT message. Then you run code similar to the code you ran before:

using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

[DataContract]
[Serializable]
public class EventDetails
{
    [DataMember(Name = "publisher")]
    public long? Publisher { get; internal set; }
}

public static class Program
{
    static void Main(string[] args)
    {
        // string base64 = @"liTPAADNzjVN8DPPAAGGTyi0n4GQkkYXgqFjpUhlbGxvoWSCp2NvdW50ZXIBo2Zvb5MBAgM=";
        // byte[] bytes = Convert.FromBase64String(base64);
        string json =
            @"[36,226285541257267,429149520174977,{},[70,23],{""c"":""Hello"",""d"":{""counter"":1,""foo"":[1,2,3]}}]";

        JArray message = JArray.Parse(json);

        int messageCode = message[0].Value<int>();

        if (messageCode == 36)
        {
            // This is an EVENT message:
            long subscriptionId = message[1].Value<long>();
            long publicationId = message[2].Value<long>();
            EventDetails details = message[3].ToObject<EventDetails>();

            // First pass: keep the payload as JTokens, since the concrete
            // types are known only to the application.
            JToken[] arguments = message[4].ToObject<JToken[]>();
            IDictionary<string, JToken> argumentKeywords =
                message[5].ToObject<IDictionary<string, JToken>>();

            // Send parameters to event handler logic
            // ....
            // Much later in the code: we want to call the user's method OnTopic2
            // the following code is executed (second pass: JToken to concrete type):
            int number1 = arguments[0].Value<int>();
            int number2 = arguments[1].Value<int>();
            string c = argumentKeywords["c"].Value<string>();
            MyClass d = argumentKeywords["d"].ToObject<MyClass>();
            // Call user's code!
            OnTopic2(number1, number2, c, d);
        }
    }

    public static void OnTopic2(int number1, int number2, string c, MyClass d)
    {
        Console.WriteLine($@"Got event: number1:{number1}, number2:{number2}, c:""{c}"", d:{{{d}}}");
    }
}

public class MyClass
{
    [JsonProperty("counter")]
    public int Counter { get; set; }

    [JsonProperty("foo")]
    public int[] Foo { get; set; }

    public override string ToString()
    {
        return $"counter: {Counter}, foo: [{string.Join(", ", Foo)}]";
    }
}

The point here is that we receive arguments of a type that is unknown to WampSharp and known only to the application consuming WampSharp. We need to carry the values arguments and argumentKeywords as JToken[] and IDictionary<string, JToken>, which allows us to convert them later on to the correct concrete types. The JToken type provided by Newtonsoft.Json allows us to perform this magic. I haven't seen any other framework that allows you to "deserialize twice" (or "partially deserialize") and does it well.

If you think that the libraries you suggest allow this, go ahead and implement these two code snippets using them and post the result here. Otherwise, you might want to stick with Newtonsoft.Json. It does not have the best performance, but it is probably the most flexible in terms of customization.

Elad

dimohy commented 4 years ago

Thanks for the interesting answers. I have not yet analyzed the WampSharp source code, so I may not fully understand what the snippets you gave me are doing.

Nevertheless, here is the second snippet, converted to System.Text.Json.

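    using System;
    using System.Collections.Generic;
    using System.Runtime.Serialization;
    using System.Text.Json;
    using System.Text.Json.Serialization;

    public static class JsonElementExtension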
    {
        public static T ToObject<T>(this JsonElement element)
        {
            var json = element.GetRawText();
            return JsonSerializer.Deserialize<T>(json);
        }
        public static T ToObject<T>(this JsonDocument document)
        {
            var json = document.RootElement.GetRawText();
            return JsonSerializer.Deserialize<T>(json);
        }
    }

    class Program
    {
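        [DataContract]
        [Serializable]
        public class EventDetails
        {
            // Note: System.Text.Json ignores non-public setters by default,
            // so this property is not populated from the "publisher" field.
            [JsonPropertyName("publisher")]
            public long? Publisher { get; internal set; }
        }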

        static void Main(string[] args)
        {
            // string base64 = @"liTPAADNzjVN8DPPAAGGTyi0n4GQkkYXgqFjpUhlbGxvoWSCp2NvdW50ZXIBo2Zvb5MBAgM=";
            // byte[] bytes = Convert.FromBase64String(base64);
            string json =
                @"[36,226285541257267,429149520174977,{},[70,23],{""c"":""Hello"",""d"":{""counter"":1,""foo"":[1,2,3]}}]";

            var message = JsonDocument.Parse(json);

            int messageCode = message.RootElement[0].GetInt32();
            if (messageCode == 36)
            {
                // This is an EVENT message:
                var subscriptionId = message.RootElement[1].GetUInt64();
                var publicationId = message.RootElement[2].GetUInt64();
                EventDetails details = message.RootElement[3].ToObject<EventDetails>();

                var arguments = message.RootElement[4];
                var argumentKeywords = message.RootElement[5].ToObject<IDictionary<string, JsonElement>>();

                //// Send parameters to event handler logic
                //// ....
                //// Much later in the code: we want to call the user's method OnTopic2
                //// the following code is executed:
                int number1 = arguments[0].GetInt32();
                int number2 = arguments[1].GetInt32();
                string c = argumentKeywords["c"].GetString();
                MyClass d = argumentKeywords["d"].ToObject<MyClass>();
                // Call user's code!
                OnTopic2(number1, number2, c, d);
            }
        }

        public static void OnTopic2(int number1, int number2, string c, MyClass d)
        {
            Console.WriteLine($@"Got event: number1:{number1}, number2:{number2}, c:""{c}"", d:{{{d}}}");
        }

        public class MyClass
        {
            [JsonPropertyName("counter")]
            public int Counter { get; set; }

            [JsonPropertyName("foo")]
            public int[] Foo { get; set; }

            public override string ToString()
            {
                return $"counter: {Counter}, foo: [{string.Join(", ", Foo)}]";
            }
        }
    }

With MessagePack-CSharp the code would probably have to take a completely different form because of its performance-oriented design. Still, from a conceptual point of view, arguments and argumentKeywords are both subtrees of the same root, so given an abstraction over that structure, you can handle them almost identically.

darkl commented 4 years ago

From my understanding, calling GetRawText is a killer in terms of performance (you allocate a new string for each subnode). In my Newtonsoft.Json example I load the document once into a tree structure (a JToken tree) and then convert parts of this tree into the relevant objects. In your example, you parse the document, and then each time you arrive at a subnode you create a string representing it and parse it again. I can't believe this will be better performance-wise.

Elad

dimohy commented 4 years ago

Your answer feels like a hard rock. No library can be optimal under the condition that it must be compatible with Json.NET, because the standard is then Json.NET itself. Also, the extension I posted was not written with performance in mind; it just makes the code compatible. Nevertheless, there are other ways to avoid reallocating strings while keeping usage similar to Json.NET. Also, if the same form does not have to be kept, Utf8JsonReader can be used to access arguments and so on quickly through ReadOnlySpan<>, without reallocation. Conceptually, I think MessagePack-CSharp is the same. In the end, is this a question of what WampSharp's structure can accommodate?
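
To make the Utf8JsonReader idea concrete, here is a minimal sketch of reading the EVENT message front to back. It is not WampSharp code: the class and method names are hypothetical, and it assumes the array layout of the sample above. Scalars are read straight from the reader, and the two payload elements are captured with JsonDocument.ParseValue so they can be converted later. This avoids the intermediate string, but it is probably not allocation-free, since each captured subtree is still held in its own JsonDocument.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class ForwardOnlyWampReader
{
    // Reads an EVENT message in a single forward pass.
    public static (long SubscriptionId, long PublicationId,
                   JsonDocument Arguments, JsonDocument ArgumentKeywords)
        ReadEvent(ReadOnlySpan<byte> utf8Json)
    {
        var reader = new Utf8JsonReader(utf8Json);

        reader.Read();                              // StartArray
        reader.Read();                              // message code
        if (reader.GetInt32() != 36)
            throw new FormatException("Not an EVENT message.");

        reader.Read();
        long subscriptionId = reader.GetInt64();
        reader.Read();
        long publicationId = reader.GetInt64();

        reader.Read();                              // details: StartObject
        reader.Skip();                              // move to its EndObject

        reader.Read();                              // arguments: StartArray
        JsonDocument arguments = JsonDocument.ParseValue(ref reader);
        reader.Read();                              // argumentKeywords: StartObject
        JsonDocument argumentKeywords = JsonDocument.ParseValue(ref reader);

        return (subscriptionId, publicationId, arguments, argumentKeywords);
    }

    public static void Main()
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(
            @"[36,226285541257267,429149520174977,{},[70,23],{""c"":""Hello"",""d"":{""counter"":1,""foo"":[1,2,3]}}]");

        var (subscriptionId, publicationId, arguments, argumentKeywords) = ReadEvent(utf8);
        using (arguments)
        using (argumentKeywords)
        {
            // Much later: convert the captured subtrees to concrete values.
            int number1 = arguments.RootElement[0].GetInt32();
            string c = argumentKeywords.RootElement.GetProperty("c").GetString();
            Console.WriteLine($"{subscriptionId} {publicationId} {number1} {c}");
        }
    }
}
```

The sketch skips the details element for brevity; a fuller version could pass the reader to JsonSerializer.Deserialize at that position instead of calling Skip.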

darkl commented 4 years ago

I think my first answer explains why we need to deserialize the message in parts. I think this will be hard to do with a forward-only reader (which I don't know if Utf8JsonReader is). Please try to provide code examples that mimic the code I wrote, but without too much overhead (certainly not serializing a subtree to a string and then deserializing it). If you show that it is possible, it shouldn't be too difficult to implement WampSharp support for these serializers.
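
One possible shape for such an example, offered as a sketch rather than a tested WampSharp integration: in .NET 6 and later, JsonSerializer has Deserialize overloads that accept a JsonElement directly, so the GetRawText round trip is not needed. The names DeferredDeserialization and ParseEventPayload are hypothetical, and the cost of Clone() relative to JToken has not been measured here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class MyClass
{
    [JsonPropertyName("counter")]
    public int Counter { get; set; }

    [JsonPropertyName("foo")]
    public int[] Foo { get; set; }
}

public static class DeferredDeserialization
{
    // Stage 1: parse once and keep the payload as JsonElement values.
    // Clone() detaches each element from the JsonDocument so that it stays
    // valid after the document is disposed.
    public static (JsonElement[] Arguments,
                   Dictionary<string, JsonElement> ArgumentKeywords)
        ParseEventPayload(string json)
    {
        using JsonDocument message = JsonDocument.Parse(json);
        JsonElement root = message.RootElement;

        var arguments = new List<JsonElement>();
        foreach (JsonElement item in root[4].EnumerateArray())
            arguments.Add(item.Clone());

        var argumentKeywords = new Dictionary<string, JsonElement>();
        foreach (JsonProperty property in root[5].EnumerateObject())
            argumentKeywords[property.Name] = property.Value.Clone();

        return (arguments.ToArray(), argumentKeywords);
    }

    public static void Main()
    {
        var (arguments, argumentKeywords) = ParseEventPayload(
            @"[36,226285541257267,429149520174977,{},[70,23],{""c"":""Hello"",""d"":{""counter"":1,""foo"":[1,2,3]}}]");

        // Stage 2, much later: convert to the types the user's method expects.
        // The JsonElement overload of Deserialize requires .NET 6 or later.
        int number1 = arguments[0].GetInt32();
        string c = argumentKeywords["c"].GetString();
        MyClass d = argumentKeywords["d"].Deserialize<MyClass>();

        Console.WriteLine($"{number1}, {c}, counter: {d.Counter}, foo: [{string.Join(", ", d.Foo)}]");
    }
}
```

Here JsonElement plays the role that JToken plays in the Newtonsoft.Json snippets: a parsed but not yet typed value that can be carried around and converted on demand.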

Elad

dimohy commented 4 years ago

Good. I roughly understand now. I think I will most likely use WampSharp, and I will analyze its source code to understand it accurately. I don't expect you to implement what I asked for.

I understand that there was a small communication problem because I am not a native English speaker.

It was a bit hard, but I enjoyed this interesting conversation (and its researcher-like style). I like conceptual thinking, so my angle is a little different, but I'll analyze WampSharp over time so I can give you interesting feedback.

darkl commented 4 years ago

This issue seems relevant.

darkl commented 4 years ago

Another issue is that System.Text.Json serialization doesn't support DataMember attributes.

dimohy commented 4 years ago

I also have a lot of experience maintaining an in-house library, so I know that consistency and backward compatibility are important for a library.

So I understand your point of view.

Let's change the perspective again and focus on MessagePack C# instead of System.Text.Json.

In my understanding, the JSON and MessagePack formats differ only in wire encoding, and the data they represent is the same: one is text and the other is binary. If a MessagePack library implements the MessagePack specification, it should be compatible with every other such library.

What interests me is using a performance-oriented MessagePack library. System.Text.Json only covers the need to view the MessagePack data as text using a newer framework, so it's fine if System.Text.Json is difficult to apply directly to WampSharp. My real interest is in using a performance-oriented MessagePack library.

However, there seems to be a problem here. WampSharp doesn't seem to process MessagePack directly; it seems to convert it to JSON and process it via Newtonsoft.Json. (I haven't analyzed the source code in depth yet; that will happen over the months after the project starts.)

In my understanding, it should be possible to interpret MessagePack without the help of a JSON library, especially for the WAMP protocol, which is relatively simple.
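
As an illustration of that point, here is a minimal, dependency-free sketch that decodes the PUBLISH payload from earlier in the thread into plain .NET objects. MiniMsgPack is a hypothetical name, not part of WampSharp or MessagePack-CSharp, and it handles only the handful of MessagePack types that appear in the two samples in this thread; a real decoder has to cover the full specification.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Text;

public static class MiniMsgPack
{
    // Decodes a small subset of MessagePack: nil, booleans, positive fixint,
    // uint 64, fixstr, fixarray and fixmap.
    public static object Decode(byte[] data)
    {
        int offset = 0;
        return Read(data, ref offset);
    }

    private static object Read(byte[] data, ref int offset)
    {
        byte b = data[offset++];

        if (b <= 0x7F) return (long)b;                       // positive fixint
        if ((b & 0xF0) == 0x80)                              // fixmap
        {
            var map = new Dictionary<string, object>();
            for (int n = b & 0x0F; n > 0; n--)
            {
                string key = (string)Read(data, ref offset); // WAMP keys are strings
                map[key] = Read(data, ref offset);
            }
            return map;
        }
        if ((b & 0xF0) == 0x90)                              // fixarray
        {
            var items = new object[b & 0x0F];
            for (int i = 0; i < items.Length; i++)
                items[i] = Read(data, ref offset);
            return items;
        }
        if ((b & 0xE0) == 0xA0)                              // fixstr
        {
            int length = b & 0x1F;
            string text = Encoding.UTF8.GetString(data, offset, length);
            offset += length;
            return text;
        }
        switch (b)
        {
            case 0xC0: return null;                          // nil
            case 0xC2: return false;
            case 0xC3: return true;
            case 0xCF:                                       // uint 64, big-endian
                ulong value = BinaryPrimitives.ReadUInt64BigEndian(data.AsSpan(offset, 8));
                offset += 8;
                return (long)value;                          // WAMP IDs fit in a long
            default:
                throw new NotSupportedException($"Type byte 0x{b:X2} is not handled by this sketch.");
        }
    }

    public static void Main()
    {
        var message = (object[])Decode(Convert.FromBase64String(
            "lhAIgathY2tub3dsZWRnZcOwY29tLm15YXBwLnRvcGljMZEHgA=="));

        long messageCode = (long)message[0];                 // 16 = PUBLISH
        var options = (Dictionary<string, object>)message[2];
        Console.WriteLine($"code: {messageCode}, topic: {message[3]}, acknowledge: {options["acknowledge"]}");
    }
}
```

Decoding into an untyped tree like this is the simple half. The open question from earlier in the thread remains how to convert such a tree into user-defined types later, and how to forward it to clients that use a different format.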

Why should MessagePack be tightly coupled to a JSON library?

I am not asking you to implement the features I need. I hope this can be a productive conversation.

I am interested in performance because this will be the heart of the factory automation solution. I look forward to being able to contribute to WampSharp after the project has progressed to some extent.

Thanks.

darkl commented 4 years ago

You could use a MessagePack library directly. WampSharp uses Newtonsoft.Msgpack in order to ensure that serialization/deserialization behaves the same way regardless of the underlying format. Before I wrote Newtonsoft.Msgpack, I used msgpack-cli only. Unfortunately, there were some bugs and incompatibilities between the behaviors of Newtonsoft.Json and msgpack-cli (for instance, the way both libraries handled null values). At some point I decided to implement a JsonReader/JsonWriter for the MessagePack format. One thing this solves is that I don't need to implement a custom converter from MessagePackObject to JToken and back: think about the case where one client sends a message in MessagePack and you need to forward it to clients that use Json. Since all the current formatters convert everything to JToken, everything is compatible and no custom converters are needed.

dimohy commented 4 years ago

Good. Thanks for the answer. That was enough.

I'm using WampSharp for my project, so I hope to talk regularly. Thank you for your hard work and thanks for all your conversations with me.