Open cvanama opened 4 years ago
Grouped doesn't really make sense here. Not at a PC to test, but "packed" is worth a try (V3 will use "packed" by default here). Also: perhaps try V3?
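For repeated primitive fields such as `double[]`, packed encoding writes one field tag and one length prefix for the whole array instead of a tag before every element. A minimal sketch of opting in with protobuf-net v2 via the `IsPacked` flag on `ProtoMemberAttribute` (the type and property names here are illustrative, not from the original post):

```csharp
using System;
using ProtoBuf;

[ProtoContract]
public class PackedExample
{
    // Packed: one field tag + one length prefix for the whole array,
    // then 8 bytes per double — instead of a tag preceding every element.
    [ProtoMember(1, IsPacked = true)]
    public double[] Samples { get; set; }
}
```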
On Fri, 24 Apr 2020, 16:16 cvanama, notifications@github.com wrote:
Thanks for writing the protobuf-net project. We had an interesting observation while comparing protobuf-net and BinaryFormatter on large datasets.
protobuf-net produced a larger payload and took more time to serialize and deserialize a large dataset with heavy objects. When we ran the same dataset with the simple object from your GitHub example, protobuf-net performed very well compared with the binary serializer.
We are using .NET Core with the latest V2 protobuf-net. Please advise and help us further.
The code below is for reference.
Serializable class:

```csharp
using ProtoBuf;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace SerializationTest
{
    [ProtoContract]
    [Serializable]
    public class SerializableClassObject
    {
        [ProtoMember(1, DataFormat = DataFormat.Group)]
        public virtual double[] stream1 { get; set; }

        [ProtoMember(2, DataFormat = DataFormat.Group)]
        public virtual double[] stream2 { get; set; }

        [ProtoMember(3, DataFormat = DataFormat.Group)]
        public virtual double[] stream3 { get; set; }

        [ProtoMember(4, DataFormat = DataFormat.Group)]
        public virtual double[] stream4 { get; set; }

        [ProtoMember(5, DataFormat = DataFormat.Group)]
        public virtual double[] stream5 { get; set; }

        [ProtoMember(6, DataFormat = DataFormat.Group)]
        public virtual double[] stream6 { get; set; }

        [ProtoMember(7, DataFormat = DataFormat.Group)]
        public virtual double[] stream7 { get; set; }

        [ProtoMember(8, DataFormat = DataFormat.Group)]
        public virtual double[] stream8 { get; set; }

        [ProtoMember(9, DataFormat = DataFormat.Group)]
        public virtual double[] stream9 { get; set; }

        [ProtoMember(10, DataFormat = DataFormat.Group)]
        public virtual double[] stream10 { get; set; }

        public SerializableClassObject() { }
    }
}
```
Each stream in the object should have 18250 elements, so we load each property with an array of 18250 random values:
```csharp
var random = new Random();
double[] data = Enumerable.Repeat(0, 18250)
    .Select(i => (double)random.Next(300000, 400000))
    .ToArray();
```
We created the object by cloning the above data into each stream:
```csharp
private static SerializableClassObject GetSerializableObject(double[] data)
{
    return new SerializableClassObject
    {
        stream1 = (double[])data.Clone(),
        stream2 = (double[])data.Clone(),
        stream3 = (double[])data.Clone(),
        stream4 = (double[])data.Clone(),
        stream5 = (double[])data.Clone(),
        stream6 = (double[])data.Clone(),
        stream7 = (double[])data.Clone(),
        stream8 = (double[])data.Clone(),
        stream9 = (double[])data.Clone(),
        stream10 = (double[])data.Clone()
    };
}
```
We tried serializing/deserializing different numbers of these objects. Our ultimate goal is to process 50000 objects of the above class with loaded data at once (not one object at a time); we can parallelize the work in batches, but first we are trying to determine which serializer is faster. For the results below, we used a simple function and a Stopwatch.
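The post does not include the timing harness itself; the following is a minimal sketch of the kind of Stopwatch loop described, assuming the `SerializableClassObject` above and protobuf-net v2's `SerializeWithLengthPrefix`/`DeserializeItems` for writing multiple objects to one stream (the method name `Run` is illustrative):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using ProtoBuf;

static class BenchmarkSketch
{
    public static void Run(SerializableClassObject[] objects)
    {
        var sw = Stopwatch.StartNew();
        using var ms = new MemoryStream();
        foreach (var obj in objects)
        {
            // Length-prefix each object so many can share one stream.
            Serializer.SerializeWithLengthPrefix(ms, obj, PrefixStyle.Base128, fieldNumber: 1);
        }
        sw.Stop();
        Console.WriteLine($"Serialized {objects.Length} objects, " +
                          $"{ms.Length} bytes, {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        ms.Position = 0;
        var back = Serializer
            .DeserializeItems<SerializableClassObject>(ms, PrefixStyle.Base128, fieldNumber: 1)
            .ToList();
        sw.Stop();
        Console.WriteLine($"Deserialized {back.Count} objects, {sw.ElapsedMilliseconds} ms");
    }
}
```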
Please find the results below:
| Object Count | Formatter | Payload Size | Batch Size | Serialization (ms) | Deserialization (ms) | Deserialized Count |
|---:|---|---:|---:|---:|---:|---:|
| 1000 | protobuf-net | 1.52 GB | 200 | 5051 | 4214 | 1000 |
| 1000 | BinaryFormatter | 1.35 GB | 200 | 3746 | 4547 | 1000 |
| 2000 | protobuf-net | 3.05 GB | 200 | 8306 | 9111 | 2000 |
| 2000 | BinaryFormatter | 2.71 GB | 200 | 7408 | 8902 | 2000 |
| 5000 | protobuf-net | 7.64 GB | 200 | 19716 | 23538 | 5000 |
| 5000 | BinaryFormatter | 6.79 GB | 200 | 17684 | 25023 | 5000 |
| 10000 | protobuf-net | 15.2 GB | 200 | 47375 | 45104 | 10000 |
| 10000 | BinaryFormatter | 13.5 GB | 200 | 40094 | 40460 | 10000 |
| 20000 | protobuf-net | 30.5 GB | 200 | 79700 | 93909 | 20000 |
| 20000 | BinaryFormatter | 27.1 GB | 200 | 70612 | 84737 | 20000 |
| 25000 | protobuf-net | 38.2 GB | 200 | 103701 | 121900 | 25000 |
| 25000 | BinaryFormatter | 33.9 GB | 200 | 92305 | 107657 | 25000 |
| 50000 | protobuf-net | 76.4 GB | 200 | 209763 | 265031 | 50000 |
| 50000 | BinaryFormatter | 67.9 GB | 200 | 182645 | 220315 | 50000 |
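A back-of-envelope check explains the size gap, under the assumption that the double arrays are written unpacked (each element as an 8-byte fixed64 preceded by a 1-byte field tag, i.e. 9 bytes/element), while BinaryFormatter stores essentially the raw 8 bytes per double plus small headers:

```csharp
using System;

class PayloadEstimate
{
    static void Main()
    {
        // 10 streams x 18250 doubles per object.
        const long elementsPerObject = 10 * 18250;

        // Unpacked protobuf: 1-byte field tag + 8-byte fixed64 per element.
        long protoBytes = 1000 * elementsPerObject * 9;

        // BinaryFormatter: raw 8 bytes per double (plus small headers, ignored here).
        long binaryBytes = 1000 * elementsPerObject * 8;

        Console.WriteLine(protoBytes / Math.Pow(1024, 3));  // ~1.53 GiB vs reported 1.52 GB
        Console.WriteLine(binaryBytes / Math.Pow(1024, 3)); // ~1.36 GiB vs reported 1.35 GB
    }
}
```

Under this assumption the 1000-object row is reproduced closely, and packed encoding (which drops the per-element tag) would bring the protobuf payload roughly to parity.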
With the simple data structure, protobuf-net is definitely faster at serialization/deserialization. We are interested in why the results differ between simple and complex objects.
Thanks, Chandra
— view it on GitHub: https://github.com/protobuf-net/protobuf-net/issues/632
When will V3 be officially released? We could not use the current alpha version in our application. Please try running with that object, so that you will observe the difference.