Conerlius / protobuf-net

Automatically exported from code.google.com/p/protobuf-net

User defined types taking more time to deserialize when using repeated elements #103

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I defined a message with some primitive types:

    message TypeA { 
       repeated string param1 = 1;
       repeated int64 param2 = 2;
       repeated bool param3 = 3;
       repeated bool param4 = 4;
    }

and a message that contains another message type:

    message ContainedType {
       required string param1 = 1;
       optional int64 param2 = 2;
       optional bool param3 = 3;
       optional bool param4 = 4;
    }
contained in:

    message TypeB {
       repeated ContainedType containedType = 1;
    }

I benchmarked the time it took to serialize and deserialize `TypeA` and 
`TypeB`, with each having 100 elements in every list it contains. 
The serialized size of both types was the same (2800 bytes), but there was a lot 
of difference in serialization and deserialization times. Here are the 
results (JIT removed):

    Iterations: 10000
    TypeA serialization took:    810 millisec
    TypeA deserialization took: 1131 millisec
    TypeB serialization took:   1284 millisec
    TypeB deserialization took: 8650 millisec

TypeB is taking a lot of time to serialize/deserialize, even though the 
serialized size is the same. It seems like protobuf-net is using reflection 
every time for deserialization?
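(Editorial note: the identical sizes are expected from the wire format. Each repeated `ContainedType` element in `TypeB` is written as a length-delimited field: one tag byte, a varint length, then the payload. The sketch below is pure Python with hypothetical field values, purely to illustrate that framing; it is not protobuf-net code.)

```python
def varint(n: int) -> bytes:
    # Protobuf base-128 varint: 7 bits per byte, high bit means "more follows".
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def tag(field: int, wire_type: int) -> bytes:
    # A field key is (field_number << 3) | wire_type.
    return varint((field << 3) | wire_type)

# One ContainedType record with hypothetical values.
sub = (
    tag(1, 2) + varint(1) + b"x"  # param1 = "x"   (wire type 2: length-delimited)
    + tag(2, 0) + varint(5)       # param2 = 5     (wire type 0: varint)
    + tag(3, 0) + varint(1)       # param3 = true
    + tag(4, 0) + varint(0)       # param4 = false
)

# TypeB frames each repeated element as field 1, wire type 2:
# tag byte + varint length + payload, the same per-element framing cost
# that TypeA's repeated primitive fields pay.
element = tag(1, 2) + varint(len(sub)) + sub
print(element.hex())  # 0a090a0178100518012000
```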

Original issue reported on code.google.com by ata.ma...@gmail.com on 12 Apr 2010 at 11:16

GoogleCodeExporter commented 8 years ago
P.S.: don't hate me for the message and field names. I couldn't think of any 
other dummy names for them.

Original comment by ata.ma...@gmail.com on 12 Apr 2010 at 11:18

GoogleCodeExporter commented 8 years ago
I imagine that is largely stream overheads. I've changed a lot of that in v2, 
and I would *expect* it to be much more comparable. I tried to verify this on 
v2, but it is blocked by some unrelated glitches (v2 is not released yet, and 
has a few kinks).

When I've debugged it (to see why it fails), I'll let you know whether this is 
already fixed in v2. If it is, I'm not sure it would be practical to back-port 
the fix to v1.

Original comment by marc.gravell on 12 Apr 2010 at 1:14

GoogleCodeExporter commented 8 years ago
(by "stream overheads", I mean the slightly ugly way that v1 handles nesting)

Original comment by marc.gravell on 12 Apr 2010 at 1:15

GoogleCodeExporter commented 8 years ago
Fixed the unrelated bug; here are the v2 stats. Will that do?

    Runtime A/ser            107 μs/item
    Runtime A/deser          100 μs/item
    Runtime B/ser            709 μs/item
    Runtime B/deser          711 μs/item
    CompileInPlace A/ser      33 μs/item
    CompileInPlace A/deser    39 μs/item
    CompileInPlace B/ser      35 μs/item
    CompileInPlace B/deser    54 μs/item
    Compile A/ser             33 μs/item
    Compile A/deser           39 μs/item
    Compile B/ser             31 μs/item
    Compile B/deser           42 μs/item

Original comment by marc.gravell on 12 Apr 2010 at 1:47

GoogleCodeExporter commented 8 years ago
Are these stats from the fixed version? There still seems to be quite a time 
difference in ser/deser.
When will v2 be out? Is it possible for this to be fixed in v1 (if it doesn't 
require a lot of changes)?
If not, what would be the workaround for v1? Don't use nested types?

Original comment by ata.ma...@gmail.com on 13 Apr 2010 at 7:43

GoogleCodeExporter commented 8 years ago
Those stats are from the v2 trunk. There was an unrelated bugfix needed to get 
it working, but that fix is now in the trunk. The time to a v2 release could 
now be measured in weeks, maybe 3-4? Most (not quite all) of the core concepts 
are there and working, but there is still a lot of validation needed.

Applying this retrospectively to v1 would be very hard, but you could try using 
"groups" instead of nested types (see Google's language guide). Because 
"groups" don't need a length prefix, they go through a different route and 
should be faster. Of course, in most cases even the current difference is 
unlikely to cripple things.
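(Editorial note: the difference described here is visible at the byte level. A nested message, wire type 2, must be preceded by its encoded length, which forces the writer to measure or buffer the payload and the reader to track a sub-stream; a group is simply bracketed by a start-group tag, wire type 3, and an end-group tag, wire type 4, with no length prefix. A small pure-Python sketch with a hypothetical payload, for illustration only:)

```python
def varint(n: int) -> bytes:
    # Protobuf base-128 varint encoding.
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def tag(field: int, wire_type: int) -> bytes:
    # A field key is (field_number << 3) | wire_type.
    return varint((field << 3) | wire_type)

payload = tag(1, 2) + varint(1) + b"x"  # some already-encoded message body

# Nested message (wire type 2): the writer must know the payload length
# up front, hence the sub-stream handling in v1.
nested = tag(1, 2) + varint(len(payload)) + payload

# Group: wire type 3 opens, wire type 4 closes; no length prefix needed,
# so the writer can stream the payload straight through.
group = tag(1, 3) + payload + tag(1, 4)

print(nested.hex())  # 0a030a0178
print(group.hex())   # 0b0a01780c
```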

Original comment by marc.gravell on 13 Apr 2010 at 7:57

GoogleCodeExporter commented 8 years ago

Original comment by marc.gravell on 22 Apr 2010 at 9:20

GoogleCodeExporter commented 8 years ago
I wanted to avoid using groups because they are deprecated as per the Google 
docs. So I will just wait for v2. Any update on when it's going to come out?

Original comment by ata.ma...@gmail.com on 26 Apr 2010 at 6:52

GoogleCodeExporter commented 8 years ago
Actually, Kenton (from Google) recently seems increasingly willing to expand 
group usage. Re the v2 release: I would *hope* within a few weeks; the core is 
functionally stable, but I still have a lot of regression tests (mainly 
edge-cases) to get working.

Original comment by marc.gravell on 26 Apr 2010 at 8:04

GoogleCodeExporter commented 8 years ago

Original comment by marc.gravell on 13 Jun 2011 at 9:07